Re: [BUG] with non-ascii characters in dir names

Tuesday, 24 November 1998

        Jan Vroonhof writes:

Jan> Didier Verna <verna(a)inf.enst.fr> writes: Note that the same change as
Jan> needed to fix your bug needs to apply to the NT version as well.

	Yup. And to the `program' variable too, and ...

Jan> well. Note that the top of that file claims the function has been MULE-ized.

	I think that some mule-related warnings in this file (and probably
some others) were hosed when Kirill split the process code. When I wrote the
multicast support, I remember that several functions were explicitly marked
as not mule-ized, more than there are now.

...
> You're right, this corrects the bug. This brings me a question that might
> be stupid, but isn't it likely that most uses of XSTRING_DATA throughout the
> code will be bogus in the same fashion[1]?
Jan> Probably. MULE is totally broken in this regard. I looked at a few of them
Jan> and most of them indeed look bogus.

Jan> bolzano:vroonhof/cvs/xemacs-20/src> grep XSTRING_DATA *.c | wc -l
Jan> 424

Jan> :-(

	I know exactly what you mean. I've done this too.

Jan> I am wondering. Suppose we were to redefine XSTRING_DATA such that it
Jan> always used 'raw-text' or maybe even 'binary' as the coding system,
Jan> would that help?

	Dunno. Maybe the scheme used in convert_to_external_format when the
coding system is nil would be better. But there's a deeper problem here I
think (see below).

...
> about the coding system? I mean take the same function,
> unix_create_process, where the environment is massively copied for the
> child to be forked. What should we do about it?
Jan> You mean the arguments etc. Do we really want
Jan> 'process-argument-coding-system' (this can also be a lisp list of coding
Jan> systems such that the n-th item is used as a coding system for the n-th
Jan> argument) and 'environment-variable-coding-system',
Jan> 'domain-or-host-name-coding-system'?

	This smells too much like we're gonna get tons of similar variables
each time we encounter a new problematic context. Actually, I think that
there's a real problem when we don't know a priori which is the proper coding
system to use, or even if there is one. Should we default to raw, should we
assume ASCII and replace bogus chars with tildes, should we try to be clever
and guess the coding system ... ?
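
	To make the three fallbacks above concrete, here is a rough sketch in
Python (not XEmacs C, and not how any XEmacs function actually works). All
function names are hypothetical, and the candidate list in the "guess" variant
is an arbitrary assumption for illustration only.

```python
# Hypothetical helpers illustrating three ways to turn bytes of unknown
# encoding (e.g. a directory name) into text. Illustration only.

def to_raw(data: bytes) -> str:
    # "Default to raw": latin-1 maps each byte 0..255 to the code point
    # with the same value, so the conversion is lossless and reversible.
    return data.decode("latin-1")

def to_ascii_with_tildes(data: bytes) -> str:
    # "Assume ASCII": keep 7-bit bytes, replace everything else with '~'.
    # This is lossy: the original bytes cannot be recovered.
    return "".join(chr(b) if b < 0x80 else "~" for b in data)

def guess_and_decode(data: bytes, candidates=("ascii", "utf-8")):
    # "Be clever and guess": try each candidate in order and return the
    # first that decodes cleanly, falling back to raw when none does.
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    return to_raw(data), "raw"

name = b"r\xe9pertoire"  # "repertoire" with an e-acute, as latin-1 bytes
print(to_raw(name).encode("latin-1") == name)  # raw round-trips
print(to_ascii_with_tildes(name))
print(guess_and_decode(name))
```

	Only the raw variant preserves the original bytes, which matters if the
string is later handed back to the OS. Guessing can silently pick a wrong but
valid decoding, so the order of candidates is itself a policy decision.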

	Damn, if only the whole world were unicode :-(

-- 
    /     /   _   _       Didier Verna        http://www.inf.enst.fr/~verna/
 - / / - / / /_/ /      E.N.S.T. INF C201.1      mailto:verna＠inf.enst.fr
/_/ / /_/ / /__ /        46 rue Barrault        Tel.   (33) 01 45 81 73 46
                      75634 Paris  cedex 13     Fax.   (33) 01 45 81 31 19
