Re: bug with start-process and start-process-shell-command

Monday, 17 January 2005

        "Stephen J. Turnbull" <stephen(a)xemacs.org> wrote:

...
>>>>> "Mike" == Mike FABIAN <mfabian(a)suse.de> writes:

     Mike> I think XEmacs should set useful defaults depending on the
     Mike> locale used in the system when XEmacs was started.

     Mike> I don't want

     Mike>     (set-language-environment "Japanese")

     Mike> to set the ja_JP.eucJP locale again.

 We'll have to have Ben explain that.  I have no idea what the
 rationale for that is.  Given that most of that function seems to be
 intended to deal with Windows variance, possibly the whole thing is
 wrong-headed for Unix.

I'm sure this must be wrong. XEmacs should not change LANG or the
LC_* variables; it should just use them as they are.

...
     Mike> I have only LANG set; I have set neither any LC_* variables
     Mike> nor LC_ALL. If I had set these as well, I would probably have
     Mike> to remember and restore them as well.

 No, LANG is the only variable touched by that code AFAICT.  LC_*
 should be safe from it.

OK. But that means that when starting XEmacs, for example like this:

     LANG=ja_JP.UTF-8 LC_COLLATE=ja_JP.UTF-8 LC_PAPER=en_US.UTF-8 xemacs

you would get a mixture of encodings in the LC_* variables after
XEmacs changed LANG, like this:

    mfabian@magellan:~$ LANG=ja_JP.eucJP LC_COLLATE=ja_JP.UTF-8 LC_PAPER=en_US.UTF-8 locale
    LANG=ja_JP.eucJP
    LC_CTYPE="ja_JP.eucJP"
    LC_NUMERIC="ja_JP.eucJP"
    LC_TIME="ja_JP.eucJP"
    LC_COLLATE=ja_JP.UTF-8
    LC_MONETARY="ja_JP.eucJP"
    LC_MESSAGES="ja_JP.eucJP"
    LC_PAPER=en_US.UTF-8
    LC_NAME="ja_JP.eucJP"
    LC_ADDRESS="ja_JP.eucJP"
    LC_TELEPHONE="ja_JP.eucJP"
    LC_MEASUREMENT="ja_JP.eucJP"
    LC_IDENTIFICATION="ja_JP.eucJP"
    LC_ALL=
    mfabian@magellan:~$
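
The quoted entries in that output are the ones inherited from LANG. The usual POSIX precedence is LC_ALL first, then the category's own variable, then LANG. A minimal Python sketch of that lookup follows; the function name `effective_locale` is made up for illustration and is not part of XEmacs or of any libc.

```python
def effective_locale(category, env):
    """Return the locale name in effect for one category, e.g. "LC_CTYPE".

    Precedence: LC_ALL, then the category's own variable, then LANG,
    and finally the "POSIX" default. Empty values count as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "POSIX"


# The environment from the example above, after LANG was changed.
env = {
    "LANG": "ja_JP.eucJP",
    "LC_COLLATE": "ja_JP.UTF-8",
    "LC_PAPER": "en_US.UTF-8",
}
for category in ("LC_CTYPE", "LC_COLLATE", "LC_PAPER", "LC_TIME"):
    print(category, "=", effective_locale(category, env))
```

Only the categories that were set explicitly keep their UTF-8 value; everything else follows the changed LANG.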

Such mixtures are not allowed. ``The Open Group Base Specifications
Issue 6'' says about this:

    If different character sets are used by the locale categories, the
    results achieved by an application utilizing these categories are
    undefined.

(see http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap07.html)

In the worst case, applications may even crash.
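
A program that wants to detect such a mixture before it causes trouble could compare the codeset part of the locale names. The following Python sketch does that for names of the form `language_TERRITORY.codeset@modifier`. The helper names are hypothetical, and the check is only an approximation: a name without an explicit codeset (for instance de_DE@euro, which uses ISO-8859-15 in glibc) still implies one that this sketch cannot see.

```python
def codeset_of(locale_name):
    """Extract the codeset from a name like "ja_JP.eucJP" or "de_DE.UTF-8@euro".

    Returns a lower-cased codeset without hyphens, or None when the name
    does not spell one out (e.g. "de_DE@euro", "POSIX").
    """
    base = locale_name.split("@", 1)[0]  # drop the @modifier
    if "." not in base:
        return None
    return base.split(".", 1)[1].lower().replace("-", "")


def codesets_in_use(categories):
    """Map each explicitly named codeset to the categories that use it."""
    result = {}
    for category, name in categories.items():
        codeset = codeset_of(name)
        if codeset is not None:
            result.setdefault(codeset, []).append(category)
    return result


# Effective values taken from the locale output shown earlier.
categories = {
    "LC_CTYPE": "ja_JP.eucJP",
    "LC_COLLATE": "ja_JP.UTF-8",
    "LC_PAPER": "en_US.UTF-8",
}
used = codesets_in_use(categories)
if len(used) > 1:
    print("mixed codesets:", used)
```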

I remember a bug report concerning "sort" where "sort" aborted when
LC_TIME and LC_CTYPE were set to locales with different encodings:

(SuSE Bugzilla http://bugzilla.suse.de/show_bug.cgi?id=26506)

When starting sort like this,

    1. unset all LC_* variables. Unset LANG (or set to POSIX) 
    2. echo -e "l4\nl3" | LC_TIME=de_DE@euro  LC_CTYPE=en_US.UTF-8 sort 

it failed with a message like

    sort: sort.c:717: inittables_mb: Assertion `mblength != (size_t)-1 && mblength != (size_t)-2' failed. 
    Aborted 

This was a bug in sort which was fixed by Mitsuru Chinen:

Chinen>  Additional Comment #8 From Mitsuru Chinen  2003-07-16 05:07
Chinen> 
Chinen> Hello all,
Chinen> 
Chinen> sort utility stores the multibyte character strings of months
Chinen> into buffers at first. This behavior is for `-M' option
Chinen> (i.e. compare according to month.) At that time, the month
Chinen> multibyte character strings are converted into a wide
Chinen> character string in order to ignore the difference between
Chinen> uppercase and lowercase.
Chinen> 
Chinen> When LC_TIME=de_DE@euro and LC_CTYPE=en_US.UTF-8, the
Chinen> multibyte character strings of month are not able to be
Chinen> converted into wide character.  The reason is that those
Chinen> strings have a different character encoding from the one
Chinen> LC_CTYPE specifies.
Chinen> 
Chinen> I'll make a patch not to initialize the string of month when
Chinen> `-M' option is not specified. But I will not support the case
Chinen> where `-M' option is specified and the value of LC_TIME is
Chinen> different from the one of LC_CTYPE. I think supporting such a
Chinen> case is difficult by the above-mentioned reason.  (The Single
Chinen> Unix specification also says the following:

→ see ``The Open Group Base Specifications Issue 6'' above.
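
The failure Chinen describes can be imitated outside of sort. The Python sketch below is only an analogy, not the code from sort.c: it assumes that the month names of the de_DE@euro locale are stored in ISO-8859-15 (so that "März" contains the byte 0xE4) and shows that such bytes cannot be decoded under a UTF-8 LC_CTYPE. The helper `to_text` is a made-up name.

```python
# "März" written with an escape so the source file encoding does not matter.
month = "M\u00e4rz".encode("iso-8859-15")  # the bytes a Latin-9 locale would hold


def to_text(raw, codeset):
    """Decode raw bytes using the given codeset; return None if they do not fit."""
    try:
        return raw.decode(codeset)
    except UnicodeDecodeError:
        return None


print(to_text(month, "iso-8859-15"))  # decoding with the matching codeset works
print(to_text(month, "utf-8"))        # 0xE4 does not start a valid UTF-8 sequence here
```

In C the corresponding conversion functions report the same situation through error return values, which is what the failed assertion above was checking for.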

-- 
Mike FABIAN   <mfabian(a)suse.de>   http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.
