On the tenth day of October, Michael Sperber wrote:
> I don't even get that far. Is there a brief overview somewhere on how
> locale influences the various parts of XEmacs? Specifically, how do
> file names work? I have
> (get-coding-system-from-locale (current-locale)) => utf-8
> Yet XEmacs treats a filename called "äää" (a umlaut, a umlaut, a umlaut)
> as if it were called "a box a box a box" (i.e. apparently the UTF-8
> encoding of the file name).
OS X (which your headers tell me you’re on) uses something close to UTF-8 in
normal form D for file name encoding; see
http://developer.apple.com/qa/qa2001/qa1173.html
for details. In that form, what would otherwise be a precomposed character
is stored as the base character followed by a combining character.
If you do C-x = on the box, it should tell you that it represents U+0308
COMBINING DIAERESIS; that it is not shown combined with the preceding "a"
is a redisplay problem in XEmacs.
The most common encoding of "ä" in UTF-8 is the representation of U+00E4,
"\xc3\xa4". XEmacs handles that fine, which is why you won’t see this
problem on Linux, where as a rule normalisation is not done. U+0061 followed
by U+0308 (what you’re seeing: "a\xcc\x88" on disk) is preserved on reading
and writing, which is good, and the OS X libraries automatically normalise
when presented with non-normalised UTF-8 text, which is also good.
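To make the two byte sequences concrete, here is a small sketch in Python (not XEmacs Lisp, and nothing XEmacs does itself). It assumes that the standard NFD form from Python's unicodedata module matches what OS X stores for this particular character; Apple's form is only "something close to" NFD in general.

```python
import unicodedata

# Precomposed form: the single code point U+00E4.
precomposed = "\u00e4"

# Decomposed form (standard NFD): U+0061 followed by U+0308.
decomposed = unicodedata.normalize("NFD", precomposed)

# UTF-8 bytes of each form.
nfc_bytes = precomposed.encode("utf-8")
nfd_bytes = decomposed.encode("utf-8")

# Unicode names of the code points in the decomposed form.
names = [unicodedata.name(ch) for ch in decomposed]

# Composing again gives back the precomposed code point.
roundtrip = unicodedata.normalize("NFC", decomposed)

print(nfc_bytes, nfd_bytes, names)
```

The second code point in the decomposed form is the one that shows up as a box.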
So this is not a bug in the interaction of the locale with file names,
rather in our handling of combining characters. Actual support for
normalisation in the Unicode coding systems would also be nice to have, but
is not amazingly relevant for this particular problem, since the OS does it
already.
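For illustration only, normalisation on the reading side could look roughly like the following Python sketch. The helper name decode_file_name is hypothetical, and this is not how the XEmacs Unicode coding systems are implemented; it only shows the idea of composing to NFC after decoding.

```python
import unicodedata

def decode_file_name(raw: bytes) -> str:
    """Hypothetical helper: decode UTF-8 bytes, then compose to NFC."""
    return unicodedata.normalize("NFC", raw.decode("utf-8"))

# Three decomposed "a + U+0308" pairs, as they would appear on disk on OS X.
on_disk = b"a\xcc\x88a\xcc\x88a\xcc\x88"
shown = decode_file_name(on_disk)
print(len(on_disk.decode("utf-8")), len(shown))
```

With composition applied, each pair becomes one code point, so a display that cannot handle combining characters would still draw the name correctly.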
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta