goofy charset selection for Unicode pastes
glynn.clements at virgin.net
Fri Aug 20 20:56:50 EDT 2004
Jamie Zawinski wrote:
> In Mozilla 1.7, view an HTML file containing this source:
> foo &mdash; bar
> Select it, and in XEmacs 21.4.15, do this:
> (charsets-in-string (get-selection 'PRIMARY 'UTF8_STRING))
> The result:
> (chinese-cns11643-1 ascii)
> If you insert that string (as with a paste) then you get a *Warnings*
> buffer that says:
> (font/warning) Unable to instantiate font for face default,
> charset chinese-cns11643-1
> So uh, how come I'm getting Chinese fonts for something as simple as
[Disclaimer: what follows is based upon a rather incomplete
understanding of mule-ucs, to say the least.]
The Unicode -> Mule translator searches the list of charsets specified
by unicode-basic-translation-charset-order-list until it finds one
which contains the requested character, in this case the em-dash
(U+2014).
The default setting of this variable (in un-define.el) is:
None of the ISO-8859-* family have em-dash (ISO-8859-1 doesn't and,
AFAIK, the rest of them all have essentially the same set of
"punctuation" characters). The first one which *does* have an em-dash
1. By default, XEmacs isn't set up to use the *-iso10646-1 fonts (and
I don't think that it can; if displaying Unicode was as simple as
selecting a Unicode font, I don't think that we'd be using mule-ucs).
2. mule-ucs doesn't understand the windows-125x encodings (and, if it
weren't for those, I doubt that many people would be using &mdash; in
the first place).
3. the choice of charset isn't determined by the presence or absence
of a suitable font; e.g. big5 also has an em-dash, but it comes after
cns11643-1, so having a big5 font won't help.
4. mule-ucs won't try to "approximate" a requested character, i.e. it
won't just give you an ASCII minus sign instead.
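
To make the search order and points 3 and 4 concrete, here is a toy
model in Python. It is only a sketch of the behaviour as I understand
it, not mule-ucs code: the function name, the order list and the
repertoires are all made up for illustration (the charset names are
real Mule ones, but each table holds just enough characters to show
the effect).

```python
# Toy model of an ordered charset search. Everything here is an
# assumption for illustration; none of it is taken from un-define.el.
EM_DASH = "\u2014"

# Hypothetical order list: Latin charsets first, then the CJK ones.
ORDER = ["ascii", "latin-iso8859-1", "chinese-cns11643-1", "chinese-big5-1"]

# Hypothetical repertoires, deliberately tiny.
REPERTOIRE = {
    "ascii": set(map(chr, range(0x00, 0x80))),
    "latin-iso8859-1": set(map(chr, range(0xA0, 0x100))),
    "chinese-cns11643-1": {EM_DASH},
    "chinese-big5-1": {EM_DASH},
}


def pick_charset(char, order=ORDER, installed_fonts=()):
    """Return the first charset in `order` that contains `char`, else None.

    `installed_fonts` is accepted but never consulted, mirroring point 3:
    the choice does not depend on which fonts are available.
    """
    for name in order:
        if char in REPERTOIRE[name]:
            return name
    # Point 4: no approximation, so no fallback to an ASCII "-".
    return None


print(pick_charset(EM_DASH))                                      # chinese-cns11643-1
print(pick_charset(EM_DASH, installed_fonts=["chinese-big5-1"]))  # chinese-cns11643-1
print(pick_charset("\u20ac"))                                     # None
```

The second call is the interesting one: even with only a big5 font
"installed", the earlier entry in the list still wins.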
> Also, the pasted Chinese character looks exactly like a tilde,
> not a dash at all.
A tilde is the standard representation for an undisplayable character
(see etc/HELLO for lots of tildes and lots of font warnings). If you
had the appropriate Chinese font installed, it would probably look
> Anyone understand this?
Not really; I just have the .el files and too much spare time.
Glynn Clements <glynn.clements at virgin.net>