On the sixteenth of July, Adrian Aichner wrote:
Do you get a "~" displayed instead?
I *suspect* he’s running a non-Mule binary; I haven’t ever seen the
three-digits symptom, but then I mostly run with Mule.
That's what I get, with the pasted character being described as:
Char: — (U+2014 chinese-cns11643-1 33 55) point=1 of 1(0%) column 0
That's what I get in a fundamental-mode buffer.
That’s correct behaviour, but the Unicode redisplay support on Win32 is
worse than on X11: it requires that every character in the Mule charset be
supported in the Win32 font, for example, which is actually impossible with
some of the CNS 11643 character sets. So the character doesn’t display as
such.
On my machine, though, I do have a Big5 font available (I’m not certain, but
I think I downloaded it from Adobe’s East Asian font pack), and once I
rearrange the character sets that get priority when translating from
Unicode:
(set-language-unicode-precedence-list
 '(ascii latin-iso8859-1 latin-iso8859-2
   latin-iso8859-3 latin-iso8859-4 thai-tis620 greek-iso8859-7 arabic-iso8859-6
   hebrew-iso8859-8 katakana-jisx0201 latin-jisx0201 cyrillic-iso8859-5 latin-iso8859-9
   latin-iso8859-15 composite control-1 japanese-jisx0208-1978 chinese-gb2312
   japanese-jisx0208 korean-ksc5601 japanese-jisx0212 chinese-big5-1 chinese-big5-2
   chinese-cns11643-1 chinese-cns11643-2 arabic-digit arabic-1-column arabic-2-column
   chinese-sisheng ascii-right-to-left indian-is13194 lao latin-iso8859-14 latin-iso8859-16
   ipa vietnamese-viscii-upper vietnamese-viscii-lower chinese-cns11643-3 chinese-cns11643-4
   chinese-cns11643-5 chinese-cns11643-6 chinese-cns11643-7 chinese-isoir165 ethiopic
   indian-2-column indian-1-column japanese-jisx0213-1 japanese-jisx0213-2 thai-xtis tibetan
   tibetan-1-column))
and paste the text again, it displays correctly.
Page
http://www.psychologytoday.com/articles/pto-20070622-000002.xml
asserts it's charset=iso-8859-1:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
but then it contains
evolved nature—human nature
which doesn't make sense to me.
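HTML entities such as that one can legally refer to any Unicode code point,
whatever encoding the page declares. The charset=[...] directive just
specifies how to interpret the octets of the page as characters; entities
are resolved after that step, against Unicode.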
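
To make the two steps concrete, here is a minimal sketch in Python. It assumes
the page carries the dash as the numeric reference &#8212; (the exact markup
is not shown in this thread, so that spelling is only a stand-in), and the
variable names are made up for illustration.

```python
import html

# Assumption: the dash is written as a numeric character reference.
# The thread does not show the real markup; "&#8212;" is a stand-in.
octets = b"evolved nature&#8212;human nature"

# Step 1: charset=iso-8859-1 only says how to turn octets into characters.
decoded = octets.decode("iso-8859-1")

# Step 2: character references are resolved afterwards, against Unicode,
# regardless of which charset was used in step 1.
resolved = html.unescape(decoded)

# The character that follows "evolved nature" in the resolved text.
dash = resolved[len("evolved nature")]

# U+2014 has no ISO 8859-1 octet of its own, which is why such a page
# has to use a reference for it instead of a raw byte.
try:
    "\u2014".encode("iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False
```

Here dash ends up as U+2014 even though every octet on the wire was plain
ASCII, and representable is False, so the page is consistent with its
declared charset.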
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta