I'm a bit confused now - I was under the impression that XEmacs 21.5 uses
Unicode internally. That is why I have been following it. The characters
that do get whacked are from different parts of Unicode - circled kana,
some Hangul (others pass fine), some Devanagari - it just seems strange
that these characters cannot be represented.
If representing them is not possible, a warning or error should definitely
be issued, because silently changing the file is really bad.
As for looking into other editors, XEmacs works great for me - I
don't intend to switch. The need for a good Unicode editor is still very
real. It would be great if XEmacs could fill that space :)
Anyway - thanks for your reply!
Regards,
v.
Stephen J. Turnbull wrote:
>>>>> "Vladimir" == Vladimir Weinstein <vweinste(a)earthlink.net> writes:
Vladimir> When you load a UTF-8 file in XEmacs, some characters
Vladimir> do not get displayed - instead, a GETA MARK (U+3013) is
Vladimir> displayed (looks like this 〓). This is generally ok for
Vladimir> display purposes, but the problem is that when I save
Vladimir> that file, all the characters that cannot be displayed
Vladimir> get replaced by the GETA MARK, which is unacceptable.
Sorry, you'll have to create a private charset to hold the characters
you need, or get another editor. The GETA MARK (an older convention
from the Japanese JIS X 0208 standard that serves the same purpose as
U+FFFD REPLACEMENT CHARACTER) means that XEmacs can't represent those
characters internally with the charsets currently available. Replacing
the current Mule internal representation with Unicode is planned, but
is not going to happen soon.
We should signal an error, though, or query the user about how to
handle those characters.
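
As a rough illustration of the "signal an error" option, here is a minimal sketch in Python (not XEmacs Lisp or C, and not how XEmacs is actually implemented). The predicate `latin1_only` is a hypothetical stand-in for "representable with the charsets currently available"; the function and exception names are likewise made up for this example. The idea is simply to refuse the conversion and report the offending code points rather than substitute GETA MARK silently.

```python
GETA_MARK = "\u3013"  # the substitute character discussed above


class UnrepresentableCharacterError(ValueError):
    """Raised when text holds characters the target repertoire lacks."""


def find_unrepresentable(text, is_representable):
    """Return (index, char) pairs for characters the predicate rejects."""
    return [(i, ch) for i, ch in enumerate(text) if not is_representable(ch)]


def convert_or_fail(text, is_representable):
    """Return text unchanged, or raise instead of substituting GETA MARK."""
    bad = find_unrepresentable(text, is_representable)
    if bad:
        details = ", ".join(f"U+{ord(ch):04X} at index {i}" for i, ch in bad)
        raise UnrepresentableCharacterError(f"cannot represent: {details}")
    return text


def convert_lossy(text, is_representable):
    """The silent behaviour complained about: replace and carry on."""
    return "".join(ch if is_representable(ch) else GETA_MARK for ch in text)


def latin1_only(ch):
    # Hypothetical repertoire: pretend only Latin-1 is representable.
    return ord(ch) < 0x100
```

With this sketch, `convert_or_fail("a\u32D0b", latin1_only)` raises and names U+32D0 (CIRCLED KATAKANA A), whereas `convert_lossy` quietly turns it into U+3013, which is the data loss described in this thread. Querying the user would sit between the two: catch the exception, show the list, and ask whether to abort or to save lossily.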
GNU Emacs may work for your purposes; it has a somewhat extended
repertoire of characters, although I think that their released
versions do not cover all of Unicode yet.