Re: Characters lost when saving UTF-8 file on win32 (21.5.16)

Friday, 10 October 2003

I'm a bit confused now - I was under the impression that XEmacs 21.5 uses
Unicode internally. That is why I have been following it. The characters
that get whacked come from different parts of Unicode - circled kana,
some Hangul (others pass fine), some Devanagari - so it seems strange
that these characters cannot be represented.

If representing them is not possible, a warning or error should
definitely be issued, because silently changing the file is really bad.

As for looking into other editors: XEmacs works great for me, and I
don't intend to switch. The need for a good Unicode editor is very
much present. It would be great if XEmacs could fill that space :)

Anyway - thanks for your reply!

Regards,
v.

Stephen J. Turnbull wrote:

...
>>>>> "Vladimir" == Vladimir Weinstein <vweinste(a)earthlink.net> writes:

    Vladimir> When you load a UTF-8 file in xemacs, some characters
    Vladimir> do not get displayed - instead, a GETA MARK (U+3013) is
    Vladimir> displayed (looks like this 〓). This is generally ok for
    Vladimir> display purposes, but the problem is that when I save
    Vladimir> that file, all the characters that cannot be displayed
    Vladimir> get replaced by the GETA MARK, which is unacceptable.

Sorry, you'll have to create a private charset to hold the characters
you need, or get another editor.  The GETA MARK (which is an older
convention for U+FFFD REPLACEMENT CHARACTER, from the Japanese JIS X
0208 standard) means that XEmacs can't represent those characters
internally with the charsets currently available.  Replacing the
current Mule internal representation with Unicode is planned, but is
not going to happen soon.

We should signal an error, though, or query the user about how to
handle those characters.
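
Until such a check exists in the editor, the damage can at least be
detected from outside. What follows is a minimal Python sketch, not
XEmacs code: it compares the text before and after a save and reports
every character that turned into a GETA MARK. The function name
find_silent_replacements is hypothetical, and the sketch assumes the
substitution is one-for-one per code point, which this thread does not
confirm.

```python
GETA_MARK = "\u3013"  # the substitute character described above


def find_silent_replacements(original: str, saved: str) -> list[tuple[int, str]]:
    """Return (index, original_char) for each character that became a GETA MARK.

    Assumes a one-for-one substitution, so both texts must have the
    same number of code points.
    """
    if len(original) != len(saved):
        raise ValueError("texts differ in length; cannot compare position by position")
    return [
        (i, before)
        for i, (before, after) in enumerate(zip(original, saved))
        # A GETA MARK that was already in the original is not a loss.
        if after == GETA_MARK and before != GETA_MARK
    ]


# Example: U+32D0 (a circled katakana) is lost, while the GETA MARK
# that was genuinely part of the original text is left alone.
original = "A\u32d0B\u3013"
saved = "A\u3013B\u3013"
lost = find_silent_replacements(original, saved)
for index, char in lost:
    print(f"index {index}: U+{ord(char):04X} was replaced")
```

In practice the two strings would come from reading the original file
and the saved copy as UTF-8; an empty result means no character was
replaced in this way.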

GNU Emacs may work for your purposes; it has a somewhat extended
repertoire of characters, although I think that their released
versions do not cover all of Unicode yet.
