how to add support for more Unicode characters?

Hrvoje Niksic hniksic at xemacs.org
Thu Jun 23 05:29:14 EDT 2005


"Stephen J. Turnbull" <stephen at xemacs.org> writes:

> (1) The standard _requires_ that this be treated as an error
> condition.  The standard says nothing about error recovery; anything
> we do is OK as long as we don't claim that we are writing Unicode
> after encountering such a condition.  What we write may look like
> Unicode, it may stink like Unicode, but we have to label it something
> else (or be nonconformant).

We can simply claim not to implement Unicode semantics on files that
are not conformant to begin with.

> I think this should be implemented as "Somebody told me this file is
> Unicode, but it's not.

That might be acceptable as long as somebody really did tell it, such
as when the file contains a coding cookie, or comes from an
environment where charsets can be clearly marked (MIME part, HTTP
entity, whatever).  Most users who type C-x C-f don't do so with
strict Unicode compliance in mind.
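
To make "somebody really did tell it" concrete, here is a rough sketch of
checking whether a file explicitly declares its coding.  It is written in
Python purely for illustration and is not XEmacs code; the regular
expression is modeled on the one PEP 263 uses for Python source files, and
the name declared_coding and the two-line limit are assumptions of this
sketch, not a description of how XEmacs finds cookies.

```python
import re

# Assumption: a cookie looks like "coding: NAME" or "coding=NAME",
# as in "-*- coding: utf-8 -*-".  Pattern modeled on PEP 263.
COOKIE_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def declared_coding(head: bytes):
    """Return the coding named in the first two lines, or None.

    None means nobody told us anything, so any charset we pick
    is a guess rather than a claim made by the file.
    """
    for line in head.splitlines()[:2]:
        match = COOKIE_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return None
```

Only when such a declaration (or an external label such as a MIME charset)
is present would it seem fair to say the file claimed to be Unicode.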

Now that the Unix world (or at least much of it) has moved to the
UTF-8 en_US locales, many users are getting stuck with a UTF-8
"environment" even though they never wanted it.

> (2) For most applications (AUCTeX is not one, Aidan's approach to
> AUCTeX's needs looks reasonable) this is what the user wants, IMHO.

I disagree -- see above.

Another real problem with this feature is that I'm worried about the
consequences it might have on the use of the editor.  Say the user
visits a Latin 1 file written in French.  What happens?  What do the
accented characters look like in his buffer?  How will he know that
something went wrong?
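
The Latin 1 case can be illustrated outside the editor.  The following is
only a sketch in Python, not a description of what XEmacs does: it decodes
Latin 1 bytes as UTF-8 once strictly and once with substitution.  The
sample text and the function names are made up for the example.

```python
# French text as it would sit on disk in a Latin 1 file.
latin1_bytes = "café français".encode("latin-1")


def decode_strict(data: bytes):
    """Treat invalid UTF-8 as an error; return None on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_lenient(data: bytes) -> str:
    """Substitute U+FFFD for each byte that is not valid UTF-8."""
    return data.decode("utf-8", errors="replace")


strict_result = decode_strict(latin1_bytes)    # None: the decode fails
lenient_result = decode_lenient(latin1_bytes)  # "caf\ufffd fran\ufffdais"
```

With strict decoding the user gets an error; with substitution both
accented letters silently become U+FFFD replacement characters, and nothing
else indicates that the wrong charset was assumed.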




More information about the XEmacs-Beta mailing list