how to add support for more Unicode characters?
hniksic at xemacs.org
Thu Jun 23 05:29:14 EDT 2005
"Stephen J. Turnbull" <stephen at xemacs.org> writes:
> (1) The standard _requires_ that this be treated as an error
> condition. The standard says nothing about error recovery; anything
> we do is OK as long as we don't claim that we are writing Unicode
> after encountering such a condition. What we write may look like
> Unicode, it may stink like Unicode, but we have to label it something
> else (or be nonconformant).
We can simply claim not to implement Unicode semantics on files that
are not conformant to begin with.
> I think this should be implemented as "Somebody told me this file is
> Unicode, but it's not.
That might be acceptable as long as somebody really did tell it, such
as when the file contains a coding cookie, or comes from an
environment where charsets can be clearly marked (MIME part, HTTP
entity, whatever). Most users who type C-x C-f don't do that with
strict Unicode compliance in their minds.
Now that the Unix world (or at least much of it) has moved to the UTF-8
en_US locales, many users are getting stuck with a UTF-8 "environment"
even though they never wanted it.
> (2) For most applications (AUCTeX is not one, Aidan's approach to
> AUCTeX's needs looks reasonable) this is what the user wants, IMHO.
I disagree -- see above.
Another real problem with this feature is that I'm worried about the
consequences it might have on the use of the editor.  Say the user
visits a Latin 1 file written in French. What happens? What do the
accented characters look like in his buffer? How will he know that
something went wrong?
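
To make the question concrete, here is a rough sketch in Python (not
XEmacs code, and not a claim about how XEmacs does or should behave) of
what a Latin 1 French string looks like when something insists on
reading it as UTF-8.  The sample text and the variable names are made
up for illustration; the three error-handling modes are Python's own
and merely stand in for possible recovery policies.

```python
# Illustration only: Python's codec error handlers stand in for the
# recovery policies under discussion; none of this is XEmacs code.

# "café déjà vu" as it would be stored in a Latin 1 file.
latin1_bytes = b"caf\xe9 d\xe9j\xe0 vu"

# Policy 1: strict.  The input is not valid UTF-8, so decoding fails.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Policy 2: lossy recovery.  Each invalid byte is shown as U+FFFD, so
# the user sees replacement characters where the accents were, and the
# original bytes can no longer be recovered from the decoded text.
replaced = latin1_bytes.decode("utf-8", errors="replace")

# Policy 3: round-trippable recovery.  Each invalid byte is mapped to a
# lone surrogate (0xE9 becomes U+DCE9), which is not valid Unicode text
# but lets the file be written back byte for byte.
roundtrip = latin1_bytes.decode("utf-8", errors="surrogateescape")
restored = roundtrip.encode("utf-8", errors="surrogateescape")

print(strict_ok)
print(replaced)
print(restored == latin1_bytes)
```

Under the second policy the user at least sees that something is off,
but saving the buffer destroys the file; under the third the file
survives, but what sits in the buffer is not Unicode, which is the
labelling problem raised in point (1) above.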