Re: [PATCH] Have VM figure out what MIME character set to use by coverage

Saturday, 26 March 2005

 On the twenty-seventh of March, Hrvoje Nikšić wrote: 

...
 "Stephen J. Turnbull" <stephen(a)xemacs.org> writes:

 >>>>>> "Hrvoje" == Hrvoje Niksic <hniksic(a)xemacs.org> writes:
 >
 >     Hrvoje> It is my understanding that "un-define" and "unicode" are
 >     Hrvoje> not merely deprecated, but that they have always been
 >     Hrvoje> unfinished, buggy, and ultimately unsupported.  I could be
 >     Hrvoje> mistaken.
 >
 > This is true of Mule-UCS, it's not true of Unicode support, at least
 > on Windows, in 21.5.

 Why "at least on Windows"?  It would be a shame if the 21.5 Unicode
 support were somehow tied to Windows.  If anything, it would leave out
 all the Unix users -- a vast majority. 

It’s not, don’t worry. What Unicode support exists in 21.5 is
cross-platform, but it’s also incomplete, in that it trashes data if Unicode
codepoints are encountered that don’t have mappings to the extant Mule
character sets. 

...
 >     >> > Charming.  Why do we ever fall back to "iso-2022-jp" for
 >     >> > things we send over the wire?  Signalling an error would be
 >     >> > kinder to the user and her correspondents.
 >
 > Since when do we "fall back" to iso-2022-jp?

 I don't know, but that's what the code comment seems to indicate. 

VM always has. Gnus signalled an error. 

...
 > It's a misreading of the correct statement that ISO 8859 defines
 > versions of ISO 2022 to permit use of facilities like designation of
 > additional charsets.  ISO 8859 doesn't permit that.
 [...]
 > What we do have is latin-unity, but you don't like that either.

 latin-unity pretty much broke my XEmacs when I tried it.  For example,
 byte-compiling a file prompted me with questions I didn't know how to
 respond to.  (There were other examples of similar breakage.)  Perhaps
 those bugs have been fixed in the meantime, I don't know.

 BTW in the quoted text you seem to be implying that latin-unity
 somehow implements a more correct ISO 8859, e.g. allowing me to read
 files as Latin 2 without interpreting ISO 2022 sequences.  Did I
 misread it?  I wasn't aware that latin-unity did anything of the sort. 

I didn’t understand that paragraph of Stephen’s mail _at all_, so I would
welcome a clarification too. 

And, no, we don’t have ISO 8859 coding systems that avoid falling back to
ISO 2022 escape sequences when they encounter characters they can’t
encode. The FSF have implemented them, and they should be reasonably
portable, should you be short of things to do. 

...
 >     Hrvoje> My point is that we should strive to remove such code, not
 >     Hrvoje> add more of it.  YMMV.
 >
 > I agree, but not at the expense of corrupting user data.

 I don't know what corruption you're referring to, but I'll take your
 word for it. 

The corruption he’s referring to is what you see, for example, when you do

  (encode-coding-string  (make-char 'greek-iso8859-7 34) 'iso-8859-1)

--any character that’s not understood by the coding system being used maps
to tilde, which loses data. What happens when you do

  (encode-coding-string  (make-char 'greek-iso8859-7 34) 'iso-8859-2)

is _slightly_ preferable to that, though not very. 
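
For comparison, the "signal an error rather than mangle the data" behaviour
could be sketched roughly as follows. This is only an illustration, not code
from VM, Gnus or XEmacs: strict-encode-string is a hypothetical name, and the
sketch assumes GNU Emacs's unencodable-char-position, which may well not be
available in XEmacs 21.5.

```elisp
;; Hypothetical helper, for illustration only.  Assumes GNU Emacs's
;; `unencodable-char-position'; XEmacs 21.5 may not provide it.
(defun strict-encode-string (string coding-system)
  "Encode STRING in CODING-SYSTEM, or signal an error if it would lose data."
  (let ((pos (unencodable-char-position
              0 (length string) coding-system nil string)))
    (if pos
        ;; Refuse to encode instead of substituting a tilde or
        ;; emitting escape sequences the recipient may not expect.
        (error "Cannot encode character %c (index %d) in %s"
               (aref string pos) pos coding-system)
      (encode-coding-string string coding-system))))
```

With something like this, asking for a Greek character in iso-8859-1 should
fail loudly instead of quietly producing a tilde, while plain ASCII text is
encoded as usual.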

-- 
“I, for instance, am gung-ho about open source because my family is being
held hostage in Rob Malda’s basement. But who fact-checks me, or Enderle,
when we say something in public? No-one!” -- Danny O’Brien
