On the seventeenth of July, Julian Bradfield wrote:

> I've recently had all the non-UTF8 non-ASCII mail in my folders
> corrupted, irrecoverably so (short of searching through many days'
> backups, which I can't do myself). The cause of the corruption is bugs
> in VM, exposed by my switching all my coding system defaults to utf-8.

All my relevant coding systems have been switched to UTF-8; I don’t see any
such bugs with the latest VM. With 21.5’s implementation of Unicode, that is.
Have you reported the ones you see? (I haven’t searched Usenet or the
mailing lists.)

> The reason it's irrecoverable is the putrid pile of dingos' kidneys
> that is mule-ucs, and in particular the way it does no validity
> checking at all when it decodes alleged utf-8 (rather than copying the
> invalid bytes into the buffer as Latin1, as the ISO2022, SJIS and Big5
> methods do).

Great, looks like we should make a release and deprecate 21.4, to prevent
people from thinking Mule-UCS a reasonable excuse for software.
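
To make the contrast concrete, here is a minimal Python sketch of the
fallback behaviour described above: validate the UTF-8, and copy any
invalid byte through as the Latin-1 character with the same value. It is
not XEmacs or Mule-UCS code; the names `latin1_fallback` and
`decode_utf8_lenient` are made up for the illustration.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: turn each undecodable byte into its Latin-1 character."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Latin-1 maps every byte 0x00-0xFF to the code point of the same value,
    # so the offending bytes stay visible in the decoded text.
    return bad.decode("latin-1"), err.end


# Hypothetical handler name, registered only for this sketch.
codecs.register_error("latin1_fallback", _latin1_fallback)


def decode_utf8_lenient(data: bytes) -> str:
    """Decode valid UTF-8 normally; pass invalid bytes through as Latin-1."""
    return data.decode("utf-8", errors="latin1_fallback")


if __name__ == "__main__":
    print(decode_utf8_lenient("héllo".encode("utf-8")))  # well-formed UTF-8
    print(decode_utf8_lenient(b"caf\xe9"))               # stray Latin-1 byte
```

The point is only that nothing is silently thrown away at decode time, so
a mislabelled message can still be inspected afterwards.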

> [...] On that topic, it's a sad truth that PRC-locale software
> (especially that made by Microsoft) advertises text as GB2312 when in
> fact it's GBK or even GB18030. This is just too big a fact to ignore.
> So what I would like to do is arrange that my "gb2312" coding system
> actually decodes GB18030 on read, but correctly only puts out real
> GB2312 on write. I can't see any easy way to arrange this in Lisp. Is
> there one?

No way from Lisp, but it should be reasonably workable from C. Thanks for
reminding us of this.
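
For what it's worth, the asymmetry being asked for is easy to state
outside of Emacs. The following is only a Python sketch of the idea,
using Python's own `gb18030` and `gb2312` codecs; it says nothing about
how the C implementation would look, and both function names are
invented for the example.

```python
def decode_advertised_gb2312(data: bytes) -> str:
    """Be liberal on read: treat text labelled GB2312 as GB18030."""
    return data.decode("gb18030")


def encode_strict_gb2312(text: str) -> bytes:
    """Be strict on write: emit real GB2312 only.

    Raises UnicodeEncodeError for any character outside GB2312.
    """
    return text.encode("gb2312")


if __name__ == "__main__":
    # Plain GB2312 text reads and writes unchanged.
    simplified = encode_strict_gb2312("中文")
    print(decode_advertised_gb2312(simplified))

    # A character that is in GBK/GB18030 but not GB2312 can be read...
    mislabelled = "龍".encode("gb18030")
    print(decode_advertised_gb2312(mislabelled))

    # ...but is refused on write.
    try:
        encode_strict_gb2312("龍")
    except UnicodeEncodeError as exc:
        print("not representable in GB2312:", exc.reason)
```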
--
Where might my nephew Yoghurtu Nghe be now, who had to flee the village
in haste because of the shortage of rhinoceroses?
_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta