"Stephen J. Turnbull" <stephen(a)xemacs.org> writes:
> Aidan Kehoe writes:
> > David Kastrup’s use case in
> > that keeping the invalid sequences around in some form is
> > preferable to not doing so.
> Heh. That's *exactly* the use case for Lisp codecs. Let David be
> David, and let me not be bothered. ;-)
> Note that in the end David didn't want invalid sequences preserved; he
> wanted them ignored. (That's exactly the use case that inspired the
> definition of DTRT used above; I'm sure it's not always the one that
> would be desired.)
Wherever you got your inspiration from, David most certainly did _not_
want invalid sequences ignored. That would have been quite useless.
> there's no problem, of course. If it's not valid UTF-8, then
> should be warned, loudly. BTW, under what circumstances do you end up
> trashing the Latin-1? A quick test with (nearly current) 21.5 shows
> that Latin-1 read as UTF-8, edited in the ASCII portion, and saved as
> UTF-8, seems to preserve the expected text intact.
Which is the right thing, I should say. Or, if you want to, a way to
deal with a broken thing without disturbing the pieces, so that they can
be sucked up and reassembled when turning back the clock.
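To make the difference between "preserved" and "ignored" concrete, here is a
minimal sketch in Python rather than XEmacs Lisp. It assumes Python 3's
"surrogateescape" error handler as a stand-in for keeping the invalid
sequences around in some form; it says nothing about how any XEmacs codec is
implemented, and the function names roundtrip_edit and ignore_invalid are
made up for the illustration.

```python
# Illustration only: Python's "surrogateescape" handler stands in for
# "keep the invalid sequences around in some form". XEmacs codecs may
# work quite differently; the helper names here are hypothetical.

def roundtrip_edit(raw: bytes, old: str, new: str) -> bytes:
    """Decode as UTF-8, edit an ASCII portion, and re-encode.

    Bytes that are not valid UTF-8 become lone surrogates
    (U+DC80..U+DCFF) on decode and are written back unchanged on encode.
    """
    text = raw.decode("utf-8", errors="surrogateescape")
    text = text.replace(old, new)
    return text.encode("utf-8", errors="surrogateescape")


def ignore_invalid(raw: bytes) -> bytes:
    """The lossy alternative: silently drop every invalid sequence."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# Latin-1 text that is not valid UTF-8 (0xEF and 0xE9 are stray bytes).
latin1 = "na\u00efve caf\u00e9".encode("latin-1")

preserved = roundtrip_edit(latin1, "caf", "Caf")  # edit touches ASCII only
dropped = ignore_invalid(latin1)                  # accented letters are lost
```

With the preserving handler the two Latin-1 bytes survive the edit and the
file can still be reread as Latin-1 afterwards; with "ignore" they are gone
for good, which is the outcome called useless above.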
David Kastrup, Kriemhildstr. 15, 44793 Bochum
XEmacs-Beta mailing list