Re: #'query-coding-region and invalid Unicode sequences.

Saturday, 17 January 2009

        "Stephen J. Turnbull" <stephen(a)xemacs.org> writes:

...
 Aidan Kehoe writes:

  > David Kastrup’s use case in
  > http://mid.gmane.org/85fyvc3efj.fsf＠lola.goethe.zz convinced me
  > that keeping the invalid sequences around in some form is
  > preferable to not doing so.

 Heh.  That's *exactly* the use case for Lisp codecs.  Let David be
 David, and let me not be bothered. ;-)

 Note that in the end David didn't want invalid sequences preserved; he
 wanted them ignored.  (That's exactly the use-case that inspired the
 definition of DTRT used above; I'm sure it's not always the one that
 would be desired.)

Wherever you got your inspiration from, David most certainly did _not_
want invalid sequences ignored.  That would have been quite useless.

...
 there's no problem, of course.  If it's not valid UTF-8, then the user
 should be warned, loudly.  BTW, under what circumstances do you end up
 trashing the Latin-1?  A quick test with (nearly current) 21.5 shows
 that Latin-1 read as UTF-8, edited in the ASCII portion, and saved as
 UTF-8, seems to preserve the expected text intact.

Which is the right thing, I should say.  Or, if you want to, a way to
deal with a broken thing without disturbing the pieces, so that they can
be sucked up and reassembled when turning back the clock.
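
[Illustration, not part of the original message.]  The round trip
discussed above (Latin-1 bytes read as UTF-8, edited in the ASCII
portion, written back as UTF-8 with the invalid bytes intact) can be
sketched outside XEmacs.  The sketch below is only an analogy: it uses
Python's standard "surrogateescape" error handler, which is not the
mechanism XEmacs 21.5 uses, and the sample text and the helper name
roundtrip_edit are made up for the example.

```python
# Latin-1 bytes for accented text; 0xE9 and 0xEF are not valid UTF-8 here.
original = "caf\u00e9 na\u00efve, plain ASCII tail".encode("latin-1")


def roundtrip_edit(raw: bytes) -> bytes:
    """Decode as UTF-8, edit only ASCII text, and re-encode (hypothetical helper).

    Invalid bytes are carried through as lone surrogates by the
    "surrogateescape" handler and turned back into the same bytes on output.
    """
    text = raw.decode("utf-8", errors="surrogateescape")
    edited = text.replace("plain", "edited")  # touches the ASCII portion only
    return edited.encode("utf-8", errors="surrogateescape")


result = roundtrip_edit(original)

# The invalid bytes survive untouched; only the ASCII edit differs.
preserved = result == original.replace(b"plain", b"edited")

# Ignoring invalid sequences instead silently loses them.
ignored = original.decode("utf-8", errors="ignore").encode("utf-8")
lossy = ignored != original

# A strict decode refuses the input, which is the "warn loudly" case.
try:
    original.decode("utf-8")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
```

Under these assumptions, preserving the bytes lets the file be
reinterpreted as Latin-1 later, whereas ignoring them makes that
impossible.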

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
