Re: package-get & md5

Sunday, 16 May 1999

        [ Added Cc to xemacs-mule ]

Jan Vroonhof <vroonhof(a)math.ethz.ch> writes:

...
 However I still think there is a problem with file coding
 there. This code presumably worked in latin-1 and Japanese
 environments. Given that he also has a problem writing the file out
 again I think this illustrates a general problem. IMHO, all coding
 systems should be such that when an arbitrary binary file is read in
 with them and written back out, the result is identical to the original.
Even I admit this is impossible.  I say "even I" because I've had
many requirements along similar lines, but this one really can't be
met in the current framework.

A little background: when we "read in" a file, we always convert it to 
the internal representation.  That's what the coding systems are for.
All the other manipulations are performed with the internal data.
Now, when we "write out" the file, we convert it to an external
representation, which might or might not be the same as the one the
file was read in with.
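
To make the read/convert/write cycle concrete, here is a minimal sketch
in Python rather than in Mule itself.  Python's codecs stand in for
coding systems purely as an analogy, and the round_trip helper is a
name made up for this illustration.  It shows that whether the bytes
survive depends entirely on the coding system chosen.

```python
# Analogy only: Python codecs play the role of Mule coding systems here.
raw = bytes(range(256))  # stand-in for an arbitrary binary file


def round_trip(data, codec):
    """Decode to the internal representation, then encode it back out."""
    text = data.decode(codec, errors="replace")   # "read in"
    return text.encode(codec, errors="replace")   # "write out"


# latin-1 maps every byte to exactly one character, so nothing is lost.
latin1_ok = round_trip(raw, "latin-1") == raw

# utf-8 rejects most byte values above 127 when they appear on their own,
# so they are replaced on input and the original bytes cannot be recovered.
utf8_ok = round_trip(raw, "utf-8") == raw

print(latin1_ok, utf8_ok)
```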

Even when reading and writing representations are the same, you can
lose data.  As Stephen J. Turnbull illustrated: imagine that we are
reading a binary file with ISO-2022 and that it contains binary
sequences

<Ltn2><Ltn1><Ltn2>some-chars...

The switch from Latin 2 to Latin 1 and back to Latin 2 will be lost on 
the input because it will have been "optimized away".  And these
switching sequences are by no means rare.  :-(
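
The effect can be sketched with a toy model in Python.  This is not
how Mule implements ISO-2022; the decode and encode functions are
hypothetical, and the model assumes only two designation sequences
(ESC - A for Latin 1 and ESC - B for Latin 2).  The point is just
that the internal representation records which charset each byte
belongs to, not how many switches preceded it.

```python
# Toy model only: real ISO-2022 handling is far more involved.
LATIN1 = b"\x1b-A"  # assumed designation sequence for Latin 1
LATIN2 = b"\x1b-B"  # assumed designation sequence for Latin 2
DESIGNATORS = {LATIN1: "latin-1", LATIN2: "latin-2"}
SEQUENCES = {name: seq for seq, name in DESIGNATORS.items()}


def decode(data, default="latin-1"):
    """External bytes -> internal list of (charset, byte) pairs."""
    charset = default
    internal = []
    i = 0
    while i < len(data):
        seq = data[i:i + 3]
        if seq in DESIGNATORS:
            # A switch only updates decoder state; nothing is stored.
            charset = DESIGNATORS[seq]
            i += 3
        else:
            internal.append((charset, data[i]))
            i += 1
    return internal


def encode(internal, default="latin-1"):
    """Internal pairs -> external bytes, switching only when needed."""
    charset = default
    out = bytearray()
    for cs, byte in internal:
        if cs != charset:
            out += SEQUENCES[cs]
            charset = cs
        out.append(byte)
    return bytes(out)


original = LATIN2 + LATIN1 + LATIN2 + b"\xe8\xf8"
round_tripped = encode(decode(original))
print(len(original), len(round_tripped))
```

In this model the written file keeps a single switch to Latin 2 and
drops the two redundant ones, so it is shorter than the file that was
read.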

If you consider that all of our Latin N coding systems are in fact
ISO-2022 with the appropriate default for the 160-255 range, you see
why Mule is in trouble.
