Re: changing the values of iso-8859-* charsets

Friday, 28 October 2005

        Stephen J. Turnbull wrote:

...
>>>>> "Ben" == Ben Wing <ben(a)666.com> writes:

    Ben> currently we use 32-127 for the values of the chars in the
    Ben> iso-8859 charsets.  maybe that was needed under old-mule, but
    Ben> in unicode-internal charsets can have values in any arbitrary
    Ben> interval or rectangle in 256 or 256x256 space.  shouldn't we
    Ben> use 160-255?  this would only matter in the output of
    Ben> `split-char'; `make-char' already goes either way.

No.  This is gratuitous incompatibility with ISO 2022, legacy X11 font
indexing, and other Emacsen.  Why buy trouble changing a public API?

it's the other way around.  the current situation is incompatible with 
the X11 fonts, so we have to hack the values using the bogus `graphic' 
characteristic.

note that in the new world, charsets can have values > 127 in any case.  
cf. big5, shift-jis, etc.
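
To make the two index conventions concrete, here is a minimal sketch in Python (not XEmacs code). It assumes the 32-127 and 160-255 forms of a single-byte charset differ only in the high bit, and it uses Python's built-in codecs to show that big5 and shift-jis byte values already exceed 127. The helper names `low_index` and `high_index` are hypothetical.

```python
def low_index(byte_value):
    """Clear the high bit: an 8-bit value in 160-255 becomes 32-127."""
    return byte_value & 0x7F

def high_index(index):
    """Set the high bit: a 7-bit value in 32-127 becomes 160-255."""
    return index | 0x80

# e-acute as stored in an iso-8859-1 byte stream.
e_acute = "\u00e9".encode("latin-1")[0]
print(e_acute, low_index(e_acute), high_index(low_index(e_acute)))

# Multi-byte charsets: every byte of these two characters is above 127.
sjis = "\u3042".encode("shift_jis")   # HIRAGANA LETTER A
big5 = "\u4e2d".encode("big5")        # CJK ideograph "middle"
print(list(sjis), list(big5))
```

The round trip through `low_index` and `high_index` loses nothing for values in 160-255, which is why a constructor can plausibly accept either form while the reverse operation has to pick one.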

so when i'm creating a new charset like `latin-windows-1252', which is 
compatible with iso-8859-1 but has extra chars in the range 128-159, do 
i do the right thing and have its chars in the range 128-255 be indexed 
as 128-255 (and hence be inconsistent with the `latin-iso8859-1' 
charset), or do i do the wrong thing and move its range down to 0-127?  
and then it appears to have ascii control chars in the range 0-31, but 
they aren't control chars, value 10 is not linefeed, value 13 is not cr, 
etc.?
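
The indexing problem can be illustrated with a short Python sketch. It relies on Python's built-in `cp1252` and `latin-1` codecs as stand-ins for `latin-windows-1252` and `latin-iso8859-1`; that correspondence is an assumption for illustration, and `decode_or_none` is a hypothetical helper, not part of any Emacs API.

```python
def decode_or_none(byte_value, codec):
    """Return the character a single byte decodes to, or None if unassigned."""
    try:
        return bytes([byte_value]).decode(codec)
    except UnicodeDecodeError:
        return None

# Bytes 160-255 decode identically under both codecs.
upper_half_agrees = all(
    decode_or_none(b, "cp1252") == decode_or_none(b, "latin-1")
    for b in range(160, 256)
)

# Bytes 128-159 are the extra printable characters of windows-1252.
assigned = {
    b: ch
    for b in range(128, 160)
    if (ch := decode_or_none(b, "cp1252")) is not None
}

# If the 128-255 range were moved down to 0-127, index 10 would refer
# to byte 138 and index 13 to byte 141. Neither is a control character.
shifted_10 = decode_or_none(10 + 128, "cp1252")
shifted_13 = decode_or_none(13 + 128, "cp1252")

print(upper_half_agrees, len(assigned), repr(shifted_10), repr(shifted_13))
```

Under the shifted scheme, index 10 names a printable letter rather than linefeed, and index 13 is not assigned at all in Python's table, so reading the low range as ASCII control codes would be wrong in both cases.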

ben
