Re: XEmacs doesn't support windows-1252

Thursday, 26 July 2007

 On the twenty-sixth day of July, Mike FABIAN wrote: 

...
 And when I check this character in an e-mail I received which
 uses charset=windows-1252, I get

     (split-char ?”)
     (chinese-gb2312 33 49)

 and the character is displayed in a very ugly Chinese font which looks
 double width even in the X-frame.  In the terminal, XEmacs again
 treats the character as double width but the terminal treats it as
 single width which causes problems.

     (split-char (string-to-char
       (decode-coding-string (format "%c%c%c" #xe2 #x80 #x9d) 'utf-8)))
     (latin-iso8859-16 53)

 Is it possible to get the sane mapping to a European [character set]
 *always*? 

Sure. You can structure the make-8-bit-coding-system call like so:

(make-8-bit-coding-system
 'windows-1252
 (list 
   (list #x80 (decode-char 'ucs #x20AC)) ;; EURO SIGN
   (list #x82 (decode-char 'ucs #x201A)) ;; SINGLE LOW-9 QUOTATION MARK
[...]
   (list #x9F (decode-char 'ucs #x0178))) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS
 "Microsoft's Code Page 1252, for Western Europe and the Americas."
 '(mnemonic "cp1252"
   documentation
   "This is an extension of ISO 8859-1 that provides the Euro sign and
several punctuation marks not otherwise available in ISO 8859-1. It is
incompatible with ISO 2022, which is not a problem in the regions
where it is used."
   aliases (cp1252)))

and have the octets on disk always represented in the jit-ucs-charset-?
XEmacs character set, because the evaluation then takes place at dump time,
when the other Mule character sets are not available. This has two
disadvantages: it’s against the spirit of ISO 2022 (when it comes to our
auto-save files), and case information is not preserved for the ISO 8859-1
characters (to my surprise; that’s something I need to look into).
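
For anyone who wants to double-check the octet-to-code-point pairs outside
XEmacs, here is a minimal sketch in Python. It relies on Python's built-in
cp1252 codec rather than anything in XEmacs, the helper name
cp1252_code_point is made up for illustration, and it covers only the three
entries written out in the list above plus the quotation mark from the
report (assumed here to be octet #x94 in windows-1252).

```python
import unicodedata

# The three entries spelled out in the list above: (octet, code point).
entries = [(0x80, 0x20AC), (0x82, 0x201A), (0x9F, 0x0178)]


def cp1252_code_point(octet):
    """Return the code point Python's cp1252 codec assigns to one octet.

    Hypothetical helper, not part of XEmacs or of any library.
    """
    return ord(bytes([octet]).decode("cp1252"))


for octet, expected in entries:
    actual = cp1252_code_point(octet)
    name = unicodedata.name(chr(actual))
    print(f"#x{octet:02X} -> U+{actual:04X} {name} "
          f"(matches list: {actual == expected})")

# The quotation mark from the report: the windows-1252 octet #x94 and the
# UTF-8 sequence E2 80 9D both decode to the same code point.
from_cp1252 = bytes([0x94]).decode("cp1252")
from_utf8 = bytes([0xE2, 0x80, 0x9D]).decode("utf-8")
print(from_cp1252 == from_utf8)
```

Both decodings give the same Unicode character, so the different results
seen inside XEmacs come from how that code point is mapped to a Mule
character set, not from the bytes in the message.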

...
 Footnotes: 
 ¹ I type this character using an entry in my ~/.Xmodmap, no matter
 whether I am typing into an X-frame or into a terminal frame. I.e. I type
 it the same way but the results are different. 

Right. In the terminal it depends on the current Unicode -> Mule mapping,
while under X11 it depends on the Unicode -> Mule mapping in effect when
XEmacs first appeared on the relevant display, which is normally before
~/.xemacs/init.el was evaluated.

I had planned to make available an XEmacs with Unicode as an internal
encoding (and I have most of the necessary work for that on my disk), but I
don’t think that is maintainable in the presence of both the Mule and the
non-Mule compilation options (which means, with Unicode as a further
internal encoding, four ./configure combinations against which everything
needs to be tested whenever anyone makes a change that’s relevant to
character encoding). And the elimination of the non-Mule compilation options has been
vetoed, so it looks like users are stuck with the internal Mule encoding.

-- 
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)

_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
