Re: RFC: (set-xkb-cyrillic-charset "koi8-r") for non-Mule XEmacs

Wednesday, 9 May 2001

        ...
>>>> "SJT" == Stephen J Turnbull <turnbull(a)sk.tsukuba.ac.jp> writes:

Alexey> In this case first clause should be (it should modify
Alexey> cyrillic_koi8_r[] that is stored somewhere in some hashtable
Alexey> and used by x_keysym_to_character():

SJT> That's what I wanted to avoid.  OK, I see that it's probably
SJT> unavoidable in no-mule.  And it's probably right to do it inside
SJT> that switch (this automatically makes it configurable according
SJT> to character set,
SJT> so that the iso-to-koi translation does NOT get
SJT> applied in the Latin-2 buffer, which would really hinder a
SJT> Polish/Russian translator).

Just for clarity: there is no "iso-to-koi" translation here and no
concept of a "Latin-2 buffer".  I am talking about non-Mule, where
there is only a one-to-one translation between the internal
representation (a single byte), the font encoding and the external
representation (bytes in files).  It is a very common kind of setup,
AFAIK.

We just need to implement various translations between Cyrillic_XXX
and those one-byte encodings.
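
To make this concrete, here is a rough sketch in C of what one such
translation could look like.  The names (cyr_entry, cyr_translate and
the *_sample tables) are made up for illustration and are not in the
current source; only four keysyms are shown, with the values I believe
keysymdef.h assigns to them.

```c
#include <assert.h>
#include <stddef.h>

/* Keysym values as assigned in X11's keysymdef.h (small sample only). */
#define KS_Cyrillic_a        0x6c1
#define KS_Cyrillic_hardsign 0x6df
#define KS_Cyrillic_A        0x6e1
#define KS_Cyrillic_HARDSIGN 0x6ff

/* Hypothetical entry type: one keysym and the single byte it yields. */
struct cyr_entry { unsigned keysym; unsigned char byte; };

#define CYR_SAMPLE_LEN 4

static const struct cyr_entry koi8_r_sample[CYR_SAMPLE_LEN] = {
  { KS_Cyrillic_a, 0xC1 }, { KS_Cyrillic_hardsign, 0xDF },
  { KS_Cyrillic_A, 0xE1 }, { KS_Cyrillic_HARDSIGN, 0xFF },
};

static const struct cyr_entry iso8859_5_sample[CYR_SAMPLE_LEN] = {
  { KS_Cyrillic_a, 0xD0 }, { KS_Cyrillic_hardsign, 0xEA },
  { KS_Cyrillic_A, 0xB0 }, { KS_Cyrillic_HARDSIGN, 0xCA },
};

static const struct cyr_entry cp1251_sample[CYR_SAMPLE_LEN] = {
  { KS_Cyrillic_a, 0xE0 }, { KS_Cyrillic_hardsign, 0xFA },
  { KS_Cyrillic_A, 0xC0 }, { KS_Cyrillic_HARDSIGN, 0xDA },
};

/* Return the byte TABLE assigns to KEYSYM, or -1 if TABLE has no entry. */
static int
cyr_translate (const struct cyr_entry *table, size_t len, unsigned keysym)
{
  size_t i;

  assert (table != NULL);
  for (i = 0; i < len; i++)
    if (table[i].keysym == keysym)
      return table[i].byte;
  return -1;
}
```

With these tables, Cyrillic_HARDSIGN comes out as 0xFF for koi8-r,
0xCA for iso8859-5 and 0xDA for cp1251; the real arrays would of
course cover the whole Cyrillic keysym range.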

SJT> As far as I know, however,
SJT> it's not Mule XEmacs that wants the fonts to match internal
SJT> representation.  

It does not want that.  It just doesn't go to the trouble of doing
something like Mule's (set-charset-ccl-program).

SJT> It's that XEmacs developers have never done
SJT> anything about using the facilities provided by Mule to handle
SJT> fonts that have different encoding from internal.  

Huh?  What about ccl-programs?  

Alexey> I think they should be predefined in stock XEmacs w/o any Lisp
Alexey> definitions (in style of cyrillic[] array that currently
Alexey> exists).

SJT> OK, I agree about the need for predefinition, especially in
SJT> no-Mule.  However, in practice, we don't know which fonts are
SJT> important.  

The two most important encodings to support for non-Mule are koi8-r
and iso8859-5 (for compatibility with the default).  Then there is the
less frequently used cp1251 (for compatibility with Windows).  Finally
there is cp866, which needs to be implemented mostly for completeness.

SJT> So there should be a Lisp-level way to create and
SJT> install such arrays.  If you hack only for Cyrillic, you will
SJT> create a Russianized XEmacs.  

Why ever?  Cyrillic != Russian.  As far as the encodings allow, I'll
certainly try to support every Cyrillic character.  E.g. koi8-r simply
does not include NJE, KJE, DZHE, TSHE and friends, while cp1251 and
8859-5 do include them.
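
A small sketch in C of how that difference in coverage could show up
in the translation code.  The function names and the CYR_UNMAPPED
value are hypothetical; the keysym values are the ones I believe
keysymdef.h uses, and only NJE and DZHE are shown for the extra
letters.

```c
#include <assert.h>

#define CYR_UNMAPPED (-1)   /* hypothetical "no such character" result */

/* Keysym values as assigned in X11's keysymdef.h. */
#define KS_Cyrillic_io   0x6a3
#define KS_Cyrillic_nje  0x6aa
#define KS_Cyrillic_dzhe 0x6af
#define KS_Cyrillic_IO   0x6b3
#define KS_Cyrillic_NJE  0x6ba
#define KS_Cyrillic_DZHE 0x6bf

/* koi8-r: keysyms 0x6c0..0x6ff are laid out in KOI8 order, so the byte
   is simply the low byte of the keysym; io/IO work the same way.  The
   remaining Cyrillic keysyms have no koi8-r byte at all.  */
static int
koi8_r_byte (unsigned keysym)
{
  unsigned low = keysym & 0xff;

  assert ((keysym >> 8) == 0x06);   /* Cyrillic keysym block only */
  if (low >= 0xc0)
    return (int) low;
  if (keysym == KS_Cyrillic_io || keysym == KS_Cyrillic_IO)
    return (int) low;
  return CYR_UNMAPPED;
}

/* cp1251 and iso8859-5 do have these letters (sample of four). */
static int
cp1251_extra_byte (unsigned keysym)
{
  switch (keysym)
    {
    case KS_Cyrillic_NJE:  return 0x8C;
    case KS_Cyrillic_nje:  return 0x9C;
    case KS_Cyrillic_DZHE: return 0x8F;
    case KS_Cyrillic_dzhe: return 0x9F;
    default:               return CYR_UNMAPPED;
    }
}

static int
iso8859_5_extra_byte (unsigned keysym)
{
  switch (keysym)
    {
    case KS_Cyrillic_NJE:  return 0xAA;
    case KS_Cyrillic_nje:  return 0xFA;
    case KS_Cyrillic_DZHE: return 0xAF;
    case KS_Cyrillic_dzhe: return 0xFF;
    default:               return CYR_UNMAPPED;
    }
}
```

So a key producing Cyrillic_NJE would simply yield nothing under
koi8-r, while the same key would insert 0x8C under cp1251 and 0xAA
under iso8859-5.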

SJT> And will you put in (eg) the extra
SJT> characters the Ukrainians want?

Ukrainians have a variant of koi8-r called something like koi8-u.
I'm sure someone from Ukraine will implement this encoding once
there is general support for that, and Ukrainian users will then do

  (x-set-keysym-translation-subset 'cyrillic 'koi8-u)

(See the interface definition below.)  Nothing prevents us from doing
that.

SJT> I disagree strongly about implementing such arrays for Mule,
SJT> however.  

Mule will keep using only the array it uses currently (it will
probably be renamed to cyrillic_iso8859_5[], to reduce confusion).

SJT> As for predefinition in C vs Lisp: We try to think in terms of
SJT> implementing _everything_ in the text editor in Lisp, for
SJT> robustness and flexibility.  We move down to the C level when we
SJT> want the efficiency.  We don't _need_ it here; users will never
SJT> see the delay in installing such a table from Lisp rather than C,
SJT> since it only happens once per session.  We do _need_ the Lisp
SJT> level definition, since it's probable that Greeks, at least, will
SJT> need this too.  And we don't know who else!  We get robust
SJT> predefinition from Lisp by "dumping" the Lisp into the
SJT> executable, where we always know where to find it.

I just want to switch whole arrays, not single keysyms.  I think the
interface will be something like this:

  ;; define new Cyrillic encoding
  (x-define-keysym-translation-subset 'cyrillic 'koi8-r)
  (x-set-keysym-translation 'koi8-r 'Cyrillic_HARDSIGN ?\xFF)
  ;; ... repeat for every Cyrillic character

  ;; this is done by user interactively or in ~/.emacs
  (x-set-keysym-translation-subset 'cyrillic 'koi8-r)

The most popular encodings will be predefined in src/event-Xt.c, but
they will of course still be modifiable via (x-set-keysym-translation)
(though this will probably be used only on newly defined keysym
translation subsets).
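
On the C side, the three operations above could be backed by
something like the following.  This is only a sketch under my own
assumptions: every name in it (subset, define_subset, set_translation,
select_subset, translate_keysym) is hypothetical, and the real code
would have to hook into x_keysym_to_character() and use Lisp symbols
instead of C strings.

```c
#include <assert.h>
#include <string.h>

#define MAX_SUBSETS 8
#define CYR_FIRST   0x6a1                    /* first Cyrillic keysym */
#define CYR_LAST    0x6ff                    /* last Cyrillic keysym  */
#define CYR_COUNT   (CYR_LAST - CYR_FIRST + 1)

/* One whole translation array; -1 marks a keysym with no byte. */
struct subset { const char *name; int byte[CYR_COUNT]; };

static struct subset subsets[MAX_SUBSETS];
static int n_subsets;
static struct subset *active;                /* NULL until one is selected */

static struct subset *
find_subset (const char *name)
{
  int i;

  assert (name != NULL);
  for (i = 0; i < n_subsets; i++)
    if (strcmp (subsets[i].name, name) == 0)
      return &subsets[i];
  return NULL;
}

/* Like (x-define-keysym-translation-subset 'cyrillic NAME).
   Returns the (possibly already existing) subset, or NULL if full. */
static struct subset *
define_subset (const char *name)
{
  struct subset *s = find_subset (name);
  int i;

  if (s != NULL)
    return s;
  if (n_subsets == MAX_SUBSETS)
    return NULL;
  s = &subsets[n_subsets++];
  s->name = name;
  for (i = 0; i < CYR_COUNT; i++)
    s->byte[i] = -1;
  return s;
}

/* Like (x-set-keysym-translation NAME KEYSYM BYTE); 0 on success. */
static int
set_translation (const char *name, unsigned keysym, unsigned char byte)
{
  struct subset *s = find_subset (name);

  if (s == NULL || keysym < CYR_FIRST || keysym > CYR_LAST)
    return -1;
  s->byte[keysym - CYR_FIRST] = byte;
  return 0;
}

/* Like (x-set-keysym-translation-subset 'cyrillic NAME); 0 on success. */
static int
select_subset (const char *name)
{
  struct subset *s = find_subset (name);

  if (s == NULL)
    return -1;
  active = s;                                /* switch the whole array */
  return 0;
}

/* What the keysym lookup would consult: byte, or -1 if none. */
static int
translate_keysym (unsigned keysym)
{
  if (active == NULL || keysym < CYR_FIRST || keysym > CYR_LAST)
    return -1;
  return active->byte[keysym - CYR_FIRST];
}
```

Switching encodings is then a single pointer assignment, and the
predefined encodings would just be subsets that are filled in at
startup rather than from Lisp.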

If there are no objections or additional comments, I'll start coding
that tomorrow.

--alexm
