Re: GNU Emacs' rect.el

Tuesday, 3 August 1999

        ...
>>>> "Hrvoje" == Hrvoje Niksic <hniksic(a)srce.hr> writes:
    Hrvoje> Yoshiki Hayashi <t90553(a)m.ecc.u-tokyo.ac.jp> writes:
...
> > The question is: which Latin characters?  What are "normal
> > settings" to you?  Does this mean that when I change my font
> > (as I do, because the default one is awful), the
> > twice-as-wide relationship gets lost? 
I don't see that this was answered.  Yes, in any sensible
interpretation.  The Japanese interpretation is not sensible IMHO, but 
it is historically important, convenient, etc., for Japanese.

...
> Probably Latin-1. I guess Latin-2 is the same. It has nothing
> to do with font setting.  Japanese characters display twice as
> wide and it also takes two columns. 
This is simply a hysterical artifact.

!"#$% column/character-index confusion.  It took me years to figure
this out.  TTYs of course like nice round integer numbers.  It happens
by accident that a 2:1 aspect ratio looks pretty good for Roman and
other alphabetic fonts, while 1:1 is good for ideographic fonts, such
as Japanese kanji.  The Japanese kana (syllabic characters) are
artistically deformed kanji, so they also are 1:1.  Dunno why Korean
Hangul look good in 1:1, but they do.

Since the Japanese need ASCII (128 characters), hiragana (~70
characters) and katakana (~70 characters), it's not possible to fit
all of the alphameric/syllabic characters into 1 byte.  So the
Japanese Industrial Standards define a 1-byte ASCII + katakana
character set (JIS X 0201), and a 2-byte character set (JIS X 0208)
holding several full alphabets, both syllabaries, and the most common
ideographs, plus a couple of additional ideograph-packed 2-byte sets.

Since the machines restricted to JIS X 0201 are low-powered (and now
obsolete) anyway, it made sense to give the JIS X 0201 font a 2:1
aspect ratio throughout (squishing the kana for this purpose).
(Although it isn't entirely uniform, as JIS X 0201 has several
combining marks.  Since they are represented as separate code points
in the standard, the conflation of font width and memory size is
maintained, though.)

This conflation of font aspect ratio with the size of the character
representation in plain text (strings) has been enshrined in a lot of
Japanese code.

It is purely a historic accident, and I don't see how we can cater to
it (for "backward compatibility") and have a sensible general API at
the same time.

    Hrvoje> Impossible.  Whatever Latin fonts I choose, I see the same
    Hrvoje> Japanese characters.  So if I choose a wider Latin font
    Hrvoje> (which I do, because the defaults are too small), there is
    Hrvoje> no chance in hell that the Japanese glyphs *still* display
    Hrvoje> exactly twice as wide as the Latin ones.

...
> I mean, if the line is only US-ASCII and what-cursor-position
> returns column 40, the line has 40 characters.  If there are
> Japanese characters and what-cursor-position returns column 40,
> there may be only 20 characters. 
Or maybe it returns 40, if you are using the FSF's version of Emacs
and the single-byte/multi-byte flags are set wrong (I think).

    Hrvoje> IMO `what-cursor-position' should either count the
    Hrvoje> characters or maybe the pixels.  Anything else is bound to
    Hrvoje> be broken in almost every situation.  I know the FSF did
    Hrvoje> differently, and I think they made a mistake.

It is probably never broken in Japanese on TTYs.  I'm not sure about
Korean or Chinese; I'm not sure how they handle Roman characters.

Kyle is right; anything that deals with cursor position should be
split into a function that counts characters from the last newline
(when is this useful?) and a function that counts pixels from the
relevant horizontal origin, with consistent names across functions to
indicate pixels or characters.
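
A rough sketch of that split, in C.  Everything in it is hypothetical:
the names `chars_from_line_start` and `pixels_from_line_start`, the
buffer-as-array-of-code-points representation, and the width callback
(`example_width` uses made-up pixel sizes) standing in for whatever the
display code really knows about fonts.

```c
#include <assert.h>
#include <stddef.h>

/* Glyph width in pixels for one character.  Hypothetical callback;
   a real implementation would ask the font. */
typedef int (*glyph_width_fn)(unsigned int c);

/* Made-up widths: 9 pixels for ASCII, 16 for everything else. */
int example_width(unsigned int c)
{
    return c < 0x80 ? 9 : 16;
}

/* Index of the first character of the line containing position pos. */
size_t line_start(const unsigned int *buf, size_t pos)
{
    while (pos > 0 && buf[pos - 1] != '\n')
        pos--;
    return pos;
}

/* Number of characters between the last newline and pos. */
size_t chars_from_line_start(const unsigned int *buf, size_t pos)
{
    assert(buf != NULL);
    return pos - line_start(buf, pos);
}

/* Number of pixels between the start of the line and pos,
   summing the width of each glyph on the way. */
long pixels_from_line_start(const unsigned int *buf, size_t pos,
                            glyph_width_fn width)
{
    long px = 0;
    assert(buf != NULL && width != NULL);
    for (size_t i = line_start(buf, pos); i < pos; i++)
        px += width(buf[i]);
    return px;
}
```

Neither function mentions columns at all; a column count would have to
be derived from one of them plus an explicit statement about the device.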

    Hrvoje> #ifdef MULE
    Hrvoje>   col += XCHARSET_COLUMNS (CHAR_CHARSET (c));
    Hrvoje> #else
    Hrvoje>   col ++;
    Hrvoje> #endif /* MULE */

    Hrvoje> IMO the code above is wrong, and the function should
    Hrvoje> simply return 11, not 22.
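
For concreteness, the two branches of the quoted snippet can be sketched
as standalone C.  This is not XEmacs code: `toy_columns` is a
hypothetical stand-in for `XCHARSET_COLUMNS (CHAR_CHARSET (c))`, the
text is assumed to be an array of code points, and the Unicode ranges
are only a convenient way to mark kana and kanji as double-width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for XCHARSET_COLUMNS (CHAR_CHARSET (c)):
   2 for kana and common kanji, 1 for everything else. */
int toy_columns(unsigned int c)
{
    if ((c >= 0x3040 && c <= 0x30FF) ||   /* hiragana and katakana */
        (c >= 0x4E00 && c <= 0x9FFF))     /* CJK unified ideographs */
        return 2;
    return 1;
}

/* The MULE branch: add the per-charset column width of each character. */
int count_columns(const unsigned int *text, size_t len)
{
    int col = 0;
    assert(text != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        col += toy_columns(text[i]);
    return col;
}

/* The non-MULE branch: add one per character. */
int count_chars(const unsigned int *text, size_t len)
{
    int col = 0;
    assert(text != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        col++;
    return col;
}
```

On a line of eleven kana, `count_columns` gives 22 and `count_chars`
gives 11; on pure ASCII the two agree.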

Yes.  In XEmacs, if you must implement this so that columns line up
nicely on a TTY in Japanese, it should be done in a package.  It is a
hack.  One way to make it not a hack would be to define a device
"Japanese-TTY" which does this stuff for you.

It might be reasonable to maintain this capability for languages like
Japanese when operating on a TTY, but the function should have a third
name indicating that it is a historical compatibility hack.  And you
should be able to specify the font sizes to the functions (in fact
each MULE registry should have an "average-char-width" field which is
a real, rather than the current column-width which is an int).

What will probably happen is that we'll do this, come up with two
separate new names for character and pixel positioning, and leave the
old column-counting names for backward compatibility.
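
The "average-char-width" idea above might look roughly like this in C.
The struct, its field names, and the pixel figures are all invented for
illustration; this is not the actual MULE charset object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-registry record, carrying both the current integer
   column width and the proposed real-valued average width. */
struct toy_registry {
    const char *name;
    int columns;                /* integer TTY columns, as now */
    double average_char_width;  /* proposed: average glyph width in pixels */
};

/* Width of one glyph of r measured in glyphs of base. */
double relative_width(const struct toy_registry *r,
                      const struct toy_registry *base)
{
    assert(r != NULL && base != NULL);
    assert(base->average_char_width > 0.0);
    return r->average_char_width / base->average_char_width;
}
```

With a 16-pixel kanji font next to an 8-pixel Latin font the ratio is
2.0, matching the integer `columns` values; swap in a 10-pixel Latin
font and it drops to 1.6, which the integer field cannot express.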

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
__________________________________________________________________________
__________________________________________________________________________
What are those two straight lines for?  "Free software rules."
