OK, in summation:
1. C-q is a user-level function and should do whatever makes the most sense.
2. int-char is a low-level primitive and should never depend on high-level
settings like language environment.
3. Everything you can do with int-char can and should be done with make-char
-- representation-independent, much less likelihood of bugs, etc. Therefore
int-char should be removed.
4. Note that CLtL2 also removes int-char.
5. Your statement

    In one-byte buffers (either Olivier's 1/2/4 extension or `xemacs -font
    *-iso8859-2') it implicitly will have dependence whatever you say.

is confusing internal and external representations.
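
To make the distinction in point 5 concrete, here is a minimal sketch. It is written in Python rather than Emacs Lisp, only because Python's standard codecs are easy to check; it does not use or model any XEmacs function. The names RAW_BYTE and decode_byte are made up for the illustration. It shows that one external byte denotes different characters depending on the coding system assumed when reading it, while the decoded character keeps a single identity afterwards.

```python
# Illustration only: an external byte versus the character it denotes.
# RAW_BYTE and decode_byte are hypothetical names, not part of any Emacs API.
RAW_BYTE = bytes([0xB1])  # the external representation: one byte in a file


def decode_byte(raw: bytes, coding: str) -> str:
    """Decode external bytes into a character under the given coding system."""
    return raw.decode(coding)


# The same byte read under two different assumptions about the file.
latin1_char = decode_byte(RAW_BYTE, "iso8859-1")  # U+00B1 PLUS-MINUS SIGN
latin2_char = decode_byte(RAW_BYTE, "iso8859-2")  # U+0105 LATIN SMALL LETTER A WITH OGONEK

# After decoding, the character no longer depends on the byte it came from;
# writing it back out in the same coding system recovers the original byte.
latin2_roundtrip = latin2_char.encode("iso8859-2")
```

The dependence on ISO-8859-1 versus ISO-8859-2 lives entirely in the decode step, which is the external side. Nothing after that step needs to know which coding system the byte came from.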
ben
"Stephen J. Turnbull" wrote:
> Can somebody give a bunch of examples where using integers as
> characters is useful? For that matter, where they are actually used?
> Ben said "backward compatibility," but I haven't seen this used, and I
> don't really know how to grep for it. I have grepped for int-char,
> int-to-char, char-int, and char-to-int and they're pretty rare in the
> core and package code (2/3 of it) that I have.
>
> The only one that I ever use is the C-q hack for inserting characters
> by code value at the keyboard, and that could arguably be (and in
> Japanese invariably is) delegated to an input method which would know
> about language environment (and return a true character).
>
> For iterating over a character set in "natural" order, only ASCII
> satisfies the requirement of having one, and even that's shaky. AFAIK
> the Swedes and the Norwegians, or is it the Danes, disagree on
> ordering the _letters_ in the ISO-8859-1 character set. This really
> should be table-driven, and will have to be for everything except
> ASCII and ISO-8859-1 if we go to a Unicode internal representation.
>
> We already have primitives for efficient case conversion and the like.
>
> The only example I can think of offhand where you would really really
> want the facility is to iterate over a code space where you don't know
> which points are legal characters. Eg, to print out tables of fonts.
> Pretty specialized. And this can be done through make-char, anyway.
>
> According to CLtL1, the main portable use for char-int is for hashing.
> But that doesn't square with the kind of usage we've been talking
> about (in loops and the like).
>
> What else am I missing?
>
> Ben's desiderata have some problems.
>
> >>>>> "Ben" == Ben Wing <ben(a)666.com> writes:
>
> Ben> Either int-char should be the mirror opposite of char-int
> Ben> (i.e. accept all legal char integers), or it should be
> Ben> removed entirely.
>
> OK. I agree with this.
>
> Ben> int-char should *never* have any dependence on the language
> Ben> environment.
>
> In one-byte buffers (either Olivier's 1/2/4 extension or `xemacs -font
> *-iso8859-2') it implicitly will have dependence whatever you say. Even
> without Mule, people can always use external encoders to change
> raw ISO-8859-2 to ISO-2022 (not that anybody sane ever would, OK,
> Hrvoje?). Then the two files will be interpreted differently in a
> Latin-1 locale Mule; the ISO-8859-2 file will be recognized as
> ISO-8859-1, and the ISO-2022 file will be internally interpreted as
> ISO-8859-2.
>
> The point is that people normally assume that int-char should accept
> their "natural" integer to character map. For Americans, that's
> ASCII, for Germans, that's ISO-8859-1, for Croatians, that's
> ISO-8859-2. And it works "correctly" in a no-mule XEmacs with `-font
> *-iso8859-2'! Japanese usually use ku-ten or JIS, and there's a
> "natural" map from byte-sized integer pairs to shorts, but it's full
> of holes. So language environments don't agree on what a legal char
> integer is, and where they do (eg, ISO-8859-1 and ISO-8859-2), they
> don't agree on the map. To satisfy your dictum (with which I agree,
> but I take to mean we should get rid of these functions) we can take
> the intersection where they agree
>
> ==> legal char integers == ASCII
>
> which is what I prefer, or pick something arbitrary and efficient
>
> ==> char-int returns the internal representation
>
> which I really hate, or something else. Suggestions?
>
> Ben> I don't think C-q should either. If Hrvoje wants to insert
> Ben> Latin-2 characters by number, then make C-u C-q work so that
> Ben> it also prompts for a character set, with a default chosen
> Ben> from the language environment.
>
> And restrict this to ASCII? Or assume Latin-1 in GR if there is no
> prefix argument?
>
> This is a useful feature. C-q currently inserts Latin-2 characters
> for Hrvoje in no-mule XEmacs (stretching the point only a little); I
> think it should continue to do so in Mule. This really is an input
> method issue, not a keyboard issue. In XEmacs, inserting an integer
> into a buffer has no meaning. Users insert characters. So this is a
> completely different issue from the programming API, and should not be
> considered analogous.
>
> Maybe we could have C-q insert according to the Unicode standard, and
> treat C-u C-q as part of the input method. But I think most users
> would prefer to have C-q insert according to their locale-standard
> tables, and select Unicode explicitly using the C-u C-q idiom. In
> fact (again this points to the input method idea), Japanese users
> would probably like to have the alternatives of using kuten (pairs
> from 1--94 x 1--94) or JIS (pairs from 0x21--0x7E x 0x21--0x7E) as
> options since both indexing systems are common in tables.
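
The relation between the two indexing systems mentioned above is a fixed offset: each JIS byte is the corresponding ku or ten value plus 0x20. A minimal sketch follows, in Python rather than Emacs Lisp; the function names kuten_to_jis and jis_to_kuten are hypothetical and not part of any input method or XEmacs API.

```python
from typing import Tuple

# Illustration only: converting between ku-ten (row, cell) indices in 1..94
# and JIS byte pairs in 0x21..0x7E. The function names are hypothetical.
KUTEN_OFFSET = 0x20


def kuten_to_jis(ku: int, ten: int) -> Tuple[int, int]:
    """Map a ku-ten pair (each in 1..94) to a JIS byte pair."""
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError("ku and ten must each be in 1..94")
    return ku + KUTEN_OFFSET, ten + KUTEN_OFFSET


def jis_to_kuten(byte1: int, byte2: int) -> Tuple[int, int]:
    """Map a JIS byte pair (each in 0x21..0x7E) back to ku-ten."""
    if not (0x21 <= byte1 <= 0x7E and 0x21 <= byte2 <= 0x7E):
        raise ValueError("JIS bytes must each be in 0x21..0x7E")
    return byte1 - KUTEN_OFFSET, byte2 - KUTEN_OFFSET


# Ku-ten 16-01 corresponds to the JIS byte pair 0x30 0x21.
example_jis = kuten_to_jis(16, 1)
```

Since the mapping is this simple, accepting either form at a prompt is mostly a matter of knowing which one the user typed. The range checks only reject values outside the grid; they say nothing about the holes inside it, which still need a table.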
>
> --
> University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
> Institute of Policy and Planning Sciences Tel/fax: +81 (298) 53-5091
> __________________________________________________________________________
> __________________________________________________________________________
> What are those two straight lines for? "Free software rules."
--
ben
--
In order to save my hands, I am cutting back on my responses, especially to
XEmacs-related mail. You _will_ get a response, but please be patient. If you
need an immediate response and it's not apparent in your message, please say
so. Thanks for your understanding.