Behavior of incf with characters?

Stephen J. Turnbull stephen at xemacs.org
Fri Mar 19 06:06:59 EDT 2010


Alan Mackenzie writes:

 > In the past, I had to loop over the ASCII set,

And I'm sure you came up with the obvious solution:

    (loop for i from 0 to 127 do
      (let ((ch (int-to-char i)))
        (when (usable-p ch)
          (frob ch))))

AFAIK that should work in GNU Emacs as well, and I really don't see
what's so awful about it.  Looping over a character set really only
makes sense if you *know* you're working with ASCII or (maybe) ISO
8859-1.  And the above generalizes (sort of) to Mule charsets:

    (loop for i from 33 to 127 do
      (let ((ch (make-char charset i)))
        (when (usable-p ch)
          (frob ch))))

The "sort of" is due to multibyte charsets (where `make-char' takes a
second position octet, so a single loop isn't enough), among other
issues.

 > In the end, I wrote the unlovely macro `c-int-to-char' to code
 > round XEmacs's inability to operate on a character _set_ (in
 > particular, looping through it).

AFAIK, GNU Emacs can't do that in any generality either, only in the
very exceptional cases of ISO 8859-1 and its ASCII subset, or the full
Unicode character set.  For example, how would you loop over ISO
8859-15 (aka Latin-9) in post-unicode-merge GNU Emacs?

    (loop for ch from ?\x00 to ?\x20AC do
      (when (usable-p ch)
        (frob ch)))

doesn't look terribly attractive to me.

 > IWBN if `incf', `1+', etc. did the Right Thing.  IMAO, that would mean
 > NOT skipping over any gaps in character sets.

But what's a character set?  There are at least three notions of
character set relevant in some application or other: Mule "charsets",
the Unicode character set, and the key sets of char-tables.  Each
presents a different order for the characters, and a different
notion of "gap" to deal with explicitly in your code.

 > In GNU Emacs, of course, there was no problem as characters are
 > numbers there.

Sure, but for how long?  Last I heard even Ken'ichi was looking
forward to a separate character type.

It's possible that there are good answers to all the questions above.
I spent several hours on it a decade ago, and figured out that making
a personal policy of looping over code points and converting them to
characters as appropriate would make my head stop hurting.  I made the
decision, and my head stopped hurting right away. :-)
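
Spelled out, that policy might look something like the sketch below.
The helper name `my-map-code-points' is made up for illustration (it
is not part of any Emacs), and the converter is passed in as an
argument on the assumption that you want the same code to run where
`int-to-char' exists and where characters are plain integers.

```elisp
;; Hypothetical helper, not a built-in: loop over integer code points
;; and convert each one to a character before doing anything with it.
(defun my-map-code-points (from to convert usable-p frob)
  "Call FROB on each usable character for code points FROM through TO.
CONVERT maps an integer to a character, or returns nil if there is none.
USABLE-P decides whether the resulting character is wanted."
  (let ((i from))
    (while (<= i to)
      (let ((ch (funcall convert i)))
        ;; Gaps show up as nil from CONVERT or as a failed USABLE-P.
        (when (and ch (funcall usable-p ch))
          (funcall frob ch)))
      (setq i (1+ i)))))

;; Usage: collect every ASCII character.
(let ((acc '()))
  (my-map-code-points 0 127
                      ;; Use `int-to-char' where it is defined,
                      ;; otherwise treat the integer as the character.
                      (if (fboundp 'int-to-char) 'int-to-char 'identity)
                      (lambda (ch) t)
                      (lambda (ch) (setq acc (cons ch acc))))
  (length acc))   ; => 128
```

The loop variable is always an integer, so `1+' never has to decide
what "the next character" means; all knowledge of gaps lives in the
converter and the predicate.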


