>>>> "Ben" == Ben Wing <ben(a)666.com> writes:
Ben> that way i'll have a reasonable chance of answering your ?'s
Ben> before my hands fall off.
Sorry, you've been right all along, and I'm wrong (surprise,
surprise). I'm still glad I asked (I wouldn't have figured it out on
my own for a long time), although I'm sorry about wasting your time
with my misunderstanding.
My excuse is that I've been living in buffer.h and mule-*.c for the
last little while, which has given me a warped view of what "working
with internally-formatted data" is about.
I still have some comments and questions; hopefully they're more
sensible now.
(0) I suggest changing the description of the interface from "for
working with internally-formatted data" to "for working with
internally-formatted data in external contexts" or something like
that, and emphasizing feature (e) "it provides easy operations to
convert to/from externally-formatted data". This is redundant if
you already know what the interface is for, but more specific for
the completely-new-to-internals person. And it probably would
have kept me from going off half-cocked. :-)
(1) I still think it is harmless, at least for literals, and possibly
useful to allow arbitrary bytes in ei{cat,cpy}_c().
(2) Is the usage (eiref (filename, 0) == '.') from
mswindows_get_files() really correct? I've convinced myself that
_this_ case works for all internal representations so far proposed,
because they are all extensions of ASCII, and the comparison works
after the char on the rhs is promoted to Emchar. But this really
does need to be restricted to ASCII. C0 controls are OK, but
Latin-1, e.g., would break with a UTF-8 default internal
representation.
Possibly an extension of the current APIs to include, say,
eirefcmp_* (eistring, eiindex, character)
is called for? But I can't think of an example offhand where the
general case would be useful.
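To make the ASCII restriction in (2) concrete, here is a minimal sketch at the byte level. It assumes the internal text is stored as UTF-8; byte_at_equals() and the two sample arrays are hypothetical names made up for illustration and are not part of the Eistring API.

```c
#include <stddef.h>

/* Hypothetical helper (NOT part of the Eistring API): returns nonzero
   when the byte at index i of some internally-formatted text equals c. */
static int byte_at_equals(const unsigned char *text, size_t i, unsigned char c)
{
    return text[i] == c;
}

/* ".emacs" is pure ASCII, so its bytes are identical in any internal
   representation that extends ASCII. */
static const unsigned char dot_name[] = { '.', 'e', 'm', 'a', 'c', 's', 0 };

/* U+00E9 (e with acute accent) is the two bytes 0xC3 0xA9 in UTF-8,
   whereas Latin-1 stores it as the single byte 0xE9. */
static const unsigned char eacute_utf8[] = { 0xC3, 0xA9, 0 };
```

With these definitions, byte_at_equals (dot_name, 0, '.') is true, but byte_at_equals (eacute_utf8, 0, 0xE9) is false, because the first byte of the UTF-8 form is 0xC3. A comparison written with a Latin-1 literal in mind would silently stop matching.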
(3) We may want to be a little bit careful with the notion of the
default internal representation. I can see that a default
internal representation of UCS-2 (UTF-16, I presume is what you
really mean?) would be attractive. So what happens if you have
data that is not representable in the default internal
representation? Do we just tell those users to get lost?
It would be kind of weird if the default internal representation
that Eistrings dealt with was UCS-2 but UTF-8 representation was
available in buffers, which you don't rule out.
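One case of "not representable" that can be sketched without reference to any XEmacs code is a code point outside the Basic Multilingual Plane: it does not fit in a single 16-bit UCS-2 unit, while UTF-16 encodes it as a surrogate pair. The function names below are hypothetical, and this covers only that one case, not characters that have no Unicode code point at all.

```c
#include <stdint.h>

/* Returns 1 if cp fits in one 16-bit UCS-2 unit, 0 otherwise.
   The surrogate range is excluded because it does not encode characters. */
static int fits_in_ucs2(uint32_t cp)
{
    return cp <= 0xFFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

/* Encodes cp as UTF-16 into out[0..1].  Returns the number of 16-bit
   units written (1 or 2), or 0 if cp is not a valid code point. */
static int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* lone surrogates are invalid */
    if (cp <= 0xFFFF) {
        out[0] = (uint16_t) cp;         /* BMP: one unit, same as UCS-2 */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                       /* beyond the Unicode range */
    cp -= 0x10000;
    out[0] = (uint16_t) (0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t) (0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+3042 (HIRAGANA LETTER A) fits in UCS-2, while U+20000 does not and needs the two units 0xD840 0xDC00 in UTF-16.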
--
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences Tel/fax: +81 (298) 53-5091
_________________ _________________ _________________ _________________
What are those straight lines for? "XEmacs rules."