I couldn't read it easily, so I reformatted it.  I'll comment in a
separate mail, as soon as I find the time.
  OG.
On Mon, Apr 17, 2000 at 11:23:02PM -0700, Ben Wing wrote:
i've created a new api for simplifying mule-correct string operations.
it'll take a while to fully implement; perhaps i could get some help,
e.g. hrvoje?
i will be implementing piece by piece, as necessary.
here's what i have so far, and i'd like comments:
   (E) For working with Eistrings:
   -------------------------------
   NOTE: An Eistring is a structure that makes it easy to work with
   internally-formatted strings of data.  It provides operations similar
   in feel to the standard strcpy(), strcat(), strlen(), etc., but
   (a) it is Mule-correct
   (b) it does dynamic allocation so you never have to worry about size
       restrictions (and all allocation is stack-local using alloca(), so
       there is no need to explicitly clean up)
   (c) it knows its own length, so it does not suffer from standard null
       byte brain-damage
   (d) it provides a much more powerful set of operations and knows about
       all the standard places where string data might reside: Lisp_Objects,
       other Eistrings, char * data with or without an explicit length, etc.
   (e) it provides easy operations to convert to/from externally-formatted
       data, and is much easier to use than the standard TO_INTERNAL_FORMAT
       and TO_EXTERNAL_FORMAT macros.
   The idea is to make it as easy to write Mule-correct string manipulation
   code as it is to write normal string manipulation code.  We also make
   the API sufficiently general that it can handle multiple internal data
   formats (e.g. some fixed-width optimizing formats and a default variable
   width format) and allows for *ANY* data format we might choose in the
   future for the default format, including UCS2. (In other words, we can't
   assume that the internal format is ASCII-compatible and we can't assume
   it doesn't have embedded null bytes.) All of this is hidden from the
   user.
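   To illustrate point (c), here is a minimal sketch of why carrying an
   explicit length matters once the data may contain null bytes.  It is
   not the Eistring implementation: counted_str, counted_from_len and
   counted_equal are made-up names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-aware string: a pointer plus an
   explicit byte length, so embedded '\0' bytes are ordinary data. */
typedef struct
{
  const char *data;
  size_t bytelen;
} counted_str;

/* Wrap a buffer whose length the caller already knows. */
static counted_str
counted_from_len (const char *data, size_t len)
{
  counted_str s;
  assert (data != NULL);
  s.data = data;
  s.bytelen = len;
  return s;
}

/* Equality over the full length, including bytes after a '\0'. */
static int
counted_equal (counted_str a, counted_str b)
{
  return a.bytelen == b.bytelen
    && memcmp (a.data, b.data, a.bytelen) == 0;
}
```

   With the five-byte buffers "ab\0cd" and "ab\0xy", strcmp() sees two
   identical two-byte strings, while counted_equal() reports them as
   different.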
   #### It is really too bad that we don't have a real object-oriented
   language, or at least a language with polymorphism!
   Eistring (name):
        Declare a new Eistring.  This is a standard local variable declaration
        and can go anywhere in the variable declaration section, but note that
        you *MUST* supply the parens.
   ----- Initialization -----
   eicpy_* (eistr, ...):
        Initialize the Eistring from somewhere:
   eicpy_ei (eistr, eistr2):
        ... from another Eistring
   eicpy_str (eistr, lisp_string):
        ... from a Lisp_Object string
   eicpy_str_off (eistr, lisp_string, charpos, charlen):
        ... from a section of a Lisp_Object string
   eicpy_str_off_byte (eistr, lisp_string, bytepos, bytelen):
        ... from a section of a Lisp_Object string, with offset and length
        specified in bytes rather than chars
   eicpy_buf (eistr, lisp_buf, charpos, charlen):
        ... from a Lisp_Object buffer
   eicpy_buf_byte (eistr, lisp_buf, bytepos, bytelen):
        ... from a Lisp_Object buffer, with offset and length specified in
        bytes rather than chars
   eicpy_raw (eistr, intdata, intlen, intfmt):
        ... from raw internal-format data in the specified format
   eicpy_c (eistr, c_string):
        ... from an ASCII null-terminated string.  Non-ASCII characters in
        the string are *ILLEGAL* (read: abort() when error-checking is defined).
   eicpy_c_len (eistr, c_string, len):
        ... from an ASCII string, with length specified.  Non-ASCII characters
        in the string are *ILLEGAL* (read: abort() when error-checking is defined).
   eicpy_ext (eistr, extdata, coding_system):
        ... from external null-terminated data, with coding system specified.
   eicpy_ext_len (eistr, extdata, extlen, coding_system):
        ... from external data, with length and coding system specified.
   eicpy_lstream (eistr, lstream):
        ... from an lstream; reads data till eof.  Data must be in default
        internal format; otherwise, interpose a decoding lstream.
   ----- Getting the data out of the Eistring -----
   eirawdata (eistr):
   eimake_string (eistr):
   eimake_string_sect (eistr, charpos, charlen):
   eimake_string_sect_byte (eistr, bytepos, bytelen):
   eicpyout_raw_alloca (eistr, intfmt, intlen_out):
   eicpyout_raw_malloc (eistr, intfmt, intlen_out):
   eicpyout_c_alloca (eistr):
   eicpyout_c_malloc (eistr):
   eicpyout_c_len_alloca (eistr, len_out):
   eicpyout_c_len_malloc (eistr, len_out):
   ----- Moving to the heap -----
   eito_malloc (eistr):
   eifree (eistr):
   eito_alloca (eistr):
   ----- Retrieving the length -----
   eilen (eistr):
   eilen_byte (eistr):
   ----- Working with positions -----
   eicharpos_to_bytepos (eistr, charpos):
   eibytepos_to_charpos (eistr, bytepos):
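   The two position conversions only matter for a variable-width format.
   As a rough sketch of what they have to do, the code below assumes
   UTF-8 purely for illustration; that is an assumption of the example,
   not the internal format discussed here, and the function names are
   hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the character with index charpos, assuming UTF-8.
   The result is clamped to bytelen if charpos is past the end. */
static size_t
charpos_to_bytepos (const unsigned char *data, size_t bytelen,
                    size_t charpos)
{
  size_t bytepos = 0;
  assert (data != NULL);
  while (bytepos < bytelen && charpos > 0)
    {
      bytepos++;                /* step over the lead byte */
      /* step over continuation bytes (bit pattern 10xxxxxx) */
      while (bytepos < bytelen && (data[bytepos] & 0xC0) == 0x80)
        bytepos++;
      charpos--;
    }
  return bytepos;
}

/* Number of characters that start before bytepos, assuming UTF-8. */
static size_t
bytepos_to_charpos (const unsigned char *data, size_t bytelen,
                    size_t bytepos)
{
  size_t i, chars = 0;
  assert (data != NULL && bytepos <= bytelen);
  for (i = 0; i < bytepos; i++)
    if ((data[i] & 0xC0) != 0x80)   /* count lead bytes only */
      chars++;
  return chars;
}
```

   Both conversions are linear scans here, which is one reason the API
   offers *_byte variants for callers that already have a byte position.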
   ----- Getting the character at a position -----
   eiref (eistr, charpos):
   eiref_byte (eistr, bytepos):
   ----- Concatenation -----
   eicat_* (eistr, ...):
        Concatenate onto the end of the Eistring. (All functions that take
        a string source accept only two kinds of source: another Eistring
        or a simple C string.  For anything else, first create another
        Eistring from the source.)
   eicat_ei (eistr, eistr2):
   eicat_c (eistr, c_string):
   ----- Replacement -----
   eisub_* (eistr, charoff, charlen, ...):
   eisub_*_byte (eistr, byteoff, bytelen, ...):
        Replace a section of the Eistring.
   eisub_ei (eistr, charoff, charlen, eistr2):
   eisub_ei_byte (eistr, byteoff, bytelen, eistr2):
   eisub_c (eistr, charoff, charlen, c_string):
   eisub_c_byte (eistr, byteoff, bytelen, c_string):
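   A possible shape for the byte-offset replacement, sketched on a plain
   heap-allocated buffer rather than on a real Eistring.  The heap_str
   type and both function names are hypothetical, and the sketch always
   uses malloc() instead of the alloca() strategy described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-backed string: data is always followed by a '\0'. */
typedef struct
{
  char *data;
  size_t bytelen;
} heap_str;

/* Build a heap_str from a null-terminated C string. */
static heap_str
heap_str_from_c (const char *c_string)
{
  heap_str s;
  assert (c_string != NULL);
  s.bytelen = strlen (c_string);
  s.data = malloc (s.bytelen + 1);
  assert (s.data != NULL);      /* sketch only; real code would recover */
  memcpy (s.data, c_string, s.bytelen + 1);
  return s;
}

/* Replace bytelen bytes starting at byteoff with the C string repl.
   Returns 0 on success, -1 if allocation fails (s is then unchanged). */
static int
heap_str_sub_c_byte (heap_str *s, size_t byteoff, size_t bytelen,
                     const char *repl)
{
  size_t repl_len, tail_len, new_len;
  char *new_data;

  assert (s != NULL && repl != NULL);
  assert (byteoff <= s->bytelen && bytelen <= s->bytelen - byteoff);

  repl_len = strlen (repl);
  tail_len = s->bytelen - byteoff - bytelen;
  new_len = byteoff + repl_len + tail_len;

  new_data = malloc (new_len + 1);
  if (new_data == NULL)
    return -1;

  memcpy (new_data, s->data, byteoff);                  /* head */
  memcpy (new_data + byteoff, repl, repl_len);          /* replacement */
  memcpy (new_data + byteoff + repl_len,
          s->data + byteoff + bytelen, tail_len);       /* tail */
  new_data[new_len] = '\0';

  free (s->data);
  s->data = new_data;
  s->bytelen = new_len;
  return 0;
}
```

   A character-offset variant would first convert charoff and charlen to
   byte positions and then do the same thing.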
   ----- Converting to an external format -----
   eito_external (eistr, coding_system):
   eiextdata (eistr):
   eiextlen (eistr):
   ----- Searching in the Eistring for a character -----
   eichr (eistr, chr):
   eichr_byte (eistr, chr):
   eichr_off (eistr, chr, charpos):
   eichr_off_byte (eistr, chr, bytepos):
   eirchr (eistr, chr):
   eirchr_byte (eistr, chr):
   eirchr_off (eistr, chr, charpos):
   eirchr_off_byte (eistr, chr, bytepos):
   ----- Searching in the Eistring for a string -----
   eistr_ei (eistr, eistr2):
   eistr_ei_byte (eistr, eistr2):
   eistr_ei_off (eistr, eistr2, charpos):
   eistr_ei_off_byte (eistr, eistr2, bytepos):
   eirstr_ei (eistr, eistr2):
   eirstr_ei_byte (eistr, eistr2):
   eirstr_ei_off (eistr, eistr2, charpos):
   eirstr_ei_off_byte (eistr, eistr2, bytepos):
   eistr_c (eistr, c_string):
   eistr_c_byte (eistr, c_string):
   eistr_c_off (eistr, c_string, charpos):
   eistr_c_off_byte (eistr, c_string, bytepos):
   eirstr_c (eistr, c_string):
   eirstr_c_byte (eistr, c_string):
   eirstr_c_off (eistr, c_string, charpos):
   eirstr_c_off_byte (eistr, c_string, bytepos):
   ----- Comparison -----
   eicmp_* (eistr, ...):
   eicmp_off_* (eistr, charoff, charlen, ...):
   eicmp_off_*_byte (eistr, byteoff, bytelen, ...):
   eicasecmp_* (eistr, ...):
   eicasecmp_off_* (eistr, charoff, charlen, ...):
   eicasecmp_off_*_byte (eistr, byteoff, bytelen, ...):
        Compare the Eistring with the other data.  Return value same as
        from strcmp.
   eicmp_ei (eistr, eistr2):
   eicmp_off_ei (eistr, charoff, charlen, eistr2):
   eicmp_off_ei_byte (eistr, byteoff, bytelen, eistr2):
   eicasecmp_ei (eistr, eistr2):
   eicasecmp_off_ei (eistr, charoff, charlen, eistr2):
   eicasecmp_off_ei_byte (eistr, byteoff, bytelen, eistr2):
   eicmp_c (eistr, c_string):
   eicmp_off_c (eistr, charoff, charlen, c_string):
   eicmp_off_c_byte (eistr, byteoff, bytelen, c_string):
   eicasecmp_c (eistr, c_string):
   eicasecmp_off_c (eistr, charoff, charlen, c_string):
   eicasecmp_off_c_byte (eistr, byteoff, bytelen, c_string):
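   For the C-string case, a case-insensitive comparison with a
   strcmp-style return value could look roughly like the sketch below.
   It handles ASCII only via tolower(), so it is not Mule-correct, and
   counted_casecmp_c is a made-up name for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compare len bytes at data against the null-terminated c_string,
   ignoring ASCII case.  Returns <0, 0 or >0, like strcmp(). */
static int
counted_casecmp_c (const char *data, size_t len, const char *c_string)
{
  size_t c_len, n, i;

  assert (data != NULL && c_string != NULL);
  c_len = strlen (c_string);
  n = len < c_len ? len : c_len;

  for (i = 0; i < n; i++)
    {
      int a = tolower ((unsigned char) data[i]);
      int b = tolower ((unsigned char) c_string[i]);
      if (a != b)
        return a < b ? -1 : 1;
    }

  /* Common prefix is equal: the shorter string sorts first. */
  if (len == c_len)
    return 0;
  return len < c_len ? -1 : 1;
}
```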
   ----- Case-changing the Eistring -----
   eilwr (eistr):
   eiupr (eistr):
And the implementation:
/* ------------------------------ */
/* (E) For working with Eistrings */
/* ------------------------------ */
typedef struct
{
  void *data;
  Bytecount bytelen;
  Charcount charlen;
  int mallocp;
  void *extdata;
  Extcount extlen;
} Eistring_;
Eistring_ the_eistring_zero_init;
#define Eistring(name) Eistring_ name = the_eistring_zero_init
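/* Allocate room for bytelen_ bytes plus a terminator: on the heap if
   mallocp is set, otherwise in the caller's stack frame. */
#define EI_ALLOC_(ei, charlen_, bytelen_)               \
do {                                                    \
  (ei).charlen = (charlen_);                            \
  (ei).bytelen = (bytelen_);                            \
  if ((ei).mallocp)                                     \
    (ei).data = xmalloc ((ei).bytelen + 1);             \
  else                                                  \
    (ei).data = alloca ((ei).bytelen + 1);              \
} while (0)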
/* Allocate, then copy bytelen_ bytes plus the source's terminator. */
#define EI_ALLOC_AND_COPY_(ei, data_, charlen_, bytelen_)       \
do {                                                            \
  EI_ALLOC_ (ei, charlen_, bytelen_);                           \
  memcpy ((ei).data, (data_), (ei).bytelen + 1);                \
} while (0)
/* Initialize ei as a copy of another Eistring. */
#define eicpy_ei(ei, ei2)                                               \
do {                                                                    \
  Eistring_ *ei__ = &(ei2);                                             \
  EI_ALLOC_AND_COPY_ (ei, ei__->data, ei__->charlen, ei__->bytelen);    \
} while (0)
/* Initialize ei from a Lisp_Object string. */
#define eicpy_str(ei, lisp_string)                              \
do {                                                            \
  Lisp_Object ei__ = (lisp_string);                             \
  EI_ALLOC_AND_COPY_ (ei, XSTRING_DATA (ei__),                  \
                      XSTRING_CHAR_LENGTH (ei__),               \
                      XSTRING_LENGTH (ei__));                   \
} while (0)
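
The macro pattern above can be tried outside the XEmacs tree with a cut-down stand-in.  The sketch below is only an illustration under stated assumptions: every Demo_/DEMO_/demo_ name is hypothetical, it always uses malloc() (alloca() is not standard C), and it writes the terminator explicitly instead of copying bytelen + 1 bytes from the source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down version of the struct above. */
typedef struct
{
  void *data;
  size_t bytelen;
  size_t charlen;
} Demo_eistring;

/* Static storage is zero-initialized, so copying from this object
   gives every new local a known all-zero state. */
static const Demo_eistring demo_zero_init;

/* Declares a local variable; usable only at block scope. */
#define DEMO_EISTRING(name) Demo_eistring name = demo_zero_init

/* Allocate bytelen_ + 1 bytes, copy the data, terminate it.
   The do { } while (0) wrapper makes the macro act as one statement. */
#define DEMO_ALLOC_AND_COPY(ei, data_, charlen_, bytelen_)      \
do {                                                            \
  (ei).charlen = (charlen_);                                    \
  (ei).bytelen = (bytelen_);                                    \
  (ei).data = malloc ((ei).bytelen + 1);                        \
  assert ((ei).data != NULL);   /* sketch only */               \
  memcpy ((ei).data, (data_), (ei).bytelen);                    \
  ((char *) (ei).data)[(ei).bytelen] = '\0';                    \
} while (0)

/* Copy one demo string into another; src__ evaluates ei2 once. */
#define demo_eicpy_ei(ei, ei2)                                  \
do {                                                            \
  const Demo_eistring *src__ = &(ei2);                          \
  DEMO_ALLOC_AND_COPY (ei, src__->data, src__->charlen,         \
                       src__->bytelen);                         \
} while (0)
```

Because this version allocates on the heap, the caller has to free() the data pointer, which is exactly the cleanup the alloca() default is meant to avoid.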