"Stephen J. Turnbull" <turnbull(a)sk.tsukuba.ac.jp> writes:
> Hrvoje> ...the point I'm trying to make is that interface to
> Hrvoje> user-visible Lisp functions should be well thought-out, or
> Hrvoje> problems will arise.
>
> Yes, indeed. I shudder to think what working on Mule would be like
> if Ben had not founded his work on that principle as applied to
> internal APIs.
Agreed. Ben has done excellent work, and I cannot help but wonder
what the original MULE 2.0 must have looked like. For people who
don't understand what I'm talking about, here are our version and the
latest FSF version of Fcompare_buffer_substrings (the latter having
just been rewritten by Stallman):
XEmacs:
DEFUN ("compare-buffer-substrings", Fcompare_buffer_substrings, 6, 6, 0, /*
Compare two substrings of two buffers; return result as number.
the value is -N if first string is less after N-1 chars,
+N if first string is greater after N-1 chars, or 0 if strings match.
Each substring is represented as three arguments: BUFFER, START and END.
That makes six args in all, three for each substring.
The value of `case-fold-search' in the current buffer
determines whether case is significant or ignored.
*/
       (buffer1, start1, end1, buffer2, start2, end2))
{
  Bufpos begp1, endp1, begp2, endp2;
  REGISTER Charcount len1, len2, length, i;
  struct buffer *bp1, *bp2;
  Lisp_Object trt = ((!NILP (current_buffer->case_fold_search)) ?
                     current_buffer->case_canon_table : Qnil);
  /* Find the first buffer and its substring. */
  bp1 = decode_buffer (buffer1, 1);
  get_buffer_range_char (bp1, start1, end1, &begp1, &endp1, GB_ALLOW_NIL);
  /* Likewise for second substring. */
  bp2 = decode_buffer (buffer2, 1);
  get_buffer_range_char (bp2, start2, end2, &begp2, &endp2, GB_ALLOW_NIL);
  len1 = endp1 - begp1;
  len2 = endp2 - begp2;
  length = len1;
  if (len2 < length)
    length = len2;
  for (i = 0; i < length; i++)
    {
      Emchar c1 = BUF_FETCH_CHAR (bp1, begp1 + i);
      Emchar c2 = BUF_FETCH_CHAR (bp2, begp2 + i);
      if (!NILP (trt))
        {
          c1 = TRT_TABLE_OF (trt, c1);
          c2 = TRT_TABLE_OF (trt, c2);
        }
      if (c1 < c2)
        return make_int (- 1 - i);
      if (c1 > c2)
        return make_int (i + 1);
    }
  /* The strings match as far as they go.
     If one is shorter, that one is less. */
  if (length < len1)
    return make_int (length + 1);
  else if (length < len2)
    return make_int (- length - 1);
  /* Same length too => they are equal. */
  return Qzero;
}
The latest FSFmacs:
DEFUN ("compare-buffer-substrings", Fcompare_buffer_substrings, Scompare_buffer_substrings,
6, 6, 0,
"Compare two substrings of two buffers; return result as number.\n\
the value is -N if first string is less after N-1 chars,\n\
+N if first string is greater after N-1 chars, or 0 if strings match.\n\
Each substring is represented as three arguments: BUFFER, START and END.\n\
That makes six args in all, three for each substring.\n\n\
The value of `case-fold-search' in the current buffer\n\
determines whether case is significant or ignored.")
  (buffer1, start1, end1, buffer2, start2, end2)
     Lisp_Object buffer1, start1, end1, buffer2, start2, end2;
{
  register int begp1, endp1, begp2, endp2, temp;
  register struct buffer *bp1, *bp2;
  register Lisp_Object *trt
    = (!NILP (current_buffer->case_fold_search)
       ? XCHAR_TABLE (current_buffer->case_canon_table)->contents : 0);
  int chars = 0;
  int i1, i2, i1_byte, i2_byte;
  /* Find the first buffer and its substring. */
  if (NILP (buffer1))
    bp1 = current_buffer;
  else
    {
      Lisp_Object buf1;
      buf1 = Fget_buffer (buffer1);
      if (NILP (buf1))
        nsberror (buffer1);
      bp1 = XBUFFER (buf1);
      if (NILP (bp1->name))
        error ("Selecting deleted buffer");
    }
  if (NILP (start1))
    begp1 = BUF_BEGV (bp1);
  else
    {
      CHECK_NUMBER_COERCE_MARKER (start1, 1);
      begp1 = XINT (start1);
    }
  if (NILP (end1))
    endp1 = BUF_ZV (bp1);
  else
    {
      CHECK_NUMBER_COERCE_MARKER (end1, 2);
      endp1 = XINT (end1);
    }
  if (begp1 > endp1)
    temp = begp1, begp1 = endp1, endp1 = temp;
  if (!(BUF_BEGV (bp1) <= begp1
        && begp1 <= endp1
        && endp1 <= BUF_ZV (bp1)))
    args_out_of_range (start1, end1);
  /* Likewise for second substring. */
  if (NILP (buffer2))
    bp2 = current_buffer;
  else
    {
      Lisp_Object buf2;
      buf2 = Fget_buffer (buffer2);
      if (NILP (buf2))
        nsberror (buffer2);
      bp2 = XBUFFER (buf2);
      if (NILP (bp2->name))
        error ("Selecting deleted buffer");
    }
  if (NILP (start2))
    begp2 = BUF_BEGV (bp2);
  else
    {
      CHECK_NUMBER_COERCE_MARKER (start2, 4);
      begp2 = XINT (start2);
    }
  if (NILP (end2))
    endp2 = BUF_ZV (bp2);
  else
    {
      CHECK_NUMBER_COERCE_MARKER (end2, 5);
      endp2 = XINT (end2);
    }
  if (begp2 > endp2)
    temp = begp2, begp2 = endp2, endp2 = temp;
  if (!(BUF_BEGV (bp2) <= begp2
        && begp2 <= endp2
        && endp2 <= BUF_ZV (bp2)))
    args_out_of_range (start2, end2);
  i1 = begp1;
  i2 = begp2;
  i1_byte = buf_charpos_to_bytepos (bp1, i1);
  i2_byte = buf_charpos_to_bytepos (bp2, i2);
  while (i1 < endp1 && i2 < endp2)
    {
      /* When we find a mismatch, we must compare the
         characters, not just the bytes. */
      int c1, c2;
      if (! NILP (bp1->enable_multibyte_characters))
        {
          c1 = BUF_FETCH_MULTIBYTE_CHAR (bp1, i1_byte);
          BUF_INC_POS (bp1, i1_byte);
          i1++;
        }
      else
        {
          c1 = BUF_FETCH_BYTE (bp1, i1);
          c1 = unibyte_char_to_multibyte (c1);
          i1++;
        }
      if (! NILP (bp2->enable_multibyte_characters))
        {
          c2 = BUF_FETCH_MULTIBYTE_CHAR (bp2, i2_byte);
          BUF_INC_POS (bp2, i2_byte);
          i2++;
        }
      else
        {
          c2 = BUF_FETCH_BYTE (bp2, i2);
          c2 = unibyte_char_to_multibyte (c2);
          i2++;
        }
      if (trt)
        {
          c1 = XINT (trt[c1]);
          c2 = XINT (trt[c2]);
        }
      if (c1 < c2)
        return make_number (- 1 - chars);
      if (c1 > c2)
        return make_number (chars + 1);
      chars++;
    }
  /* The strings match as far as they go.
     If one is shorter, that one is less. */
  if (chars < endp1 - begp1)
    return make_number (chars + 1);
  else if (chars < endp2 - begp2)
    return make_number (- chars - 1);
  /* Same length too => they are equal. */
  return make_number (0);
}
59 and 156 lines respectively, for *completely* equivalent
functionality. So much for XEmacs being "large" and "bloated". I
won't even comment on the clarity of the code. Just regard the
beautiful code repetition between the BUF_FETCH_BYTE and
BUF_FETCH_MULTIBYTE_CHAR cases.
And the original MULE was likely much *worse* than this.
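For anyone who wants to poke at the -N/+N return convention outside of
Emacs, here is a minimal standalone sketch of the same comparison over
plain C byte strings. The function name compare_substrings and its
fold_case parameter are made up for illustration and appear in neither
source tree; fold_case only imitates `case-fold-search' with ASCII
tolower(), which is an assumption, not how the case tables work.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical standalone helper, not part of either Emacs source tree.
   Compares S1[0..LEN1) with S2[0..LEN2) using the convention from the
   docstring: -N if the first string is less after N-1 matching chars,
   +N if it is greater, 0 if the strings match.  FOLD_CASE stands in
   for `case-fold-search' and uses plain ASCII tolower().  */
long
compare_substrings (const char *s1, size_t len1,
                    const char *s2, size_t len2, int fold_case)
{
  size_t length = len1 < len2 ? len1 : len2;
  size_t i;

  for (i = 0; i < length; i++)
    {
      int c1 = (unsigned char) s1[i];
      int c2 = (unsigned char) s2[i];

      if (fold_case)
        {
          c1 = tolower (c1);
          c2 = tolower (c2);
        }
      if (c1 < c2)
        return -1 - (long) i;
      if (c1 > c2)
        return (long) i + 1;
    }

  /* The strings match as far as they go; the shorter one is less.  */
  if (length < len1)
    return (long) length + 1;
  if (length < len2)
    return -(long) length - 1;
  return 0;
}
```

Comparing "abc" with "abd" yields -3 under this convention, since the
first string is less after two matching characters.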
> The bare minimum that will be necessary for any Mule-ized function
> whose value may end up being used by tools outside of XEmacs is
> going to be that it be told how its value will be used, so that it
> can choose an appropriate representation relative to which it will
> compute the value.
That's OK. Again, I definitely realize that Fmd5 must have file-coding
handling built in.
> In the case of Fmd5, it should normally be possible to guess the
> usage from properties of the source of the text, so the variable
> could be used. On the other hand, for other functions it may rarely
> be possible; if such cases are the majority, then for consistency of
> the Mule interface we would want an argument, I think. How do you
> see it?
I don't know enough about Mule to be able to answer this
authoritatively. Offhand, I think I prefer a set of variables set by
the environment, and respected by the primitives such as `md5' and
`save-buffer'.
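
To make the two options a bit more concrete, here is a rough sketch in
plain C of a primitive whose result depends on the external
representation. Every name in it (enum coding, default_file_coding,
encoded_length) is invented for the example and is not XEmacs or Mule
API. The global plays the role of the variable set by the environment;
the optional pointer argument is the explicit alternative, and passing
NULL falls back to the variable.

```c
#include <stddef.h>

/* All names below are hypothetical; none of this is XEmacs API.  */
enum coding { CODING_LATIN1, CODING_UTF8 };

/* Option 1: a variable set by the environment and consulted by
   primitives.  */
enum coding default_file_coding = CODING_LATIN1;

/* Number of bytes one character occupies in the given coding.
   Simplification: Latin-1 is treated as one byte per character
   without checking that the character is representable.  */
static size_t
char_encoded_size (unsigned c, enum coding coding)
{
  if (coding == CODING_LATIN1 || c < 0x80)
    return 1;
  if (c < 0x800)
    return 2;
  if (c < 0x10000)
    return 3;
  return 4;
}

/* Option 2: an explicit argument.  A NULL CODING means "use the
   variable", so both styles can coexist in one primitive.  */
size_t
encoded_length (const unsigned *chars, size_t n, const enum coding *coding)
{
  enum coding effective = coding ? *coding : default_file_coding;
  size_t total = 0;
  size_t i;

  for (i = 0; i < n; i++)
    total += char_encoded_size (chars[i], effective);
  return total;
}
```

With the variable approach the caller changes default_file_coding once
and every primitive follows; with the argument approach each call site
has to say what it wants.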
--
Hrvoje Niksic <hniksic(a)srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
* Q: What is an experienced Emacs user?
* A: A person who wishes that the terminal had pedals.