god damn, a lot of typing over one little issue!  So I feel compelled to add MO :)

BTW Olivier -- we "rang" you.  Look at issue # 3A.

1.  Optimize for space over speed. (That means you, Martin -- don't even THINK of nanosecond optimizing :)
2.  Make your assumptions clear.
     (e.g. ahead of the use of LIST_LOOP_2(), put a comment
      /* Using LIST_LOOP_2 here is safe because all callers already
         called Flength() to trap bogus lists, and have not done
         *anything* that could possibly result in the lists getting
         changed. */
      )

     Note that this assumption is rather fragile.  Thus, I would be tempted to junk this business of copying the list into the vals array, and instead just use EXTERNAL_LIST_LOOP_3() to iterate in *all* cases.  The resulting code is smaller, easier-to-understand, uses up no more space, and much more robust in that it does not make complex assumptions about what its callers just did.  Speed -- ehhem, Martin -- is just not an issue here.  Trust me.

(Ben's rule of thumb w.r.t. optimization: If you're not changing the Big-O behavior in your quest to optimize (i.e. you're "constant-factor-twiddling"), don't bother.  You have more important stuff to do.  Only consider optimizing such stuff when profiling tells you that you actually will gain significantly by doing so.  "Significant" might be defined as an average 1% pervasive improvement that applies across-the-board in a wide variety of regular-user situations or as a 10% improvement to a specific action that a regular user does on a fairly common basis.  For consideration as "significant", all improvement results must apply to *realistic* situations; artificial benchmarks just don't cut it.)

3. As for the extra string copy in mapconcat, that should really be avoided.  One way to avoid it (the stupid way) is to keep your pointer as a Charcount, and then do a Charcount->Bytecount conversion each time.  This avoids the problem of a character being modified earlier in the string with another of different length in the internal representation.  This also, however, adds an O(N^2) step to things, which isn't good.  Two other possibilities (take advantage of all those extra bits freed up by the indexed-lrecord change):

a. a read-only bit.  I think this already exists, in fact.  Just mark the string read-only while iterating over it; then modify the few functions that change strings (aset, set-string-property, etc.) to reflect the read-only state and signal an error.  You could even implement this for conses; then, you could mark the whole list that you're iterating over as read-only (by marking each cons), and you wouldn't run into any of these crashes.

There are tons of other places in XEmacs where having read-only lists and strings would be truly wonderful.  Currently, all over XEmacs there are places where internal data is copied before being returned as a return value -- especially in the specifier and glyph code -- to make sure that it's impossible for someone to modify and thereby mess up internal data.  Huge amounts of copying could be eliminated through read-only lists and strings. Implementation should be easy -- Olivier?

b. a string dirty bit.  If someone thinks we should actually *allow* the string to be modified during the mapping, then we can deal with this situation with a string dirty bit, which is set whenever a modification to the string happens.  If the dirty bit is set, we go back to our Charcount, convert Charcount->Bytecount, and reset the dirty bit; otherwise we continue using our Bytecount as normal: i.e. (off the top of my head)

 else if (STRINGP (seq))
   {
     Bytecount pcnt;
     Bufbyte *p, *oldp;

     for (i = 0, bcnt = 0; i < leni; i++)
       {
         /* The string data of `seq' might be relocated during GC,
            so retrieve the pointer each time it's needed. */
         Bufbyte *origp = XSTRING_DATA (seq);
         if (XSTRING_DIRTY_P (seq))
           {
             bcnt = charcount_to_bytecount (origp, i);
             set_string_dirty_p (seq, 0);
           }
         oldp = p = origp + bcnt;
         args[1] = make_char (charptr_emchar (p));
         INC_CHARPTR (p);
         bcnt += p - oldp;
         result = Ffuncall (2, args);
         if (vals) vals[gcpro1.nvars++] = result;
       }
   }

ben

Martin Buchholz wrote:

>>>>> "Hrv" == Hrvoje Niksic <hniksic@iskon.hr> writes:

Hrv> Martin Buchholz <martin@xemacs.org> writes:
Jan> Why not do that? We can also do away with the caller calling
Jan> Flength when the results are not needed. Then we would only
Jan> loop over the list once and in addition the size of the list we
Jan> can loop over is not limited by the stack size limit.
>>
>> It's not obvious to me which implementation is faster.

Hrv> But yours introduces additional stack-allocation in the `mapc' case.

You guys sure are a tough bunch to please!

Here is Patch version 4, with these new and improved features:

- test cases added to test suite.
- test case functions have docstrings and cool comments.
- much more verbose ChangeLog entries
- use EXTERNAL_LIST_LOOP as suggested by Hrvoje

src/ChangeLog:

1999-12-17  Martin Buchholz  <martin@xemacs.org>

        * fns.c (mapcar1): Fix ***THREE*** obscure crashes in one function!
        - Two of those involve evil mapping functions that destructively
          modify a list being mapped over.
        - Any garbage collection when mapping over a string could cause a
          crash (typically in mapconcat).

tests/ChangeLog:

1999-12-17  Martin Buchholz  <martin@xemacs.org>

        * automated/lisp-tests.el:  Add tests for mapcar1() crashes.

Index: src/fns.c
===================================================================
RCS file: /usr/CVSroot/XEmacs/xemacs/src/fns.c,v
retrieving revision 1.30.2.23
diff -u -w -r1.30.2.23 fns.c
--- fns.c       1999/10/24 03:48:39     1.30.2.23
+++ fns.c       1999/12/17 23:43:31
@@ -3068,14 +3068,54 @@

   if (LISTP (seq))
     {
+      /* A devious `fn' could either:
+        - insert garbage into the list in front of us, causing XCDR to crash
+        - amputate the list behind us using (setcdr), causing the remaining
+          elts to lose their GCPRO status.
+
+        if (vals != 0) we avoid this by copying the elts into the
+        `vals' array.  By a stroke of luck, `vals' is exactly large
+        enough to hold the elts left to be traversed as well as the
+        results computed so far.
+
+        if (vals == 0) we don't have any free space available and
+        don't want to eat up any more stack with alloca().
+        So we use EXTERNAL_LIST_LOOP_3 and GCPRO the tail. */
+
+      if (vals)
+       {
+         Lisp_Object *val = vals;
+         Lisp_Object elt;
+
+         LIST_LOOP_2 (elt, seq)
+             *val++ = elt;
+
+         gcpro1.nvars = leni;
+
       for (i = 0; i < leni; i++)
        {
-         args[1] = XCAR (seq);
-         seq = XCDR (seq);
-         result = Ffuncall (2, args);
-         if (vals) vals[gcpro1.nvars++] = result;
+             args[1] = vals[i];
+             vals[i] = Ffuncall (2, args);
        }
     }
+      else
+       {
+         Lisp_Object elt, tail;
+         struct gcpro ngcpro1;
+
+         NGCPRO1 (tail);
+
+         {
+           EXTERNAL_LIST_LOOP_3 (elt, seq, tail)
+             {
+               args[1] = elt;
+               Ffuncall (2, args);
+             }
+         }
+
+         NUNGCPRO;
+       }
+    }
   else if (VECTORP (seq))
     {
       Lisp_Object *objs = XVECTOR_DATA (seq);
@@ -3088,8 +3128,14 @@
     }
   else if (STRINGP (seq))
     {
-      Bufbyte *p = XSTRING_DATA (seq);
-      for (i = 0; i < leni; i++)
+      /* The string data of `seq' might be relocated during GC. */
+      Bytecount slen = XSTRING_LENGTH (seq);
+      Bufbyte *p = alloca_array (Bufbyte, slen);
+      Bufbyte *end = p + slen;
+
+      memcpy (p, XSTRING_DATA (seq), slen);
+
+      while (p < end)
        {
          args[1] = make_char (charptr_emchar (p));
          INC_CHARPTR (p);
Index: tests/automated/lisp-tests.el
===================================================================
RCS file: /usr/CVSroot/XEmacs/xemacs/tests/automated/Attic/lisp-tests.el,v
retrieving revision 1.1.2.7
diff -u -w -r1.1.2.7 lisp-tests.el
--- lisp-tests.el       1999/09/17 19:30:36     1.1.2.7
+++ lisp-tests.el       1999/12/17 23:43:31
@@ -755,6 +755,29 @@
 (Assert (equal (mapconcat #'identity '("1" "2" "3") "|") "1|2|3"))
 (Assert (equal (mapconcat #'identity ["1" "2" "3"]  "|") "1|2|3"))

+;; The following 2 functions used to crash XEmacs via mapcar1().
+;; We don't test the actual values of the mapcar, since they're undefined.
+(Assert
+ (let ((x (list (cons 1 1) (cons 2 2) (cons 3 3))))
+   (mapcar
+    (lambda (y)
+      "Devious evil mapping function"
+      (when (eq (car y) 2) ; go out onto a limb
+       (setcdr x nil)     ; cut it off behind us
+       (garbage-collect)) ; are we riding a magic broomstick?
+      (car y))             ; sorry, hard landing
+    x)))
+
+(Assert
+ (let ((x (list (cons 1 1) (cons 2 2) (cons 3 3))))
+   (mapcar
+    (lambda (y)
+      "Devious evil mapping function"
+      (when (eq (car y) 1)
+       (setcdr (cdr x) 42)) ; drop a brick wall onto the freeway
+      (car y))
+    x)))
+
 ;;-----------------------------------------------------
 ;; Test vector functions
 ;;-----------------------------------------------------

--
In order to save my hands, I am cutting back on my responses, especially
to XEmacs-related mail.  You _will_ get a response, but please be patient.
If you need an immediate response and it is not apparent in your message,
please say so.  Thanks for your understanding.