>>>> "MS" == Michael Sperber
<sperber(a)informatik.uni-tuebingen.de> writes:
MS> At first I thought it was Gnus because that's currently the most
MS> performance-critical app here: A beta XEmacs 21.2 that's run for a
MS> few days finished this benchmark:
MS>   (let ((count 1000000))
MS>     (while (> count 0)
MS>       (cons 1 2)
MS>       (setq count (- count 1))))
MS> in 52 seconds on my 43p-140. A fresh one, however, does the trick in
MS> 19 seconds.
MS> Is this to be expected because of the debugging code? I don't seem to
MS> remember that it's always been that way. If it's not, this is a
MS> serious problem.
A few points:
- It seems unfair to test the performance of code that hasn't been
  byte-compiled.
- Unfortunately, if you test your example byte-compiled, the byte
  optimizer will optimize away the call to cons.
- Let's try tricking the optimizer into keeping cons.
  (progn
    (clear-profiling-info)
    (byte-compile (defun myfun ()
                    (let ((count 10000000))
                      (while (> count 0)
                        (setq z (cons 1 2))
                        (setq count (- count 1))))))
    (profile (myfun))
    (call-interactively 'profile-results))
(Hrvoje, shouldn't there be a macro to encapsulate this idiom?)
If you run the above, you'll notice that your xemacs process will grow
enormously. I think this is because gc doesn't actually get called.
gc is triggered by eval and funcall, which aren't called above - it
never leaves the bytecode interpreter for the one function. After
some thought, it's obvious that gc should be triggered by lisp object
allocation, not by lisp evaluation. Basically, code that generates
garbage (should) call INCREMENT_CONS_COUNTER, and that should check
for a possible gc. This should also speed up xemacs a tiny bit, since
funcalls should outnumber object allocations, so the check would run
less often. However, code currently relies on Fcons() not calling gc.
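To make that concrete, here is a rough sketch in C of what I mean by
letting allocation drive gc. None of this is the actual XEmacs source:
every name below (note_allocation, maybe_gc_at_safe_point, sketch_cons,
the threshold of 1000 objects) is made up for illustration. Since code
relies on Fcons() not calling gc, the sketch assumes the allocator only
raises a flag and the collection itself runs at a later safe point,
such as a backward branch in the bytecode loop.

```c
#include <stddef.h>

/* Hypothetical sketch, not the XEmacs implementation. */

size_t consing_since_gc = 0;     /* objects allocated since the last gc */
size_t gc_cons_threshold = 1000; /* made-up allocation budget */
int gc_pending = 0;              /* raised by the allocator */
int gc_runs = 0;                 /* number of collections so far */

/* Called from every allocator: count the object and request a gc
   once the budget is exceeded.  It never collects by itself. */
void note_allocation(void)
{
    consing_since_gc++;
    if (consing_since_gc > gc_cons_threshold)
        gc_pending = 1;
}

/* Stand-in for the real collector. */
void collect_garbage(void)
{
    gc_runs++;
    consing_since_gc = 0;
    gc_pending = 0;
}

/* Called where no C code holds unprotected lisp objects
   (assumption: e.g. on a backward branch in the bytecode loop). */
void maybe_gc_at_safe_point(void)
{
    if (gc_pending)
        collect_garbage();
}

struct cons_cell { int car, cdr; };
struct cons_cell scratch_cell;   /* a real allocator would hand out fresh cells */

/* Allocation path: may raise the flag, never runs the collector. */
struct cons_cell *sketch_cons(int car, int cdr)
{
    note_allocation();
    scratch_cell.car = car;
    scratch_cell.cdr = cdr;
    return &scratch_cell;
}

/* The benchmark loop, with a safe point once per iteration. */
int run_loop(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        (void) sketch_cons(1, 2);
        maybe_gc_at_safe_point();
    }
    return gc_runs;
}
```

With this shape, a loop that never leaves the bytecode interpreter
still reaches the collector: run_loop(10000) collects after every
1001st allocation.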
If I try
  (progn
    (clear-profiling-info)
    (profile
     (let ((count 10000000))
       (while (> count 0)
         (setq z (cons 1 2))
         (setq count (- count 1)))))
    (call-interactively 'profile-results))
I get
Function Name              Ticks    %/Total    Call Count
=======================    =====    =======    ==========
setq                        4094     26.909      20000000
(in garbage collection)     3664     24.083
>                           2292     15.065      10000001
-                           2096     13.777      10000000
cons                        1856     12.199      10000000
while                       1212      7.966             1
--------------------------------------------------------
Total                      15214     100.00
But I don't really believe those numbers. Each tick is supposed to be
1 ms, but ps shows 80 ms of cpu time used. At least the garbage is
regularly taken out and recycled during this run, and the process is
only slightly larger afterwards.
A big app like gnus will probably be gc-bound in the same way.
Yes, we need to implement a generational gc.
- Make a write barrier for all lisp object types. (Hmmm, will make setq
  and let slow until lexical variables are implemented)
- Add a generation field to lrecords (perhaps just two generations - the
  buzzword is "object nursery").
- Keep track of pointers from old generations to new.
Easy, right?
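For the curious, the three steps might look roughly like this in C.
This is only a sketch under my own assumptions: struct object,
write_slot and the fixed-size remembered set are invented for
illustration and have nothing to do with the real lrecord layout. It
assumes two generations and a single pointer slot per object.

```c
#include <stddef.h>

/* Hypothetical sketch, not the XEmacs lrecord layout. */

enum generation { GEN_NURSERY = 0, GEN_OLD = 1 };

struct object {
    enum generation gen;   /* the generation field */
    int remembered;        /* already recorded in the remembered set? */
    struct object *slot;   /* the one pointer field of this toy object */
};

#define REMEMBERED_MAX 64
struct object *remembered_set[REMEMBERED_MAX]; /* old objects pointing at young ones */
size_t remembered_count = 0;
int need_full_gc = 0;      /* raised if the remembered set overflows */

/* Write barrier: every pointer store goes through here.  An
   old-to-young store records the holder once, so a nursery
   collection can treat it as an extra root. */
void write_slot(struct object *holder, struct object *value)
{
    holder->slot = value;
    if (value == NULL || holder->remembered)
        return;
    if (holder->gen == GEN_OLD && value->gen == GEN_NURSERY) {
        if (remembered_count == REMEMBERED_MAX) {
            need_full_gc = 1;  /* give up tracking, collect everything */
            return;
        }
        remembered_set[remembered_count++] = holder;
        holder->remembered = 1;
    }
}
```

A nursery collection would then scan the usual roots plus
remembered_set and never walk the old generation. The cost is the
extra branch on every store, which is the slowdown for setq and let
mentioned above.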