On Tue, 22 May 2001, Martin Buchholz stated:
Richard Reingruber is working on a new GC. It can be found in the
CVS branch NewGC-21-2. Richard is quiet. Search for his few postings
on xemacs-beta. Richard is a student of Michael Sperber.
It seems that it needs a bit of a forward port, at least :)
Yoshiki Hayashi has written a copying gc for xemacs recently. It is
not yet ready for prime time. His is precise, i.e. he has to maintain
all the gcpros, and actually add a few, since the copying involves
finding and updating ALL the places on the stack where Lisp_Objects
live.
The real problem with GCPROs is that they are nasty to maintain; the
type-accuracy they provide is good. It occurs to me that I could make
GCPROs maintainable by making all the alloc_*() routines do an automatic
GCPRO (well, do the same thing to the gcpro roots that GCPRO now does),
and could write a fairly simple preprocessor that takes (one file of)
the XEmacs C code and spits out a (transient) source file with
UNGCPRO-alikes inserted at the end of each block in which objects are
allocated. That would let us dump visible GCPROs in the source yet keep
type-accuracy...
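
A minimal sketch of the auto-registering allocator idea, under loud assumptions: `Lisp_Object` is reduced to a bare pointer, and `root_stack`, `roots_mark()`, `roots_release()` and `alloc_object()` are hypothetical names, not XEmacs functions. Unlike the real GCPRO, which records the address of a variable, this records the allocated value, so it would only suit a non-moving collector; early returns and non-local exits are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for XEmacs's Lisp_Object: just a pointer here. */
typedef void *Lisp_Object;

#define MAX_ROOTS 1024

/* Hypothetical root stack: every fresh allocation is recorded here, so a
   precise collector could find it without a hand-written GCPRO. */
static Lisp_Object root_stack[MAX_ROOTS];
static size_t root_top = 0;

/* Emitted by the imagined preprocessor at the start of a block:
   remember the current height of the root stack. */
static size_t roots_mark(void)
{
    return root_top;
}

/* Emitted at the end of a block (the "UNGCPRO-alike"): drop every root
   registered since the matching roots_mark(). */
static void roots_release(size_t mark)
{
    assert(mark <= root_top);
    root_top = mark;
}

/* Stand-in for an alloc_*() routine that registers its result itself. */
static Lisp_Object alloc_object(size_t size)
{
    Lisp_Object obj = malloc(size);
    assert(obj != NULL);
    assert(root_top < MAX_ROOTS);
    root_stack[root_top++] = obj;
    return obj;
}

/* A function as it might look after preprocessing; the hand-written
   source would contain only the two alloc_object() calls. */
static size_t example_block(void)
{
    size_t mark = roots_mark();        /* inserted by the preprocessor */
    Lisp_Object a = alloc_object(16);
    Lisp_Object b = alloc_object(32);
    size_t live = root_top - mark;     /* both objects are rooted here */
    roots_release(mark);               /* inserted by the preprocessor */
    free(a);                           /* no real collector in this sketch */
    free(b);
    return live;
}
```

Calling `example_block()` registers two roots inside the block and leaves the root stack empty afterwards, which is the property the inserted UNGCPRO-alikes would have to guarantee.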
I've taken a look at the Boehm gc in the past, considering its
possible use in XEmacs. I rejected using it because:
- it's non-portable (reading its portability layer is frightening)
That's my biggest worry, too. I don't much mind which GC I end up with;
my primary goals are zapping the GCPRO construct as it currently
exists (although keeping its type-accuracy would be nice) and getting
something with better VM behaviour. Both of these can be done without
actually replacing the GC; replacing it just seemed simpler.
However, the gcc project has recently started using boehm gc. Since
gcc is very portable, they must be making any necessary portability
improvements to boehm gc as they go.
Not yet. There are many architectures that GCC works on that libjava and
parts of libobjc (the primary users of that gc) do not work on, and the
boehm-gc in the GCC tree is a rather aged implementation.
Hans has said that when version 6 comes out he will make the GCC
tree's copy of the collector the master copy; if I'm still using the
boehm-gc in XEmacs by then, I'll migrate it to that version.
Hmmm. It seems to me mark() has to examine most of the heap
anyways.
Unless you do a lot of work, boehm gc might end up examining even more
memory than the existing mark. Most of the guts of existing
Lisp_Objects are other Lisp_Objects, so there isn't a whole lot of
scope for GC_malloc_atomic() to be a big win.
This isn't true of anything except for cons cells and other sequences,
and things like hash tables, is it? For `leaf' objects (string data &c),
a conservative collector like Boehm's would wind up scanning them for
pointers even though they couldn't possibly contain pointers to anything.
(It's true that the gains from this bit over the *existing* allocator
would be marginal; it reduces the negative effects of using a
conservative collector substantially, though. It's a tradeoff; GCPRO
versus the occasional malloc_atomic()... but the more I think about it
the more it seems to me that a preprocessor that automatically inserts
UNGCPROs is the right way to go; type-accuracy *and* maintainability,
wow ;) )
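
To make the tradeoff concrete, here is a toy model (not Boehm's real data layout) of how much memory a conservative marker has to inspect with and without a pointer-free hint of the kind GC_malloc_atomic() provides. `struct toy_object` and `words_scanned()` are hypothetical names, and every object is assumed reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header for a toy heap. */
struct toy_object {
    size_t n_words;   /* payload size in words */
    int atomic;       /* nonzero: payload holds no pointers (string data &c) */
};

/* Count the payload words a conservative marker would have to inspect.
   With honour_atomic == 0 every word is treated as a possible pointer. */
static size_t words_scanned(const struct toy_object *objs, size_t n,
                            int honour_atomic)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (honour_atomic && objs[i].atomic)
            continue;                  /* leaf object: nothing to trace */
        total += objs[i].n_words;
    }
    return total;
}
```

For a heap holding a 2-word cons, a 1000-word string body and a 10-word vector, the marker inspects 1012 words if it ignores the hint and 12 if it honours it, so the win depends entirely on how much of the heap is leaf data.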
We can also put the mark-bits into a bitmap without complete
boehmification. Understanding and extracting that code from boehm gc
would be a much smaller and still very instructive project.
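
A minimal sketch of what a mark bitmap looks like, assuming a single fixed-size heap array with one mark bit per word; `heap`, `mark_bits`, `set_mark()`, `is_marked()` and `clear_marks()` are hypothetical names, and neither Boehm's collector nor the existing XEmacs allocator is laid out this way. The usual attraction is that marking writes only to the bitmap, so pages holding the objects themselves are not dirtied by a collection.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEAP_WORDS 4096
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Toy heap plus a side table holding one mark bit per heap word. */
static uintptr_t heap[HEAP_WORDS];
static unsigned long mark_bits[(HEAP_WORDS + BITS_PER_WORD - 1) / BITS_PER_WORD];

/* Map a heap address to the index of its mark bit. */
static size_t word_index(const void *p)
{
    const uintptr_t *w = p;
    assert(w >= heap && w < heap + HEAP_WORDS);
    return (size_t)(w - heap);
}

/* Mark an object without touching the object itself. */
static void set_mark(const void *p)
{
    size_t i = word_index(p);
    mark_bits[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

static int is_marked(const void *p)
{
    size_t i = word_index(p);
    return (mark_bits[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

/* Resetting all marks before the next collection is one memset over
   the bitmap instead of a walk over every object. */
static void clear_marks(void)
{
    memset(mark_bits, 0, sizeof mark_bits);
}
```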
True. I think, for the first implementation, I'll go with
-- Richard's infrastructure (general modifications), forward-ported to
21.4; they look praiseworthy, and the forward-porting job doesn't
look too enormous
-- A bitmapped version of the current GC
-- and, if I can get it to work --- and I damned well will ;) --- a
GCPROizing preprocessor, so we can dump *visible* evidence of GCPRO
(that being what does the harm; we can forget it exists completely
if it is automatically maintained!)
(I might even be able to make it so that only one GCPRO/UNGCPRO
operation needs to be done for an entire stack frame, because the
preprocessor knows exactly what Lisp_Objects have been allocated in
a given frame, so it can insert a single operation that does all the
work...)
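
The one-operation-per-frame idea could be sketched roughly as follows. Everything here is an assumption made for illustration: `struct root_frame`, `frame_push()`, `frame_pop()` and `count_live_roots()` are hypothetical names, `Lisp_Object` is reduced to a bare pointer, and the "collector" merely counts non-null locals. Because the frame records the addresses of the locals, a copying collector could also update them in place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for XEmacs's Lisp_Object. */
typedef void *Lisp_Object;

/* One record per stack frame, listing the addresses of all Lisp_Object
   locals the imagined preprocessor found in that frame. */
struct root_frame {
    struct root_frame *prev;
    Lisp_Object **vars;
    size_t n_vars;
};

static struct root_frame *frame_chain = NULL;

/* The single operation inserted at function entry. */
static void frame_push(struct root_frame *f, Lisp_Object **vars, size_t n)
{
    f->prev = frame_chain;
    f->vars = vars;
    f->n_vars = n;
    frame_chain = f;
}

/* The single operation inserted at function exit. */
static void frame_pop(struct root_frame *f)
{
    assert(frame_chain == f);
    frame_chain = f->prev;
}

/* Stand-in for the collector's root scan: visit every registered local
   and count the ones that currently hold an object. */
static size_t count_live_roots(void)
{
    size_t n = 0;
    for (const struct root_frame *f = frame_chain; f != NULL; f = f->prev)
        for (size_t i = 0; i < f->n_vars; i++)
            if (*f->vars[i] != NULL)
                n++;
    return n;
}

static int cell1, cell2;   /* dummy "objects" for the example */

/* A function as it might look after preprocessing. */
static size_t example_frame(void)
{
    Lisp_Object a = NULL, b = NULL, c = NULL;
    Lisp_Object *vars[] = { &a, &b, &c };   /* inserted by the preprocessor */
    struct root_frame frame;
    frame_push(&frame, vars, 3);            /* inserted by the preprocessor */

    a = &cell1;
    b = &cell2;
    size_t live = count_live_roots();       /* a and b are set, c is not */

    frame_pop(&frame);                      /* inserted by the preprocessor */
    return live;
}
```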
I encourage you on your adventure. However, some advice:
- communicate with Richard and Yoshiki.
- consider implementing mark bitmaps only in the existing gc.
Agreed and agreed (see above).
- investigate the version of boehm gc from the gcc project.
Absolutely.
- of course, you do have the blue book called "Garbage Collection",
right?
Of course! Wonderful book it is, too. (I originally bought it many moons
ago with the explicit intention of using it to improve XEmacs's GC, but
it's taken me this long to assimilate it properly.)
--
`LARTing lusers is supposed to be satisfying. This is just tedious. The
silly shite I'm doing now is like trying to toothpick to death a Black
Knight made of jelly.' --- RDD