Er, I believe someone has already done something similar and is writing
their PhD dissertation on it.
andy
At 10:30 PM 5/21/01 +0100, Nix wrote:
I'm currently in the middle of a hack of crazy size, jumping in at the
deep end of XEmacs development, in order to familiarize myself with all
the code.
I'm doing what should have been done a *long* time ago: ditching the
garbage collector and replacing it with a nicer one. I think everyone
can agree that XEmacs's garbage collector currently sucks really rather
hard; I don't think we could make its VM behaviour or intrusiveness any
worse if we tried.
The one I've picked is the Boehm collector; it's not terribly portable,
but that can be fixed fairly easily --- and it lets us dump bloody GCPRO
and all the uglinesses it has spawned over the years (blocking string
compaction, Lispification in redisplay and many other things).
The Boehm collector has only one downside that I can see: unportability
(it's not portable to some of the more obscure platforms that XEmacs
runs on). Upsides include:
- better VM behaviour (mark bits kept in a bitmap, and GC_malloc_atomic()
to state that certain kinds of objects cannot contain pointers). As
it is, most of XEmacs is forced to stay memory-resident all the time
by mark bit setting; that proportion will reduce to just those
parts that contain pointers that must be traced. This is my primary
motivation for this, because I run XEmacs on some fairly memory-poor
machines and I'm fed up of waiting for bloody GC to finish.
- mark bit sanity; we can reclaim at least one bit from many objects,
and might even get lightweight cons cells back (I'm not sure if one
bit saved is enough for that though). Certainly we can junk the myriad
different ways we have to mark things; this has improved a lot
recently, but the boehm-gc can fix it completely.
- (on some platforms) full incrementality/generationality; we can say a
near-total goodbye to GC delays on those platforms
- the death of GCPRO, and therefore probably also of gc_currently_forbidden
and similar variables. This is my other major motivation; portable
though it is, I *hate* GCPRO with a passion. (Besides, walking the
stack is not that unportable; the unportabilities in the Boehm
collector are in other areas, like incremental collection.)
- and last but not least, it's written and actively maintained by
someone who's very accessible and who's one of the best GC hackers
there is, and it is itself acknowledged as probably the best
general-purpose C garbage collector in existence.
Plus, I think it's a fun hack.
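
To illustrate the "mark bits kept in a bitmap" point from the list above,
here is a minimal sketch of a side mark bitmap. It is a toy, not the Boehm
collector's code or anything from XEmacs: the fixed-size heap, the
cons-like object layout and every toy_* name are assumptions made up for
the example. The point it shows is that marking writes only to the small
bitmap, never to the objects themselves, so object pages are not dirtied
by a collection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy heap: fixed-size objects in one array. */
#define TOY_HEAP_OBJECTS 1024

struct toy_object {
    struct toy_object *car;   /* traced pointer field */
    struct toy_object *cdr;   /* traced pointer field */
};

struct toy_object toy_heap[TOY_HEAP_OBJECTS];

/* One mark bit per object, stored apart from the objects. */
unsigned char toy_mark_bitmap[TOY_HEAP_OBJECTS / 8];

/* Map an object address to its slot number in the toy heap. */
size_t toy_index(const struct toy_object *obj)
{
    size_t i = (size_t)(obj - toy_heap);
    assert(i < TOY_HEAP_OBJECTS);
    return i;
}

int toy_is_marked(const struct toy_object *obj)
{
    size_t i = toy_index(obj);
    return (toy_mark_bitmap[i / 8] >> (i % 8)) & 1;
}

void toy_set_mark(const struct toy_object *obj)
{
    size_t i = toy_index(obj);
    toy_mark_bitmap[i / 8] |= (unsigned char)(1u << (i % 8));
}

void toy_clear_marks(void)
{
    memset(toy_mark_bitmap, 0, sizeof toy_mark_bitmap);
}

/* Mark everything reachable from obj. Objects are only read here;
   the only writes go to toy_mark_bitmap. Recurses on car, loops on cdr. */
void toy_mark(const struct toy_object *obj)
{
    while (obj != NULL && !toy_is_marked(obj)) {
        toy_set_mark(obj);
        toy_mark(obj->car);
        obj = obj->cdr;
    }
}
```

The toy still has to read every pointer-bearing object it traces.
Pointer-free objects, which is what GC_malloc_atomic() declares, would
never need to be read by the marker at all.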
I'm implementing it in such a way that the tying of the specific GC
implementation to Emacs is light; much lighter than the current
one's. So if everyone else thinks the boehm-gc sucks, you can just rip
it out and use the infrastructure left behind.
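
One way such a light coupling could look is a small table of function
pointers between the Lisp allocator and whichever collector sits
underneath. This is only a sketch under assumptions: the struct, the
function names and the calloc-based stand-in backend are all hypothetical
and are not taken from the patches described here or from XEmacs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backend interface; none of these names come from XEmacs. */
struct gc_backend {
    void *(*alloc_traced)(size_t size);  /* object may contain pointers */
    void *(*alloc_atomic)(size_t size);  /* object never contains pointers */
    void  (*collect)(void);              /* request a full collection */
};

/* Stand-in backend for the sketch: zeroed calloc memory, never collected.
   A Boehm-based backend would presumably point these slots at
   GC_malloc, GC_malloc_atomic and GC_gcollect instead (an assumption
   about the mapping, not something shown in the post). */
static void *stub_alloc(size_t size)
{
    return calloc(1, size);
}

static void stub_collect(void)
{
    /* Nothing to do: the stand-in backend never frees anything. */
}

const struct gc_backend stub_backend = {
    stub_alloc, stub_alloc, stub_collect
};

static const struct gc_backend *current_backend = &stub_backend;

/* Swap the collector without touching any caller of lisp_alloc(). */
void gc_select_backend(const struct gc_backend *backend)
{
    assert(backend != NULL);
    current_backend = backend;
}

/* The only allocation entry point the rest of the program would see. */
void *lisp_alloc(size_t size, int contains_pointers)
{
    return contains_pointers ? current_backend->alloc_traced(size)
                             : current_backend->alloc_atomic(size);
}

void lisp_collect(void)
{
    current_backend->collect();
}
```

With this shape, ripping out one collector means replacing a single
struct gc_backend instance; callers of lisp_alloc() stay as they are.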
(btw, I think we should under no circumstances implement a copying
collector; they get less and less impressive the bigger the heap and the
longer-lived its objects, and XEmacs has a large heap and a very large
set of long-lived objects, mainly thanks to the obarray.)
I'll be paying *no* attention to unexec() and friends in this, at least
at first; I'll keep the portable dumper working, but if unexec() is
totally broken by some interaction with the boehm gc, I will not mourn
unduly.
I'm doing the changes to 21.4.3 at first, because I know it's pretty
stable otherwise, so that any crashes I may encounter are my fault. Once
I've got it working fairly stably in there, I'll post the patches (and
yes, I'll split the patches up into pieces; I don't want the
GCPRO-removal to drown out the interesting stuff) and let everyone rip
into them... and then I guess I'll forward-port it to 21.5, and everyone
can let GCPRO fade into the mists of memory (maybe with the aid of
a psychiatrist).
--
`LARTing lusers is supposed to be satisfying. This is just tedious. The
silly shite I'm doing now is like trying to toothpick to death a Black
Knight made of jelly.' --- RDD