This sounds like a really fun project!
Although I don't have experience with XEmacs internals, about three
years ago I worked on integrating the GNU Objective-C runtime system
with Boehm's garbage collector. Objective-C, until recently one of only
three languages supported by GCC, is a C-based, object-oriented
language, very dynamic in nature and originally modeled after
Smalltalk. Based on this work, I modified two rather large libraries I
was working on at the time to use the new memory management system.
To describe the layout of memory I used the so-called typed memory,
which lets you describe exactly how a class or structure is laid out
in memory. Essentially this tells the GC where the pointers are in
your data structures, and gives the GC a very fast way to skip over
non-pointer data.
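
As a rough illustration of the idea (a sketch only, not code from
libobjc or from the collector), a layout description can be a bitmap
with one bit per pointer-sized word, set where the word holds a pointer.
The struct and helper names below are hypothetical. Boehm's collector
accepts a bitmap of this general shape through its typed allocation
interface in gc_typed.h (GC_make_descriptor and
GC_malloc_explicitly_typed), which the sketch deliberately does not call
so that it builds without the library.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical object: two pointer fields with plain data in between. */
struct node {
    struct node *next;   /* pointer: the collector must trace it */
    long         length; /* plain data: can be skipped */
    double       weight; /* plain data: can be skipped */
    char        *name;   /* pointer: the collector must trace it */
};

/* One bit per pointer-sized word of the object. */
typedef unsigned long word_bitmap;

/* Set the bit for the pointer-sized word that starts at byte_offset. */
static word_bitmap mark_pointer_word(word_bitmap bitmap, size_t byte_offset)
{
    size_t word_index = byte_offset / sizeof(void *);
    assert(byte_offset % sizeof(void *) == 0);
    assert(word_index < sizeof(word_bitmap) * CHAR_BIT);
    return bitmap | ((word_bitmap)1 << word_index);
}

/* Build the layout description for struct node. */
static word_bitmap node_pointer_bitmap(void)
{
    word_bitmap bitmap = 0;
    bitmap = mark_pointer_word(bitmap, offsetof(struct node, next));
    bitmap = mark_pointer_word(bitmap, offsetof(struct node, name));
    return bitmap;
}

/* The question the collector asks: does word word_index hold a pointer? */
static int word_holds_pointer(word_bitmap bitmap, size_t word_index)
{
    return (int)((bitmap >> word_index) & 1UL);
}
```

The offsets come from offsetof rather than being written by hand, so
the compiler supplies the machine-specific part of the answer.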
It turns out, however, that describing memory layout is not an easy
task. Different machines lay out the same structure differently, and
getting this right is very machine dependent. Currently the piece of
code that determines the memory layout (which in turn is used to
describe the typed memory for Boehm's GC) is the most problematic and
causes the most headaches in the GNU Objective-C runtime library.
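
To make the machine dependence concrete, here is a small sketch (the
struct and function names are hypothetical, and this is not how
encoding.c is written) of the alignment arithmetic a runtime has to
reproduce when it computes field offsets itself instead of asking the
compiler. The same structure puts its pointer field at a different
offset depending on the platform's pointer alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure mixing small fields with a pointer. */
struct mixed {
    char   tag;      /* 1 byte, followed by padding */
    void  *payload;  /* offset depends on pointer alignment */
    short  count;
    double value;
};

/* Probe used to ask the compiler for the alignment of a pointer. */
struct pointer_align_probe {
    char  c;
    void *p;
};
#define POINTER_ALIGN offsetof(struct pointer_align_probe, p)

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Predict where payload lands for a given pointer alignment:
   it follows the one-byte tag, rounded up to that alignment. */
static size_t predicted_payload_offset(size_t pointer_align)
{
    return align_up(offsetof(struct mixed, tag) + sizeof(char), pointer_align);
}
```

With 4-byte pointer alignment the prediction is offset 4, with 8-byte
alignment it is offset 8; any hand-written layout code has to agree with
the compiler on every such rule for every target.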
Just a word of caution if you decide to use typed memory. You may want
to take a look at the GCC code in the Objective-C runtime library, to
see how things are done. Check out gc.c and encoding.c in the libobjc
directory:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libobjc/
Regards,
--
Ovidiu Predescu <ovidiu(a)cup.hp.com>
http://orion.nsr.hp.com/ (inside HP's firewall only)
http://www.geocities.com/SiliconValley/Monitor/7464/ (GNU, Emacs, other stuff)
On 21 May 2001 22:30:52 +0100, Nix <nix(a)esperi.demon.co.uk> wrote:
I'm currently in the middle of a hack of crazy size, jumping in at the
deep end of XEmacs development, in order to familiarize myself with all
the code.
I'm doing what should have been done a *long* time ago; ditching the
garbage collector and replacing it with a nicer one. I think everyone
can agree that XEmacs's garbage collector currently sucks really rather
hard; I don't think there are any ways to worsen its VM behaviour or
intrusiveness if we tried.
The one I've picked is the Boehm collector; it's not terribly portable,
but that can be fixed fairly easily --- and it lets us dump bloody GCPRO
and all the uglinesses it has spawned over the years (blocking string
compaction, Lispification in redisplay and many other things).
The Boehm collector only has one downside that I can see; unportability
(it's not portable to some of the more obscure platforms that XEmacs
runs on). Upsides include:
- better VM behaviour (mark bits kept in a bitmap, and GC_malloc_atomic()
to state that certain kinds of objects cannot contain pointers). As
it is most of XEmacs is forced to stay memory-resident all the time
by mark bit setting; that proportion will reduce to just those
parts that contain pointers that must be traced. This is my primary
motivation for this because I run XEmacs on some fairly memory-poor
machines and I'm fed up of waiting for bloody GC to finish.
- mark bit sanity; we can reclaim at least one bit from many objects,
and might even get lightweight cons cells back (I'm not sure if one
bit saved is enough for that though). Certainly we can junk the myriad
different ways we have to mark things; this is very much improved
recently but the boehm-gc can fix it completely
- (on some platforms) full incrementality/generationality; we can say a
near-total goodbye to GC delays on those platforms
- the death of GCPRO, and therefore probably also of gc_currently_forbidden
and similar variables. This is my other major motivation; portable
though it is, I *hate* GCPRO with a passion. (Besides, walking the
stack is not that unportable; the unportabilities in the Boehm
collector are in other areas, like incremental collection.)
- and last but not least, it's written and actively maintained by
someone who's very accessible and who's one of the best GC hackers
there is, and it is itself acknowledged as probably the best
general-purpose C garbage collector in existence.
Plus, I think it's a fun hack.
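
The point about mark bits kept in a bitmap can be sketched as follows.
This is a toy model with hypothetical names, not the Boehm collector's
actual data structures: the mark state lives in its own array, so
marking a cell never writes to the cell itself, and pages holding
objects do not have to be touched just to record that they are live.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define HEAP_CELLS   64
#define WORD_BITS    (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((HEAP_CELLS + WORD_BITS - 1) / WORD_BITS)

/* Toy cons cell with no mark bit stolen from either field. */
struct cell {
    void *car;
    void *cdr;
};

/* Toy heap: the cells, plus a side bitmap with one mark bit per cell. */
struct toy_heap {
    struct cell   cells[HEAP_CELLS];
    unsigned long marks[BITMAP_WORDS];
};

/* Map a cell pointer to its index in the heap. */
static size_t cell_index(const struct toy_heap *heap, const struct cell *c)
{
    size_t index = (size_t)(c - heap->cells);
    assert(index < HEAP_CELLS);
    return index;
}

/* Record that a cell is live; only the bitmap is written. */
static void mark_cell(struct toy_heap *heap, const struct cell *c)
{
    size_t i = cell_index(heap, c);
    heap->marks[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

/* Test whether a cell was marked during this collection. */
static int is_marked(const struct toy_heap *heap, const struct cell *c)
{
    size_t i = cell_index(heap, c);
    return (int)((heap->marks[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL);
}

/* Reset all marks before the next collection. */
static void clear_marks(struct toy_heap *heap)
{
    memset(heap->marks, 0, sizeof heap->marks);
}
```

In a real collector the bitmap would live well away from the objects it
describes; the toy keeps both in one struct only to stay short.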
I'm implementing it in such a way that the tying of the specific GC
implementation to Emacs is light; much lighter than the current
one's. So if everyone else thinks the boehm-gc sucks, you can just rip
it out and use the infrastructure left behind.
(btw, I think we should under no circumstances implement a copying
collector; they get less and less impressive the bigger the heap and the
longer-lived its objects, and XEmacs has a large heap and a very large
set of long-lived objects, mainly thanks to the obarray.)
I'll be paying *no* attention to unexec() and friends in this, at least
at first; I'll keep the portable dumper working, but if unexec() is
totally broken by some interaction with the boehm gc, I will not mourn
unduly.
I'm doing the changes to 21.4.3 at first, because I know it's pretty
stable otherwise, so that any crashes I may encounter are my fault. Once
I've got it working fairly stably in there, I'll post the patches (and
yes, I'll split the patches up into pieces; I don't want the
GCPRO-removal to drown out the interesting stuff) and let everyone rip
into them... and then I guess I'll forward-port it to 21.5, and everyone
can let GCPRO fade into the mists of memory (maybe with the aid of
a psychiatrist).
--
`LARTing lusers is supposed to be satisfying. This is just tedious. The
silly shite I'm doing now is like trying to toothpick to death a Black
Knight made of jelly.' --- RDD