[Sorry for the delay; a cold. I didn't feel I could respond to this
until I was capable of thinking again...]
On Tue, 22 May 2001, Martin Buchholz said:
> The alloc routines know what they're allocating, but they don't know
> about the references to the memory being allocated. Unless you do
> something very radical, you have to gcpro the address of the
> Lisp_Object on the stack that is about to receive the pointer to the
> allocated memory. This means that Fcons can't gcpro anything.
Gaah. You're right, of course. (Stupid of me.)
> Perhaps I'm missing something.
No, of course not; but this doesn't add much complexity to what I'm
planning --- which probably indicates that it's hugely overcomplicated
already :)
> The standard tricky thing about implementing a preprocessor that adds
> gcpros automatically is that sometimes the Lisp_Objects that have to
> be protected aren't even declared - they're temporaries.
Absolutely.
> For example
>
>     return Fcons_user1 (Fcons_user2 (Fcons (Qnil, Qnil)));
>
> This has to be expanded into
>
>     {
>       Lisp_Object x = Qnil, y = Qnil, z = Qnil;
>       struct gcpro gcpro1, gcpro2, gcpro3;
>       GCPRO3 (x, y, z);
>       x = Fcons (Qnil, Qnil);
>       y = Fcons_user2 (x);
>       z = Fcons_user1 (y);
>       UNGCPRO;
>       return z;
>     }
This was almost exactly the transformation I was expecting to have to
apply, although I intend that the GCPRO and UNGCPRO macros won't survive
in their current form under this preprocessor; the limits on the number
of objects that can be GCPROed in one block *will* go away, for
instance; that limit irks me. (Of course their replacements will do the
same thing, and will probably be the same code, just emitted already
expanded.)
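For anyone who hasn't stared at lisp.h recently, the mechanism under
discussion is roughly the following. This is a simplified sketch, not the
literal XEmacs macros: the names GCPRO1, UNGCPRO, struct gcpro and
gcprolist follow the real ones, but the bodies here are cut down for
illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long Lisp_Object;            /* stand-in for the real tagged type */
#define Qnil ((Lisp_Object) 0)

/* One record per GCPROn use, chained through the C stack. */
struct gcpro {
    struct gcpro *next;
    Lisp_Object *var;                /* address of the protected stack slot */
    int nvars;                       /* number of consecutive slots */
};

static struct gcpro *gcprolist;      /* innermost active record */

/* The real macros come in fixed arities (GCPRO1, GCPRO2, ...) and rely on
   conventionally named locals (gcpro1, gcpro2, ...) -- that is the
   per-block limit complained about above. */
#define GCPRO1(a)                                               \
    (gcpro1.next = gcprolist, gcpro1.var = &(a),                \
     gcpro1.nvars = 1, gcprolist = &gcpro1)
#define UNGCPRO (gcprolist = gcpro1.next)

/* At GC time the collector walks gcprolist and marks every protected
   slot; here we merely count the slots to show the traversal. */
static int count_protected(void) {
    int n = 0;
    for (struct gcpro *g = gcprolist; g != NULL; g = g->next)
        n += g->nvars;
    return n;
}

/* Typical mutator-side usage. */
static int demo(void) {
    Lisp_Object tem = Qnil;
    struct gcpro gcpro1;
    GCPRO1 (tem);                    /* tem is now visible as a GC root */
    int live = count_protected();
    UNGCPRO;                         /* must be popped before returning */
    return live;
}
```

The fixed arities, and the reliance on locals with blessed names, are
exactly the parts a preprocessor emitting already-expanded protection code
can dispense with; the linked-list discipline itself stays.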
> (this example could of course be optimized - but this is the obvious
> translation)
... and probably the one I'd apply.
> (this example also shows how incredibly ugly gcpros are)
A good reason to automate away their creation then, so no humans have to
look at them :)
> The current source code does a lot fewer GCPROs than the above,
> because of detailed knowledge of various functions (can they gc, or
This is a very bad idea, I think. It means that much of the XEmacs
source code has detailed knowledge of the internal implementation
details of other parts of the source code; this reduces stability ('cos
some *will* be missed) and drastically increases the difficulty of
changing anything.
I think this should be handled in a similar way to the way GCC works out
what registers are going to be used in subfunctions; assume they'll all
be clobbered. The expenditure of time is minimal, in any case.
(If the time wastage gets extreme in some hot spots, we can optimize the
hot spots; but in general I think that every stack slot that holds a
Lisp_Object should be protected. XEmacs isn't GCPRO-bound, it's GC-bound
;) )
> could they return a fresh object). It is possible to determine this
> kind of information from a global analysis of the source code, but it
I'm not even going to *try* to do that. It's probably impossible in the
general case, anyway (I'd need to think about it, but I'd be surprised
if that kind of detail of global analysis didn't reduce to the halting
problem.)
> won't be easy. In particular, the C preprocessor will tend to make
> the source more resistant to automated understanding by your gcpro
> preprocessor. E.g.
I'm going to stitch this in after the preprocessor has run on a given
source file. (I'd have to preprocess it anyway, so it seems sensible to
only preprocess it once!)
Oh, FWIW the preprocessor should be able to handle just about anything,
'cos I'm cheating. I'm taking c-parse.y and cpplex.c from GCC, and
tearing them down into something that can unambiguously spot C blocks
(`stmts_and_decls'), assignments (`expr_no_commas'), Lisp_Object
variable declarations (`decl' nodes), and temporaries. Everything else
that I can throw away I will; we don't need a complete C parser, let
alone the Objective C parts (sorry, Ovidiu ;) ).
In one respect it'll be more complicated than GCC's parser; it needs to
maintain knowledge of where in the pre-lexed sources a given lexical
component comes from, so we can rewrite the temporaries; but it
shouldn't be terribly hard. (I think GCC also contains some
temporary-rewriting stuff I can pinch; building a temporary is an
RTL-level equivalent of what this preprocessor would have to do at the
source code level.)
Fun. (But if this were C++ it would not be fun, it would be torment...)
> Sorry, I meant there's little scope for a Boehm mark() to be better
> than the existing mark(), because the existing one, via lisp object
> type mark methods, already knows exactly which object components might
> contain other Lisp_Objects. The mark bitmap won't help you much
> during the mark phase; in fact it ought to make things worse.
Hmm. Why is that? The current garbage collector has appalling VM
behaviour: every Lisp_Object is sucked back into memory, even those that
cannot contain other Lisp_Objects. With a mark bitmap, a Lisp_Object is
only accessed for pointer tracing; if it cannot contain another
Lisp_Object, it won't be touched at all.
I'll admit that I haven't instrumented the existing gc to find out what
percentage of objects are accessed only to set the mark bit; I'll do so
as soon as this cold has worn off. (I'll feel a right idiot if the
answer is something like 1%, too...)
> The place where the Boehm gc shines performance-wise is not during gc
> at all, but in the fact that gcpros can be dispensed with entirely
> while the mutator runs. Normally the advantage of mark&sweep over
> reference counting is: not slowing down the mutator. But gcproing does
> slow down the mutator. But I really don't know how much overhead we
> have from GCPROing, and it's hard to measure. My random guess is that
> 10% of the runtime of lisp function calls that do no work (i.e. return
> immediately) is taken up with gcproing.
However, a (not huge) general slowdown like that is hard for humans to
spot. What humans definitely *do* spot (well, I know I've spotted it,
and so have friends of mine, new XEmacs hands and old) is the total
stoppage we get whenever GCing runs and the world freezes except for the
hammering of the disk :( if XEmacs is totally in swap, it can be frozen
for a minute or more while GC runs.
GCPROs suck, but the only ways we can eliminate them are conservative
collection (not very portable and can leave cruft around, as you said),
some really hefty global analysis (I'll leave that for a braver soul),
or a compiler that knows how to emit typing information (and while we
could probably get an option added to GCC that would emit such
information, I don't think we can decree that XEmacs can only be built
by GCC-3.1 and above...)
Given Moore's Law and the fact that machines with small amounts of
memory are getting steadily harder to find and steadily harder to run
late versions of XEmacs on (and probably don't see many upgrades
anyway), I am happy to *increase* the number of GCPROs substantially, to
ensure that we can freely modify arbitrary functions without triggering
GC hell. After all, GCPROs are really quite efficiently implemented;
it's not as if they call malloc() or something like that. (As a
*structure*, the gcprolist is praiseworthy. It's just the manual way
it's kept updated that's disgusting and restrictive.)
> Nix> -- and, if I can get it to work --- and I damned well will ;) --- a
> Nix> GCPROizing preprocessor, so we can dump *visible* evidence of GCPRO
> Nix> (that being what does the harm; we can forget it exists completely
> Nix> if it is automatically maintained!)
>
> Harder than it looks, as I try to point out above. I'll be very
> impressed if you can produce a reliable GCPROizing preprocessor.
We have the advantage of a pre-existing free software project that is
quite capable of understanding C code (GCC), and which does far more
elaborate transformations than this preprocessor will have to; and I'm
happy to nick code from it. (I'll probably mention such large-scale
borrowing on the GCC list if I do it, if Ovidiu doesn't beat me to
it...)
The GCC C parser is actually quite neat, as such things go...
> I think you'll have to run the source through the C preprocessor
> first, which means your gcproizer has to run every build. Using the
Yes.
> standard Unix utilities, this will be hard (i.e. you'll want to use
> something like perl or python). Most free software projects don't
Not a chance; this'll be in C. Parsing is what yacc is good for, and as
usual we'd ship the results of running yacc on the preprocessor's .y
files, so the builders won't need yacc.
(Plus, GCC is already written in C, and while I'm willing to attack
c-parse.y and cpplex with a stone axe to form the core of this
preprocessor, I'm really not willing to translate them into Perl; I have
more taste than that ;) )
> have such dependencies for non-maintainer-mode, but I think the time
> might be right to introduce such a dependency for xemacs.
No need. (I wouldn't want that dependency for purely personal reasons;
one of the sites I run xemacs on has no perl or python, and I can't
install either because I just barely have the space for xemacs. I know
where *my* priorities lie ;) )
> ObAdvice: Have you read the internals manual section that deals with
> gc?
Yes, of course, long before I mentioned this plan here, when I harboured
the secret desire that the fix to XEmacs's GC performance would be
simple. (Then I read of GCPRO and felt quite ill...)
--
`LARTing lusers is supposed to be satisfying. This is just tedious. The
silly shite I'm doing now is like trying to toothpick to death a Black
Knight made of jelly.' --- RDD