Re: Forthcoming revisions to the garbage collector; crazy huge hack

Sunday, 27 May 2001

        On Sat, 26 May 2001, Ben Wing yowled:
...
 hi.  i've read some of this thread, and i wish you luck. 
Thank you!

...
 a few comments:

 1. awhile ago i wrote this:

 http://www.xemacs.org/Architecting-XEmacs/lisp-engine-replacement.html 
Hmm. Interesting. I agree with most of what that says, and it's nice to
see I'm not going down an untrodden thought-path.

I disagree that a self-hosting preprocessor (i.e. a preprocessor whose
language is Emacs Lisp) would be terribly difficult. The Lisp reader and
string manipulation code are fairly self-contained as they stand; making
them completely so ought not to be too hard. All that would need to be
done after that is to produce a bootstrap elisp interpreter by basically
taking lread.c and eval.c and tying them together with a framework that
simply doesn't bother to GC at all (so the lack of GCPROs in the
interpreter is irrelevant).

Then the makefile uses that to generate a `real' preprocessor by running
it over the Lisp reader, runs *that* preprocessor over the Lisp reader
too, and diffs the two outputs. If there are no differences, we know the
preprocessor works, and the makefile proceeds to use it to build all the
rest of XEmacs.

Self-hosting has a single really dull bit of slogwork (the initial
construction of the bootstrap Lisp interpreter), but that only needs to
be done occasionally, and it doesn't matter if it gets out of synch with
the real Lisp interpreter, as long as it can understand enough of the
subset of the language used by the preprocessor to preprocess itself.

I also disagree that replacing GCPROs with a conservative collector is
necessarily a good idea. I thought so until recently, but if it's
already been tried and was really inefficient, or if incorrect
identification of possible pointers on the stack was leaking a lot of
memory, then we can't do that and we need GCPRO. (Or rather, we need
automatically inserted GCPROs.)
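
To make the leak concrete, here is a toy sketch of conservative scanning. It is not XEmacs code and not any real collector's API: the heap layout, `conservative_scan` and `retained_blocks` are all hypothetical names invented for illustration. The scanner cannot tell an integer from a pointer, so an integer whose value happens to fall inside the heap keeps a dead block alive.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBLOCKS    4
#define BLOCK_SIZE 64

/* Toy heap (hypothetical): four fixed-size blocks with one mark bit each. */
static unsigned char heap[NBLOCKS][BLOCK_SIZE];
static bool marked[NBLOCKS];

/* Treat every word as a potential pointer: if its value lands anywhere
   inside the heap, mark the block it lands in.  Comparing uintptr_t
   values like this assumes a flat address space. */
void conservative_scan(const uintptr_t *words, size_t nwords)
{
    uintptr_t lo = (uintptr_t)&heap[0][0];
    uintptr_t hi = lo + sizeof heap;

    for (size_t i = 0; i < NBLOCKS; i++)
        marked[i] = false;
    for (size_t i = 0; i < nwords; i++)
        if (words[i] >= lo && words[i] < hi)
            marked[(words[i] - lo) / BLOCK_SIZE] = true;
}

/* Scan a simulated stack holding one genuine pointer and one integer
   that merely looks like a pointer; return how many blocks survive. */
size_t retained_blocks(void)
{
    uintptr_t fake_stack[3];
    size_t n = 0;

    fake_stack[0] = (uintptr_t)&heap[1][0];     /* real reference to block 1 */
    fake_stack[1] = (uintptr_t)&heap[3][0] + 5; /* plain integer, by coincidence inside block 3 */
    fake_stack[2] = 0;                          /* ordinary non-pointer data */

    conservative_scan(fake_stack, 3);
    for (size_t i = 0; i < NBLOCKS; i++)
        if (marked[i])
            n++;
    return n;
}
```

Only block 1 is really referenced, yet `retained_blocks()` reports two survivors: block 3 is garbage that the scan cannot prove dead.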

I also don't think we can say anything much about how much time
automatically inserted `GCPROs everywhere' can take until we've tried
it; updating a GCPRO list is not a very expensive operation, after all.
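
As a rough illustration of why the per-call cost is small, here is a simplified sketch of a GCPRO-style root list. The names (`root_frame`, `protect`, `unprotect`, `count_roots`, `demo`) are hypothetical and this is not the actual XEmacs macro set; it only assumes the general scheme of a linked list of stack-allocated frames.

```c
#include <assert.h>
#include <stddef.h>

typedef void *obj_t;              /* stand-in for a Lisp object reference */

struct root_frame {
    struct root_frame *next;      /* previously registered frame */
    obj_t *var;                   /* address of the protected local */
};

static struct root_frame *root_list = NULL;

/* Register one local: two stores into the frame, one update of the head. */
void protect(struct root_frame *f, obj_t *var)
{
    f->var = var;
    f->next = root_list;
    root_list = f;
}

/* Unregister the most recent frame: a single store. */
void unprotect(struct root_frame *f)
{
    assert(root_list == f);       /* frames must be released in LIFO order */
    root_list = f->next;
}

/* What a collector would do at mark time: walk the list of roots. */
size_t count_roots(void)
{
    size_t n = 0;
    for (struct root_frame *f = root_list; f != NULL; f = f->next)
        n++;
    return n;
}

/* Protect two locals, count the registered roots, then release them. */
size_t demo(void)
{
    obj_t a = NULL, b = NULL;
    struct root_frame fa, fb;
    size_t live;

    protect(&fa, &a);
    protect(&fb, &b);
    live = count_roots();
    unprotect(&fb);
    unprotect(&fa);
    return live;
}
```

In this sketch registration and release are a handful of pointer stores with no allocation, which is why the total cost of inserting them everywhere is hard to guess without measuring.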

...
 2. my biggest concern is "the last 20%". [it worries me, for example, that
 michael sperber's student may never get to merging his code and hasn't worked
 out his plans to do so.] i have seen a great number of efforts in the past
I'm doing *that* now, if you mean merging the gc branch back to the
head. (At least, I'm merging it to 2.4, and that's nearly the head as
far as volume of changes goes.)

...
                                           "the last 20%"
 actually takes at least 50%-75% of the total time of the project, and isn't the
 fun part.  programmers

For me, the *really* fun part will be when I can run XEmacs without
either weird GCPRO-related crashes (rare, but how can we ever be sure
all of them are gone without automated insertion?) or massive disk
churning due to the garbage collector's mega-paging :)

...
 rarely account for this, and so they typically end up biting off more than they
 can chew, get stuck somewhere in the middle of the "last 20%", get disheartened
 and give up.  i recommend that you spend some time right now and [a] consider
 what your plan is for (1) getting the code merged,
Wrong order. I do the docs *first*; that way I find design faults before
the code is written, and don't have to write the docs later. (I just
have to revise them.)

(In this case it's complicated by needing to understand the system
before I write the docs; I'm counting forward-porting Richard's changes
as `writing the docs'; see below.)

...
                                                    (2) testing it on all major
 platforms and in all important configurations,
What's an `important configuration'? I can test on i586-pc-linux-gnu,
sparc-sun-solaris2.5.1, hppa2.0-hp-hpux10.20, alphaev56-dec-osf4.0d,
i586-pc-cygwin32, and I think I can manage an NT-native Visual C++ build
too.

I'll be frequently testing on i586-pc-linux-gnu, as it's my development
box and I have easy access to it; the others will get big periodic
tests.

...
                                                (3) writing up detailed docs [ala
 my XEmacs Internals Manual] so others can maintain it,
That happens first :) and any changes I make should, I think, go into the
garbage collection and memory allocation nodes in the internals manual.

...
                                                        and (4) working out any
 rough edges [e.g. can we easily configure with your new code either on or off?
I *hate* rough edges :)

In practice I think configuring with code that rips out GCPRO would have
to mean that the GCPROs stay and get sedded out by configure, or
#defined away, and of course that removes all the benefit of removing
the GCPROs (said benefit being maintainability, not anything the user
sees). GCPRO-removal is one of those things that there's little point
configuring out.
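
A minimal sketch of the `#defined away' option, to show why it buys nothing for maintainability. The switch `AUTO_ROOTS` and the macros `PROTECT1` and `UNPROTECT` are hypothetical stand-ins, not real XEmacs names; the counter exists only so the two configurations can be told apart.

```c
#include <stddef.h>

/* Hypothetical switch; a configure script might pass -DAUTO_ROOTS=0 or 1. */
#ifndef AUTO_ROOTS
#define AUTO_ROOTS 1
#endif

/* Counts manual registrations that are currently active. */
int manual_registrations = 0;

#if AUTO_ROOTS
/* Roots are found some other way: the macros compile to nothing. */
#define PROTECT1(v)  ((void)0)
#define UNPROTECT()  ((void)0)
#else
/* Manual mode: stand-in bookkeeping for real root registration. */
#define PROTECT1(v)  ((void)&(v), manual_registrations++)
#define UNPROTECT()  (manual_registrations--)
#endif

/* The call sites are identical in both configurations; only the macro
   expansion changes.  Returns the number of registrations active
   between the two macro calls. */
int peak_registrations(void)
{
    void *obj = NULL;
    int peak;

    PROTECT1(obj);
    peak = manual_registrations;
    UNPROTECT();
    (void)obj;                    /* silence unused warnings when macros are empty */
    return peak;
}
```

With `AUTO_ROOTS` set, `peak_registrations()` returns 0, but every `PROTECT1`/`UNPROTECT` pair is still sitting in the source and still has to be kept correct for the other configuration.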

(Richard's changes, again, probably can't be configured out. Too big,
too pervasive. But I'm not sure about this, and I'll *try* to make them
configurable out...)

As for the GC changes proper: yes, I plan to make those configurable,
although if they cause any functional changes I'll count that as a bug.

...
 in practice we can't afford to throw away the old code until long after the new
 code is in place, made the default, and hammered on to no end]; and [b] compute
 how much effort you think the "last 20%" will take, and then multiply by 3.

3? You're lucky. I normally multiply by 5 ;)

...
 i don't mean to sound pessimistic; i just really want your project to succeed,
This is just allowable paranoia :) `Anything that can go wrong, will';
yes, sure, but then lots of things you didn't think could go wrong will
go wrong *too*.

...
 you probably want to reduce the scope of your first pass down to a bare minimum,
Forward-porting Richard's changes is pass 1. (And testing it, and
documenting it... it appears to be completely undocumented; not even the
changelogs are updated. Fixing *that* latter should be fun; I'd like to
get real changelog entries if they exist, but... the CVS changelog
entries don't really count, they're not detailed enough.)

The forward-porting /per se/ shouldn't be too hard. Just a lot of
slogwork. Bringing the docs up to date will probably take at *least* as
long :) I'm thinking of the forward-port as a documentation job as much
as anything else.

...
 i am offering to help you in whatever capacity i can.  e.g. i probably have a
 better overall sense of how the internals work than anyone else and can help you
 understand unclear areas or interactions between areas.  also due to unfortunate
Thank you! I may well take you up on that :)

...
 personal circumstances i have learned how to do the planning and structuring i
I'm perennially disorganized unless I force myself to be organized. As a
result, on anything I care about I organize *first* so I can forget
about it later on. I already have a plan of attack for the merge, and I
know what order I'm doing the rest in and why. (It might change, as is
usual with such things...)

...
 mentioned above so that i could get done long and difficult tasks given
 sometimes failing motivation or physical ability, and i'll gladly review any
I think of this as `rewarding', not `difficult'. I *like* cleaning up
horrible messes, because I hate the messes so much (GCPRO sounds like a
good candidate ;) )

...
 plan of action you come up with and/or help you construct such a plan.
I may well lob my plan of attack in your direction so you can laugh at
it for being hopelessly impractical then :)

-- 
`Technology is meaningless. What matters is how people _think_
 of it.' --- Linus Torvalds
