[Sorry for the delay; a cold. I didn't feel I could respond to this
until I was capable of thinking again...]
On Tue, 22 May 2001, Martin Buchholz said:
> The alloc routines know what they're allocating, but they don't know
> about the references to the memory being allocated. Unless you do
> something very radical, you have to gcpro the address of the
> Lisp_Object on the stack that is about to receive the pointer to the
> allocated memory. This means that Fcons can't gcpro anything.
Gaah. You're right, of course. (Stupid of me.)
> Perhaps I'm missing something.
No, of course not; but this doesn't add much complexity to what I'm
planning --- which probably indicates that it's hugely overcomplicated
already :)
> The standard tricky thing about implementing a preprocessor that adds
> gcpros automatically is that sometimes the Lisp_Objects that have to
> be protected aren't even declared - they're temporaries.
Absolutely.
> For example
>
>     return Fcons_user1 (Fcons_user2 (Fcons (Qnil, Qnil)));
>
> This has to be expanded into
>
>     {
>       Lisp_Object x = Qnil, y = Qnil, z = Qnil;
>       struct gcpro gcpro1, gcpro2, gcpro3;
>       GCPRO3 (x, y, z);
>       x = Fcons (Qnil, Qnil);
>       y = Fcons_user2 (x);
>       z = Fcons_user1 (y);
>       UNGCPRO;
>       return z;
>     }
This was almost exactly the transformation I was expecting to have to
apply, although I intend that the GCPRO and UNGCPRO macros won't survive
in their current form under this preprocessor; the limits on the number
of objects that can be GCPROed in one block *will* go away, for
instance; that limit irks me. (Of course their replacements will do the
same thing, and will probably be the same code, just emitted already
expanded.)
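For anyone who hasn't stared at lisp.h recently, the mechanism under
discussion is roughly the following. This is a simplified sketch, not the
literal XEmacs macros: the names GCPRO1, UNGCPRO, struct gcpro and
gcprolist follow the real ones, but the bodies here are cut down for
illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long Lisp_Object;            /* stand-in for the real tagged type */
#define Qnil ((Lisp_Object) 0)

/* One record per GCPROn use, chained through the C stack. */
struct gcpro {
    struct gcpro *next;
    Lisp_Object *var;                /* address of the protected stack slot */
    int nvars;                       /* number of consecutive slots */
};

static struct gcpro *gcprolist;      /* innermost active record */

/* The real macros come in fixed arities (GCPRO1, GCPRO2, ...) and rely on
   conventionally named locals (gcpro1, gcpro2, ...) -- that is the
   per-block limit complained about above. */
#define GCPRO1(a)                                               \
    (gcpro1.next = gcprolist, gcpro1.var = &(a),                \
     gcpro1.nvars = 1, gcprolist = &gcpro1)
#define UNGCPRO (gcprolist = gcpro1.next)

/* At GC time the collector walks gcprolist and marks every protected
   slot; here we merely count the slots to show the traversal. */
static int count_protected(void) {
    int n = 0;
    for (struct gcpro *g = gcprolist; g != NULL; g = g->next)
        n += g->nvars;
    return n;
}

/* Typical mutator-side usage. */
static int demo(void) {
    Lisp_Object tem = Qnil;
    struct gcpro gcpro1;
    GCPRO1 (tem);                    /* tem is now visible as a GC root */
    int live = count_protected();
    UNGCPRO;                         /* must be popped before returning */
    return live;
}
```

The fixed arities, and the reliance on locals with blessed names, are
exactly the parts a preprocessor emitting already-expanded protection code
can dispense with; the linked-list discipline itself stays.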
> (this example could of course be optimized - but this is the obvious
> translation)
... and probably the one I'd apply.
> (this example also shows how incredibly ugly gcpros are)
A good reason to automate away their creation then, so no humans have to
look at them :)
> The current source code does a lot fewer GCPROs than the above,
> because of detailed knowledge of various functions (can they gc, or
This is a very bad idea, I think. It means that much of the XEmacs
source code has detailed knowledge of the internal implementation
details of other parts of the source code; this reduces stability ('cos
some *will* be missed) and drastically increases the difficulty of
changing anything.
I think this should be handled in a similar way to the way GCC works out
what registers are going to be used in subfunctions; assume they'll all
be clobbered. The expenditure of time is minimal, in any case.
(If the time wastage gets extreme in some hot spots, we can optimize the
hot spots; but in general I think that every stack slot that holds a
Lisp_Object should be protected. XEmacs isn't GCPRO-bound, it's GC-bound
;) )
> could they return a fresh object). It is possible to determine this
> kind of information from a global analysis of the source code, but it
I'm not even going to *try* to do that. It's probably impossible in the
general case, anyway (I'd need to think about it, but I'd be surprised
if that kind of detail of global analysis didn't reduce to the halting
problem.)
> won't be easy. In particular, the C preprocessor will tend to make
> the source more resistant to automated understanding by your gcpro
> preprocessor. E.g.
I'm going to stitch this in after the preprocessor has run on a given
source file. (I'd have to preprocess it anyway, so it seems sensible to
only preprocess it once!)
Oh, FWIW the preprocessor should be able to handle just about anything,
'cos I'm cheating. I'm taking c-parse.y and cpplex.c from GCC, and
tearing them down into something that can unambiguously spot C blocks
(`stmts_and_decls'), assignments (`expr_no_commas'), Lisp_Object
variable declarations (`decl' nodes), and temporaries. Everything else
that I can throw away I will; we don't need a complete C parser, let
alone the Objective C parts (sorry, Ovidiu ;) ).
In one respect it'll be more complicated than GCC's parser; it needs to
maintain knowledge of where in the pre-lexed sources a given lexical
component comes from, so we can rewrite the temporaries; but it
shouldn't be terribly hard. (I think GCC also contains some
temporary-rewriting stuff I can pinch; building a temporary is an
RTL-level equivalent of what this preprocessor would have to do at the
source code level.)
Fun. (But if this were C++ it would not be fun, it would be torment...)
> Sorry, I meant there's little scope for a Boehm mark() to be better
> than the existing mark(), because the existing one, via lisp object
> type mark methods, already knows exactly which object components might
> contain other Lisp_Objects. The mark bitmap won't help you much
> during the mark phase; in fact it ought to make things worse.
Hmm. Why is that? The current garbage collector has appalling VM
behaviour: every Lisp_Object is sucked back into memory, even those that
cannot contain other Lisp_Objects. With a mark bitmap, a Lisp_Object is
only accessed for pointer tracing; if it cannot contain another
Lisp_Object, it won't be touched at all.
I'll admit that I haven't instrumented the existing gc to find out what
percentage of objects are accessed only to set the mark bit; I'll do so
as soon as this cold has worn off. (I'll feel a right idiot if the
answer is something like 1%, too...)
> The place where the Boehm gc shines performance-wise is not during gc
> at all, but in the fact that gcpros can be dispensed with entirely
> while the mutator runs. Normally the advantage of mark&sweep over
> reference counting is: not slowing down the mutator. But gcproing does
> slow down the mutator. But I really don't know how much overhead we
> have from GCPROing, and it's hard to measure. My random guess is that
> 10% of the runtime of lisp function calls that do no work (i.e. return
> immediately) is taken up with gcproing.
However, a (not huge) general slowdown like that is hard for humans to
spot. What humans definitely *do* spot (well, I know I've spotted it,
and so have friends of mine, new XEmacs hands and old) is the total
stoppage we get whenever GCing runs and the world freezes except for the
hammering of the disk :( if XEmacs is totally in swap, it can be frozen
for a minute or more while GC runs.
GCPROs suck, but the only ways we can eliminate them are conservative
collection (not very portable and can leave cruft around, as you said),
some really hefty global analysis (I'll leave that for a braver soul),
or a compiler that knows how to emit typing information (and while we
could probably get an option added to GCC that would emit such
information, I don't think we can decree that XEmacs can only be built
by GCC-3.1 and above...)
Given Moore's Law and the fact that machines with small amounts of
memory are getting steadily harder to find and steadily harder to run
late versions of XEmacs on (and probably don't see many upgrades
anyway), I am happy to *increase* the number of GCPROs substantially, to
ensure that we can freely modify arbitrary functions without triggering
GC hell. After all, GCPROs are really quite efficiently implemented;
it's not as if they call malloc() or something like that. (As a
*structure*, the gcprolist is praiseworthy. It's just the manual way
it's kept updated that's disgusting and restrictive.)
> Nix> -- and, if I can get it to work --- and I damned well will ;) --- a
> Nix> GCPROizing preprocessor, so we can dump *visible* evidence of GCPRO
> Nix> (that being what does the harm; we can forget it exists completely
> Nix> if it is automatically maintained!)
>
> Harder than it looks, as I try to point out above. I'll be very
> impressed if you can produce a reliable GCPROizing preprocessor.
We have the advantage of a pre-existing free software project that is
quite capable of understanding C code (GCC), and which does far more
elaborate transformations than this preprocessor will have to; and I'm
happy to nick code from it. (I'll probably mention such large-scale
borrowing on the GCC list if I do it, if Ovidiu doesn't beat me to
it...)
The GCC C parser is actually quite neat, as such things go...
> I think you'll have to run the source through the C preprocessor
> first, which means your gcproizer has to run every build. Using the
Yes.
> standard Unix utilities, this will be hard (i.e. you'll want to use
> something like perl or python). Most free software projects don't
Not a chance; this'll be in C. Parsing is what yacc is good for, and as
usual we'd ship the results of running yacc on the preprocessor's .y
files, so the builders won't need yacc.
(Plus, GCC is already written in C, and while I'm willing to attack
c-parse.y and cpplex with a stone axe to form the core of this
preprocessor, I'm really not willing to translate them into Perl; I have
more taste than that ;) )
> have such dependencies for non-maintainer-mode, but I think the time
> might be right to introduce such a dependency for xemacs.
No need. (I wouldn't want that dependency for purely personal reasons;
one of the sites I run xemacs on has no perl or python, and I can't
install either because I just barely have the space for xemacs. I know
where *my* priorities lie ;) )
> ObAdvice: Have you read the internals manual section that deals with
> gc?
Yes, of course, long before I mentioned this plan here, when I harboured
the secret desire that the fix to XEmacs's GC performance would be
simple. (Then I read of GCPRO and felt quite ill...)
--
`LARTing lusers is supposed to be satisfying. This is just tedious. The
silly shite I'm doing now is like trying to toothpick to death a Black
Knight made of jelly.' --- RDD