profiling XEmacs, esp. on the C level
crestani at informatik.uni-tuebingen.de
Thu Nov 24 20:23:31 EST 2005
Thanks for your patch, Jan. A few comments now, I'll take a closer
>>>>>"JR" == Jan Rychter <jan at rychter.com> writes:
JR> 3. The gc_checking_assert in get_mark_bit really hurts. I'd suggest that
JR> it be removed as soon as we know it doesn't get hit under normal
It should stay in for development builds. When you profile, it would
be much better to recompile XEmacs with `--disable-error-checking'.
This removes a lot of time overhead (it also removes the costly
gc_checking_asserts) and brings better results.
How big is the performance benefit of your patch in a
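To illustrate why a non-checking build removes this overhead, here is a
minimal sketch of a checking assert that compiles to nothing unless a
build-time switch is defined. All names (SKETCH_ERROR_CHECKING,
sketch_checking_assert, sketch_get_mark_bit, the 64-cell bitmap) are
hypothetical and chosen for illustration; this is not the actual XEmacs
definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for a build-time error-checking option. */
#ifdef SKETCH_ERROR_CHECKING
# define sketch_checking_assert(cond) assert(cond)
#else
# define sketch_checking_assert(cond) ((void) 0)  /* no code is generated */
#endif

#define SKETCH_HEAP_CELLS 64

/* One mark bit per cell, packed eight to a byte (illustrative layout). */
static unsigned char sketch_mark_bits[SKETCH_HEAP_CELLS / 8];

static int sketch_get_mark_bit(size_t cell)
{
    /* In a checking build this costs a compare and branch on every call. */
    sketch_checking_assert(cell < SKETCH_HEAP_CELLS);
    return (sketch_mark_bits[cell / 8] >> (cell % 8)) & 1;
}

static void sketch_set_mark_bit(size_t cell)
{
    sketch_checking_assert(cell < SKETCH_HEAP_CELLS);
    sketch_mark_bits[cell / 8] |= (unsigned char) (1u << (cell % 8));
}
```

With the switch undefined, the accessor reduces to a shift and a mask,
which is what you want to measure when profiling the mark phase.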
JR> It will only work for one mark bit per object. Do you actually
JR> plan to use more? We will likely lose a lot of the efficiency if
JR> we conditionalize this.
The incremental garbage collector needs two mark bits, so we have to
shift the focus to finding optimizations for the two-bit case.
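As a rough sketch of what the two-bit case could look like, the
following packs two mark bits per cell into 32-bit words. The color
names assume a tri-color (white/grey/black) encoding, which is an
assumption on my side for illustration; the names, the 64-cell size and
the word layout are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed tri-color encoding; not taken from the XEmacs sources. */
enum sketch_color { SKETCH_WHITE = 0, SKETCH_GREY = 1, SKETCH_BLACK = 2 };

#define SKETCH_CELLS          64
#define SKETCH_BITS_PER_CELL  2
#define SKETCH_CELLS_PER_WORD (32 / SKETCH_BITS_PER_CELL)

static uint32_t sketch_marks[SKETCH_CELLS / SKETCH_CELLS_PER_WORD];

/* Return the two mark bits of a cell as a value in 0..3. */
static unsigned sketch_get_mark(size_t cell)
{
    assert(cell < SKETCH_CELLS);
    unsigned shift =
        (unsigned) (cell % SKETCH_CELLS_PER_WORD) * SKETCH_BITS_PER_CELL;
    return (unsigned) ((sketch_marks[cell / SKETCH_CELLS_PER_WORD] >> shift) & 3u);
}

/* Overwrite the two mark bits of a cell without touching its neighbours. */
static void sketch_set_mark(size_t cell, unsigned color)
{
    assert(cell < SKETCH_CELLS && color <= 3u);
    unsigned shift =
        (unsigned) (cell % SKETCH_CELLS_PER_WORD) * SKETCH_BITS_PER_CELL;
    uint32_t *word = &sketch_marks[cell / SKETCH_CELLS_PER_WORD];
    *word = (*word & ~((uint32_t) 3u << shift)) | ((uint32_t) color << shift);
}
```

Since the bits per cell are a compile-time constant here, the shift and
mask are fixed, so supporting two bits need not add a runtime
conditional to the accessors.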
JR> After the optimizations, the worst performance loss occurs in
JR> kkcc_marking, and that's because of pipeline stalls due to
JR> mispredicted branches. I don't know what we could do about this,
JR> other than rewrite the switch as a series of conditionals ordered
JR> so that branch prediction can do a better job.
My todo list contains the following: A basic re-design of the object
layout by separating pointer-containing cells from no-pointer opaque
data would speed up the mark phase. This way, the traversal has no
need to parse an object; it can assume that every cell in the pointer
part of the object has to be examined. So no switch is needed at all.
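A minimal sketch of that idea, under my own assumptions about the
layout: each object carries a count of pointer cells, followed by the
pointer cells themselves, followed by opaque data that is never
scanned. The struct, its field names and the fixed cell limit are
hypothetical, and the recursion is only for brevity (a real collector
would use an explicit mark stack).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PTRS 4

struct sketch_obj {
    unsigned char marked;                       /* single mark flag, for brevity */
    size_t n_pointers;                          /* number of live pointer cells */
    struct sketch_obj *cells[SKETCH_MAX_PTRS];  /* pointer part: always scanned */
    unsigned char opaque[16];                   /* no-pointer part: never scanned */
};

/* Mark everything reachable from obj; return how many objects were newly
   marked. The loop has no per-type switch: every cell in the pointer part
   is examined the same way. */
static size_t sketch_mark(struct sketch_obj *obj)
{
    if (obj == NULL || obj->marked)
        return 0;
    obj->marked = 1;
    size_t count = 1;
    for (size_t i = 0; i < obj->n_pointers; i++)
        count += sketch_mark(obj->cells[i]);
    return count;
}
```

The traversal only needs the pointer count from the object header; the
object's type never enters the marking loop.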