Ben Wing wrote:
unicode-internal now gets most of the way through the build process;
currently, it gets most of the way through autoload re-creation before
hitting a syntax error.
but in the process i've noticed that kkcc is *unbelievably* slow with
the new char tables i've designed. or at least i assume that's where
the slowness is; this is the only major change i've made to any
objects. without kkcc, gc is fast, but with it, gc time rapidly balloons,
to the point where it takes 2 *minutes* or more to do a gc once all
modules are loaded.
the format of the char tables is something like a trie. a typical
char table for unicode has three levels. the top level is an array of
256 elements indexing bits 23-16 of a unicode char; each element
points to another array of 256 elements, indexing bits 15-8; each
element of the second level also points to an array of 256 elements,
for bits 7-0, whose values are Lisp objects. there will be a fourth
level on top of the other three if a unicode char of 0x1000000 or
greater is seen. in order to avoid a memory explosion, each place at
which there are no defined elements points to a shared "blank" table.
hence, initially the char table contains a single 256-element array,
each of whose elements points to the same empty table. when a value
for a particular character is added, the tables are expanded along
that path.
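a minimal sketch of the three-level layout described above, in plain C. all names here (lisp_object, leaf_table, mid_table, char_table_new, char_table_get, char_table_put) are hypothetical and not taken from the actual source; the optional fourth level and all gc integration are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef void *lisp_object;              /* stand-in for a Lisp_Object */

typedef struct { lisp_object val[256]; } leaf_table;  /* bits 7-0 */
typedef struct { leaf_table *sub[256]; } mid_table;   /* bits 15-8 */
typedef struct { mid_table *sub[256]; } char_table;   /* bits 23-16 */

/* shared "blank" tables; static storage, so blank_leaf holds only NULLs */
static leaf_table blank_leaf;
static mid_table blank_mid;

static void *xalloc(size_t n)
{
  void *p = malloc(n);
  if (!p) abort();
  return p;
}

/* a fresh table: one 256-element array, every slot sharing blank_mid */
char_table *char_table_new(void)
{
  char_table *ct = xalloc(sizeof *ct);
  for (int i = 0; i < 256; i++) {
    blank_mid.sub[i] = &blank_leaf;     /* idempotent setup of the blank */
    ct->sub[i] = &blank_mid;
  }
  return ct;
}

/* lookup never branches: blank tables simply yield NULL */
lisp_object char_table_get(const char_table *ct, uint32_t ch)
{
  assert(ch < 0x1000000);               /* no fourth level in this sketch */
  return ct->sub[(ch >> 16) & 0xFF]->sub[(ch >> 8) & 0xFF]->val[ch & 0xFF];
}

/* storing a value expands only the tables along that char's path */
void char_table_put(char_table *ct, uint32_t ch, lisp_object v)
{
  assert(ch < 0x1000000);
  mid_table **m = &ct->sub[(ch >> 16) & 0xFF];
  if (*m == &blank_mid) {
    *m = xalloc(sizeof **m);
    for (int i = 0; i < 256; i++) (*m)->sub[i] = &blank_leaf;
  }
  leaf_table **l = &(*m)->sub[(ch >> 8) & 0xFF];
  if (*l == &blank_leaf) {
    *l = xalloc(sizeof **l);
    for (int i = 0; i < 256; i++) (*l)->val[i] = NULL;
  }
  (*l)->val[ch & 0xFF] = v;
}
```

after one char_table_put, only one second-level and one third-level table are private; the other 255 top-level slots still point at the same blank table.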
because the mark method is actual code, i can put logic into it to check
to see whether a blank table is being traversed and stop traversing at
that point. the problem seems to be that kkcc can't be told this, and
isn't smart enough (at least i don't think so) to recognize whether
it's already traversed something unless it's a Lisp object -- and the
subtables aren't Lisp objects. i could make them Lisp objects, but
this means either [1] i integrate the header and the table, leading to
an object whose size is slightly over a power of 2 and hence its
allocation is difficult to manage efficiently (i assume, at least --
marcus, how does mc-alloc handle this case?); or [2] i separate them
into two objects, which doubles the number of memory accesses to look
up a character.
which is the lesser of two evils?
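a toy model of why the blank-table check matters, under the assumption (as guessed above, not verified) that a traversal which cannot recognize the shared blank tables ends up walking them once per pointer. this is not kkcc's actual code; walk and init_empty are hypothetical names, and the function only counts the leaf slots a traversal would visit.

```c
#include <stddef.h>

typedef struct { void *val[256]; } leaf_table;
typedef struct { leaf_table *sub[256]; } mid_table;
typedef struct { mid_table *sub[256]; } top_table;

/* shared blank tables, as in the char table design */
static leaf_table blank_leaf;
static mid_table blank_mid;

/* an empty char table: every path leads to the shared blanks */
void init_empty(top_table *t)
{
  for (int i = 0; i < 256; i++) {
    blank_mid.sub[i] = &blank_leaf;
    t->sub[i] = &blank_mid;
  }
}

/* count leaf slots visited; skip_blank models a hand-written mark
   method that stops as soon as it sees a blank table */
unsigned long walk(const top_table *t, int skip_blank)
{
  unsigned long visited = 0;
  for (int i = 0; i < 256; i++) {
    const mid_table *m = t->sub[i];
    if (skip_blank && m == &blank_mid)
      continue;
    for (int j = 0; j < 256; j++) {
      const leaf_table *l = m->sub[j];
      if (skip_blank && l == &blank_leaf)
        continue;
      visited += 256;                   /* would mark l->val[0..255] here */
    }
  }
  return visited;
}
```

for a completely empty table, the skipping walk visits nothing, while the blind walk visits 256 * 256 * 256 = 16777216 slots, and that is per char table.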
i guess i answered my own question, after marcus's posting about
incremental gc. for compatibility with the new-gc, option [1] is the only
possibility. marcus, how will mc-alloc handle objects of size 1028?
will it just shove 3 of them into a 4096-byte page and have wasted space
in the rest? (c. 24.7% waste) or is it able to use the rest of the
space for other objects? even in the former case, the waste might not
be significant; depends on how many char tables there are.
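the waste figure above can be checked with a few lines of C. this only models the "former case" (whole objects packed into a page, remainder unused); it says nothing about what mc-alloc really does. the helper names are hypothetical, and the reading of 1028 as 256 4-byte pointers plus a 4-byte header is an assumption, since only the total is given.

```c
#include <stddef.h>

/* how many whole objects of obj_size fit into one page (obj_size > 0) */
size_t objects_per_page(size_t page_size, size_t obj_size)
{
  return page_size / obj_size;
}

/* bytes left unused at the end of the page under that packing */
size_t wasted_bytes(size_t page_size, size_t obj_size)
{
  return page_size - objects_per_page(page_size, obj_size) * obj_size;
}
```

with a 4096-byte page and 1028-byte objects this gives 3 objects and 1012 wasted bytes, i.e. 1012/4096, or about 24.7%.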
ben