"Stephen J. Turnbull" <stephen(a)xemacs.org> writes:
2. poorly localized Lisp data means the first GC puts all of the
Lisp data into "touched" category and it can't be shared any more
(thus I have heard from Hrvoje Niksic, I am not an expert)
To expand: the mark&sweep GC "marks" each reachable object by changing
a bit in the object header. This is always a problem on systems with
virtual memory, because it causes all of XEmacs to be retrieved from
swap, even the parts that have not and will not be needed for a long
time.[1]
It used to be that the "pure" space (consisting of the objects
"dumped" with XEmacs for use at run-time) was exempt from this by
virtue of never being garbage collected. Additionally, the pure
objects were placed in the text segment of the executable, so that on
a multi-user system each copy of XEmacs truly shared the dumped data.
With pdump, the formerly pure data gets mmap'ed privately, with the
flag that if anything gets changed, a local copy is made ("copy on
write"). This is still shared among processes -- up until the first
garbage collection, when every page that contained a marked Lisp
object gets copied because the object's mark bit gets flipped. In
practice, it means that the first GC "unshares" most of the dumped
Lisp data.
I would love to be wrong about this, so corrections are very welcome.
It's the default on several platforms, mostly because it was a cheap
port.
I don't think its being the default on those platforms is a
particularly bad thing. A memory-greedy XEmacs is better than no
XEmacs, memory is cheap, and it's not like multi-user systems are so
popular today.
Still, I'd like people not to forget that the portable dumper is not
yet finished, and that using it unconditionally has the potential to
cause problems. Multi-user environments are still present
occasionally (universities, company development servers, larger
systems).
Footnotes:
[1]
This problem is not unique to XEmacs. Someone once complained on a
Python newsgroup that his Python process initialized a huge dictionary
and forked. The forked process was only reading from the dictionary
and serving incoming requests, and yet it grew in memory. The reason
for that was that each access to a dictionary object increased (and
later decreased) its refcount. After a while, all of the objects got
copied to separate memory, despite the child process never changing a
thing.