Abstract: Currently, during the build stage of XEmacs, a bare version of the program (called temacs) is run, which loads up a bunch of Lisp data and then writes out a modified executable file. This process is very tricky to implement and highly system-dependent. It can be replaced by a simple, mostly portable, and easy to implement scheme where the Lisp data is written out to a separate data file.
The scheme makes only three assumptions about the memory layout of a running XEmacs process, which, as far as I know, are met by all current implementations of XEmacs (and they're also requirements of the existing unexec scheme):
Assumption number three means that this scheme is non-relocatable, which is a disadvantage as compared to other, relocatable schemes that have been proposed. However, the advantage of this scheme over them is that it is much easier to implement and requires minimal changes to the XEmacs code base.
First, let's go over the theory behind the dumping mechanism. The principles that we would like to follow are:
XEmacs, of course, is already set up to adhere to most of these principles.
In fact, the current dumping process that we are replacing handles a few of these principles slightly differently and adds a few of its own:
The difficult part in this process is figuring out where our data structures lie in memory so that we can correctly write them out and read them back in. The trick that makes this problem solvable is to ensure that the heap used for all data structures dynamically allocated during the dumping process is located inside a large, statically declared array. This ensures that all of our own data structures are contained (at least at the time that we dump out our data) inside the static initialized and uninitialized data segments. These segments are physically separated in memory from any data managed by system libraries, and their starting and ending points are known and unchanging. (We know that all of these things are true because we require them as preconditions for using this method of dumping.)
In order to implement this method of heap allocation, we change the
memory allocation function that we use for our own data. (It's
extremely important that this function not be used to allocate system
data. This means that we must not redefine the malloc function using
the linker, but instead we need to achieve this using the C
preprocessor, or by simply using a different name, such as xmalloc.
It's also very important that we use the correct free function when
freeing dynamically-allocated data, depending on whether this data was
allocated by us or by the system. If we don't keep this straight, we
are likely to corrupt memory and cause XEmacs to crash.) What our own
memory allocation function does is, depending on the circumstances,
either call our own memory allocation subfunction (probably based on
the routines in gmalloc.c), which allocates memory out of a virtual
heap that we have set up using a large statically-declared array, or
simply call the standard malloc function to do the memory allocation.
Similarly, the free function that we use either calls our own free
subfunction or calls the standard one. (In this case, it's clear which
of the two subfunctions we use. We just look at the pointer that was
given to us, and see if it's within our large static array or not.)
The rules governing which of the two allocation subfunctions is used
are as follows:
malloc function from then on. (Alternatively, we could always call our
own allocation subfunction and then call the standard one whenever our
own one fails. This would use memory more efficiently, but would be
slower. Another alternative that avoids this trade-off but constricts
the choice of allocation methods that we can use is to scrap this
two-mode allocation scheme entirely and simply provide an allocation
function that can cope with having its heap be in two non-contiguous
areas of memory. I think that the routines in gmalloc.c can deal with
this, for example.)
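The two-mode allocation idea can be illustrated with a minimal C sketch. It is not the proposed implementation: it assumes the simplest switching rule (allocate from the static array until a request does not fit, then use the standard malloc from then on), uses a trivial bump allocator rather than the gmalloc.c routines, and the names STATIC_HEAP_SIZE, static_heap, in_static_heap, and xfree are hypothetical. It requires a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical size; a real build would size this to hold all dumped data. */
#define STATIC_HEAP_SIZE (64 * 1024)

/* The large statically declared array that serves as our private heap. */
static _Alignas(max_align_t) unsigned char static_heap[STATIC_HEAP_SIZE];
static size_t static_heap_used = 0;  /* bytes handed out so far */
static int static_heap_full = 0;     /* set once we switch to malloc for good */

/* Return nonzero if P points into the static heap array. */
static int in_static_heap(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)static_heap;
    return a >= lo && a < lo + STATIC_HEAP_SIZE;
}

/* Allocate our own data: static heap first, standard malloc afterwards. */
void *xmalloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    if (size == 0)
        size = 1;  /* every block gets a distinct address inside the array */

    if (!static_heap_full) {
        size_t rounded = (size + align - 1) & ~(align - 1);
        /* rounded >= size guards against overflow in the rounding above. */
        if (rounded >= size && rounded <= STATIC_HEAP_SIZE - static_heap_used) {
            void *p = static_heap + static_heap_used;
            static_heap_used += rounded;
            assert(static_heap_used <= STATIC_HEAP_SIZE);
            return p;
        }
        static_heap_full = 1;  /* assumed rule: use malloc from then on */
    }
    return malloc(size);
}

/* Free our own data, choosing the subfunction by looking at the pointer. */
void xfree(void *p)
{
    if (p == NULL)
        return;
    if (in_static_heap(p))
        return;  /* this bump allocator never reclaims; a real one would */
    free(p);
}
```

Because xfree decides by address range alone, callers never need to remember which subfunction produced a block. What they must still keep straight is the separation from system data: memory obtained from a system library should be released with the standard free, not routed through xmalloc.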
When it's time to dump out our data, we don't have to do anything
complicated involving creating a new executable file like we do
currently. All we have to do is write out the data contained in our
uninitialized and initialized data segments to a data file. At the
beginning of main, the first thing we do is check to see whether we
are running as temacs or as xemacs. If we are running as xemacs, then
the first thing we do is locate our data file, which should probably
be named xemacs.dat, and be located in the same directory as the
XEmacs executable. Then we load in the data from this data file,
overwriting our initialized and uninitialized data segments, and
continue with XEmacs as normal. (There is no danger in overwriting
things like this because this is the first or almost the very first
thing that we do, and we're not going to be overwriting any system
data that might have been created or initialized before main was
called. We have to be careful, however, with the small number of
variables that we initialized in the process of determining whether we
should load our data file and then loading this data file.)
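The write-out and read-back steps can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names dump_region and load_region are hypothetical, and the caller supplies the start address and length of the region, because finding the real boundaries of the data segments is system-dependent and is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write LEN bytes starting at START to the file PATH.
   Returns 0 on success, -1 on failure. */
int dump_region(const char *path, const void *start, size_t len)
{
    assert(path != NULL && start != NULL);

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(start, 1, len, f);
    int close_failed = fclose(f);  /* fclose flushes; its result matters */
    return (written == len && close_failed == 0) ? 0 : -1;
}

/* Read LEN bytes from the file PATH back to the same address START,
   overwriting whatever is there. Returns 0 on success, -1 on failure. */
int load_region(const char *path, void *start, size_t len)
{
    assert(path != NULL && start != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(start, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

The scheme works only because the data is loaded back at the address it was dumped from: any pointer stored inside the region that points to another location inside the region is still valid after load_region returns. This is also why the scheme is non-relocatable.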
I think that the way we determine whether we are running as temacs or
xemacs is:
temacs.
temacs.
-no-data-file, we are running as temacs.
In all of the other circumstances, we load the data file normally and
proceed as if this were a normal xemacs invocation.
We can do a further optimization because of the clever way that
XEmacs arranges to never write to any variables that exist in the
initialized static data segment after the dump phase. When we read in
the initialized data segment, instead of reading it in normally using
the read system call, we use mmap if it is available. In the call to
mmap, we specify the start of the initialized data segment as the
first argument, and then we specify the flags MAP_FIXED and
MAP_SHARED. Provided that we also request read-only protection
(PROT_READ), the initialized data segment will be read-only and shared
among all XEmacs processes on the same machine. Note that MAP_FIXED
generally requires the target address and the file offset to be
page-aligned. (When reading in the uninitialized data segment, we
should probably do a similar thing involving mmap, but use the
MAP_PRIVATE flag instead of MAP_SHARED so that this data segment
essentially becomes copy-on-write.) Memory mapping like this can also
be done on Windows; the function is different from mmap, but as far as
I know the semantics are equivalent.