On Fri, Oct 08, 1999 at 09:04:15PM -0700, Ben Wing wrote:
> Before you get too much farther, is it possible that you could write up a
> basic document describing the design of your portable dumper (for eventual
> inclusion in the XEmacs Internals Manual)?
Ok, lemme write it. Martin and others, please proof-read heavily.
I'm french, after all, english's not my native language, and it's 3 in
the morning ;-)
I wouldn't yell about someone moving things around/rewriting parts if
he fells it makes the document better either.
I've never gotten around learning the texinfo tags, sorry. Questions,
comments, whatever, welcome, of course.
OG.
1. What is dumping and its justification
The C code of XEmacs is only a lisp engine with a lot of built-in
primitives useful to write an editor. The editor itself is written in
Lisp, and represents around 100K lines of code. Loading and executing
the initialisations of all this codes takes a bit a time (five to ten
times the usual startup time of current xemacs) and requires to have
all the lisp source files around. Having to reload them each time the
editor is stardted would not be acceptable.
The traditional solution to this problem is called dumping: the
building process first creates the lisp engine under the name temacs,
then runs it until it has finished loading and initialising all the
lisp part, and eventually creates a new executable called xemacs
including both temacs and all the contents of the memory after the
initialisation.
This solution, while working, has a huge problem: the creation of the
new executable from the actual contents of the memory is a extremely
system-specific process, quite error-prone, and which interferes with
a lot of system librairies (like malloc). It is even getting worse
nowadays with librairies using constructors which are automatically
called when the program is started (even before main()) which tend to
crash when they are called multiple times, once before dumping and
once after (IRIX 6.x libz.so pulls in some C++ image librairies thru
dependencies which have this problem). Writing the dumper is also one
of the most difficult parts of porting XEmacs to a new operating
system.
The aim of the portable dumper is to solve the same problem than the
system-specific dumper, that is the be able to reload quickly and
using a small number of files the fully initialized lisp part of the
editor, without any system-specific hacks.
2. The process
2.1. Overview
The portable dumping system has to:
- at dump time, write all initialized, non quickly rebuildable data to
a file [Note: currently xemacs.dmp, but the name will change], along
with all informations needed for the reloading.
- when starting xemacs, reload the dump file, relocate it to its new
starting address if needed, and reinitialise all pointers to this
data. Also, rebuild all the quickly rebuildable data.
2.2. Data descriptions
The more complex task of the dumper is to be able to write lisp
objects (lrecords) and C structs to disk and reload them at a
different address, updating all the pointers they include in the
process. This is done by using external data descriptions that give
information about the layout of the structures in memory.
The specification of these descriptions is in lrecord.h. A
description for a lrecord is an array of struct lrecord_description.
Each of these structs include a type, an offset in the structure and
some optional parameters depending on the type. For instance, here is
the string description:
static const struct lrecord_description string_description[] = {
{ XD_LISP_OBJECT, offsetof(Lisp_String, plist), 1 },
{ XD_OPAQUE_DATA_PTR, offsetof(Lisp_String, data), XD_INDIRECT(2, 1) },
{ XD_BYTECOUNT, offsetof(Lisp_String, size) },
{ XD_END }
};
The first line means "there is a Lisp_Object at plist in the
Lisp_String structure". The second means "there is a pointer to some
opaque data in the field 'data'". The length of said data is given by
the expression XD_INDIRECT(2, 1), which means "the value in the 3rd
line of the description (line number 2 starting counting at 0, welcome
to C) plus one". The third line indicates a member of type Bytecount,
which is used by the previous indirect directive. XD_END then ends
the description.
This gives all the information we need to move around what is pointed
by a structure (C or lrecord) and, by transitivity, everything that
points to it. The only missing information for dumping is the size of
the structure. For lrecords, this is part of the
lrecord_implementation, hence we do not need to duplicate it. For C
structures we use a struct struct_description, which includes a size
field and a pointer to an associated array of lrecord_description.
2.3. Dumping phase
2.3.1. Start
The dumping is done by calling the function pdump() (in alloc.c) which
is done by Fdump_emacs (in emacs.c). This function does a number of
tasks.
2.3.2. Object inventory
The first task is to build the list of the objects to dump. This
includes:
- lisp objects
- C structures
We end up with one pdump_entry_list_elmt per object group (arrays of C
structs are kept together) which includes a pointer to the first
object of the group, the per-object size and the count of objects in
the group, along with some other informations which are initialised
later.
These entries are linked together in pdump_entry_list structures and
can be enumerated thru either:
- the pdump_object_table, an array of pdump_entry_list, one per
lrecord type, indexed by type number.
- the pdump_opaque_data_list, used for the opaque data which does not
include pointers, and hence does not need descriptions.
- the pdump_struct_table, which is a vector of
struct_description/pdump_entry_list pairs, used for non-opaque C
structures.
This uses a marking strategy similar to the garbage collector. Some
differences though:
- we do not use the mark bit (which does not exist for C structures
anyway), we use a big hash table instead.
- we do not use the mark function of lrecords but instead rely on the
external descriptions. This happens essentially because we need to
follow pointers to C structures and opaque data in addition to
Lisp_Object members.
This is done by pdump_register_object, which handles Lisp_Object
variables, and pdump_register_struct which handles C structures, which
both delegate the description management to pdump_register_sub.
The hashtable doubles as a map object to pdump_entry_list_elmt (i.e,
allows to lookup a pdump_entry_list_elmt with the object it points
to). Entries are added with pdump_add_entry and looked up with
pdump_get_entry. There is no need for entry removal. The hash value
is computed quite basically from the object pointer by
pdump_make_hash.
The roots for the marking are:
- the staticpro'ed variables (there is a special staticpro_nodump call
for protected variables we do not want to dump).
- the pdump_wire'd variables (staticpro is equivalent to
staticpro_nodump + pdump_wire).
- the dumpstruct'ed variables, which points to C structures.
This does not include the GCPRO'ed variables, the specbinds, the
catchtags, the backlist, the redisplay or the profiling info, since we
do not want to rebuild the actual chain of lisp calls which end up to
the dump-emacs call, only the global variables.
Weak lists and weak hash tables are dumped as if they were their
non-weak equivalent (without changing their type, of course). This
has not yet been a problem.
2.3.3. Address allocation
The next step is to allocate the offsets of each of the objects in the
final dump file. This is done by pdump_allocate_offset which is
called indirectly by pdump_scan_by_alignement.
The strategy to deal with alignment problems uses these facts:
- real world alignment requirements are powers of two.
- the C compiler is required to ajust the sizeof of a struct so that
you can have an array of them next to each other. This means you
can have a upper bound of the alignment requirements of a given
structure by looking at which power of two its size is a multiple.
- the non-variant part of variable size lrecords has an alignement
requirement of 4.
Hence, for each lrecord type, C struct type or opaque data block the
alignment requirement is computed as a power of two, with a minimum of
2^2 for lrecords. pdump_scan_by_alignement then scans all the
pdump_entry_list_elmt the ones with the highest requirements first.
This ensures the best compacity.
The maximum alignment requirement we take into account is 2^8.
pdump_allocate_offset only has to do a linear allocation, starting at
offset 256 (this leaves room for the header and keep the alignments
happy).
2.3.4. The header
The next step creates the file and writes a header with a signature
and some random informations in it (number of staticpro, number of
assigned lrecord types, etc...). The reloc_address field, which
indicates at which address the file should be loaded if we want to
avoid post-reload relocation, is set to 0. It then seeks to offset
256 (base offset for the objects).
2.3.5. Data dumping
The data is dumped in the same order the address were allocated by
pdump_dump_data called from pdump_scan_by_alignement. This function
copies the data to a temporary buffer, relocates all pointers in the
object to the addresses allocated in step 2.3.3, and writes it to the
file. Using the same order means that, if we are careful with
lrecords whose size is not a multiple of 4, we are ensured that the
object is always written at the offset in the file allocated in step
2.3.3.
2.3.6. Pointers dumping
A bunch of tables needed to reassign properly the global pointers are
then written. They are:
- the staticpro array
- the dumpstruct array
- the lrecord_implementation table array
- a vector of all the offsets to the objects in the file that include
a description (for faster relocation at reload time)
- the pdump_wired and pdump_wired_list arrays
For each of the arrays we write both the pointer to the variables and
the relocated offset of the object they point to. Since these
variables are global the pointers are still valid when restarting the
program and are used to regenerate the global pointers.
The pdump_wired_list array is a special case. The variable it points
to are the head of weak linked lists of lisp objects of the same type.
Not all objects of this list are dumped so the relocated pointer we
associate with them points to the first dumped object of the list, or
Qnil if none is available. This is also the reason why they are not
used as roots for the purpose of object enumeration.
This is the end of the dumping part.
2.4. Reloading phase
2.4.1. File loading
The file is mmap'ed in memory (which ensures a PAGESIZE alignment, at
least 4096), or if mmap is unavailable or fails a 256-bytes aligned
malloc is done and the file is loaded.
Some variables are reinitialized from the values found in the header.
The difference between the actual loading address and the
reloc_address is computed and will be used for all the relocations.
2.4.2. Putting back the staticvec
The staticvec array is memcpy'd from the file and the variables it
points to are reset to the relocated objects adresses.
2.4.3. Putting back the dumpstructed variables
The variables pointed to by dumpstruct in the dump phase are reset to
the right relocated object addresses.
2.4.4. lrecord_implementations_table
The lrecord_implementations_table is reset to its dump time state and
the right lrecord_type_index values are put in.
2.4.5. Object relocation
All the objects are relocated using their description are their offset
by pdump_reloc_one. This step is unnecessary if the reloc_address is
equal to the file loading address.
2.4.6. Puttin back the pdump_wire and pdump_wire_list variables
Same as 2.4.3.
2.4.7. Reorganize the hash tables
Since some of the hash values in the lisp hash tables are
address-dependant, their layout is now wrong. So we go through each
of them and have them resorted by calling reorganize_hash_table.
3. Remaining issues
The build process will have to start a post-dump xemacs, ask it the
loading address (which will, hopefully, be always the same between
different xemacs invocations) and relocate the file to the new
address. This way the object relocation phase will not have to be
done, which means no writes in the objects and that, because of the
use of mmap, the dumped data will be shared between all the xemacs
running on the computer.
Some executable signature will be necessary to ensure that a given
dump file is really associated with a given executable, or random
crashed will occur. Maybe a random number set at compile or configure
time thru a define. This will also allow for having
differently-compiled xemacsen on the same system (mule and no-mule
comes to mind).
The DOC file contents should probably end up in the dump file.