changeset: 5294:2cc24c69446c
user: Marcus Crestani <crestani(a)informatik.uni-tuebingen.de>
date: Mon Jul 05 18:17:39 2010 +0200
files: src/ChangeLog src/gc.c src/mc-alloc.c
description:
Document the new allocator and the new garbage collector in gc.c and mc-alloc.c.
diff -r 6466bc9ebf15 -r 2cc24c69446c src/ChangeLog
--- a/src/ChangeLog Wed Jun 23 08:04:18 2010 -0400
+++ b/src/ChangeLog Mon Jul 05 18:17:39 2010 +0200
@@ -1,3 +1,9 @@
+2010-06-05 Marcus Crestani <crestani(a)informatik.uni-tuebingen.de>
+
+ * gc.c:
+ * mc-alloc.c:
+ Document the new allocator and the new garbage collector.
+
2010-06-21 Jeff Sparkes <jsparkes(a)gmail.com>
* console-x-impl.h (DEVICE_X_XFTDRAW): Define, instead of
diff -r 6466bc9ebf15 -r 2cc24c69446c src/gc.c
--- a/src/gc.c Wed Jun 23 08:04:18 2010 -0400
+++ b/src/gc.c Mon Jul 05 18:17:39 2010 +0200
@@ -20,6 +20,318 @@
Boston, MA 02111-1307, USA. */
/* Synched up with: Not in FSF. */
+
+/*
+ Garbage Collectors in XEmacs
+
+ Currently, XEmacs comes with two garbage collectors:
+
+ - The "old garbage collector": a simple mark and sweep collector,
+ its implementation is mainly spread out over gc.c and alloc.c.
+ It is used by the default configuration or if you configure
+ `--with-newgc=no'.
+
+ - The "new garbage collector": an incremental mark and sweep collector,
+ its implementation is in gc.c. It is used if you configure
+ `--with-newgc'. It comes with a new allocator, see mc-alloc.c, and
+ with the KKCC mark algorith, see below.
+
+ Additionally, the old garbage collectors comes with two mark algorithms:
+
+ - The "recursive mark algorithm" marks live objects by recursively
+ calling mark_* functions on live objects. It is the default mark
+ algorithm of the old garbage collector.
+
+ - The "KKCC mark algorithm" uses an explicit stack that to keep
+ track of the current progress of traversal and uses memory layout
+ descriptions (that are also used by the portable dumper) instead
+ of the mark_* functions. The old garbage collector uses it if
+ you configure `--with-kkcc'. It is the default and only mark
+ algorithm of the new garbage collector.
+
+
+ The New Incremental Garbage Collector
+
+ An incremental garbage collector keeps garbage collection pause
+ times short by interleaving small amounts of collection work with
+ program execution, it does that by instrumenting write barrier
+ algorithms that essentially allow interrupting the mark phase.
+
+
+ Write Barrier
+
+ A write barrier is the most important prerequisite for fancy
+ garbage collection techniques. We implement a "Virtual Dirty Bit
+ (short: vdb) Write Barrier" that makes uses of the operating
+ system's memory-protection mechanisms: The write barrier
+ write-protects memory pages containing heap objects. If the
+ mutator tries to modify these objects by writing into the
+ write-protected page, the operating system generates a fault. The
+ write barrier catches this fault, reads out the error-causing
+ address and can thus identify the updated object and page.
+
+ Not all environments and operating systems provide the mechanism to
+ write-protect memory, catch resulting write faults, and read out
+ the faulting address. But luckily, most of today's operating
+ systems provide the features needed for the write-barrier
+ implementation. Currently, XEmacs includes write-barrier
+ implementations for the following platforms:
+
+ - POSIX-compliant platforms like up-to-date UNIX, Linux, Solaris,
+ etc. use the system call `mprotect' for memory protection,
+ `sigaction' for signal handling and get the faulting address from
+ `struct siginfo'. See file vdb-posix.c.
+
+ - Mach-based systems like Mac OS X use "Mach Exception Handlers".
+ See file vdb-mach.c.
+
+ - Windows systems like native Windows and Cygwin use Microsoft's
+ so-called "Structured Exception Handling". See file vdb-win32.c.
+
+ The configure script determines which write barrier implementation
+ to use for a system. If no write barrier implementation is working
+ on that system, a fall-back "fake" implementation is used: This
+ implementation simply turns of the incremental write barrier at
+ runtime and does not allow any incremental collection (see
+ vdb-fake.c). The garbage collector then acts like a traditional
+ mark-and-sweep garbage collector. Generally, the incremental
+ garbage collector can be turned of at runtime by the user or by
+ applications, see below.
+
+
+ Memory Protection and Object Layout
+
+ Implementations of a memory-protection mechanism may restrict the
+ size and the alignment of the memory region to be on page-size
+ boundaries. All objects subject to be covered by the write barrier
+ have to be allocated on logical memory pages, so that they meet the
+ requirement to be write-protected. The new allocator mc-alloc is
+ aware of a system page size---it allocates all Lisp objects on
+ logical memory pages and is therefore defaulted to on when the new
+ garbage collector is enabled.
+
+ Unfortunately, the Lisp object layout that works with the old
+ collector leads to holes in the write barrier: Not all data
+ structures containing pointers to Lisp objects are allocated on the
+ Lisp heap. Some Lisp objects do not carry all their information in
+ the object itself. External parts are kept in separately allocated
+ memory blocks that are not managed by the new Lisp allocator.
+ Examples for these objects are hash tables and dynamic arrays, two
+ objects that can dynamically grow and shrink. The separate memory
+ blocks are not guaranteed to reside on page boundaries, and thus
+ cannot be watched by the write barrier.
+
+ Moreover, the separate parts can contain live pointers to other Lisp
+ objects. These pointers are not covered by the write barrier and
+ modifications by the client during garbage collection do escape. In
+ this case, the client changes the connectivity of the reachability
+ graph behind the collector's back, which eventually leads to
+ erroneous collection of live objects. To solve this problem, I
+ transformed the separately allocated parts to fully qualified Lisp
+ objects that are managed by the allocator and thus are covered by
+ the write barrier. This also removes a lot of special allocation
+ and removal code for the out-sourced parts. Generally, allocating
+ all data structures that contain pointers to Lisp objects on one
+ heap makes the whole memory layout more consistent.
+
+
+ Debugging
+
+ The virtual-dirty-bit write barrier provokes signals on purpose,
+ namely SIGSEGV and SIGBUS. When debugging XEmacs with this write
+ barrier running, the debugger always breaks whenever a signal
+ occurs. This behavior is generally desired: A debugger has to break
+ on signals, to allow the user to examine the cause of the
+ signal---especially for illegal memory access, which is a common
+ programming error. But the debugger should not break for signals
+ caused by the write barrier. Therefore, most debuggers provide the
+ ability to turn of their fault handling for specific signals. The
+ configure script generates the debugger's settings .gdbinit and
+ .dbxrc, adding code to turn of signal handling for SIGSEGV and
+ SIGBUS, if the new garbage collector is used.
+
+ But what happens if a bug in XEmacs causes an illegal memory access?
+ To maintain basic debugging abilities, we use another signal: First,
+ the write-barrier signal handler has to determine if the current
+ error situation is caused by the write-barrier memory protection or
+ not. Therefore, the signal handler checks if the faulting address
+ has been write-protected before. If it has not, the fault is caused
+ by a bug; the debugger has to break in this situation. To achieve
+ this, the signal handler raises SIGABRT to abort the program. Since
+ SIGABRT is not masked out by the debugger, XEmacs aborts and allows
+ the user to examine the problem.
+
+
+ Incremental Garbage Collection
+
+ The new garbage collector is still a mark-and-sweep collector, but
+ now the mark phase no longer runs in one atomic action, it is
+ interleaved with program execution. The incremental garbage
+ collector needs an explicit mark stack to store the state of the
+ incremental traversal: the KKCC mark algorithm is a prerequisite and
+ is enabled by default when the new garbage collector is on.
+
+ Garbage collection is invoked as before: After `gc-cons-threshold'
+ bytes have been allocated since the last garbage collection (or
+ after `gc-cons-percentage' percentage of the total amount of memory
+ used for Lisp data has been allocated since the last garbage
+ collection) a collection starts. After some initialization, the
+ marking begins.
+
+ The variable `gc-incremental-traversal-threshold' contains how many
+ steps of incremental work have to be executed in one incremental
+ traversal cycle. After that many steps have been made, the mark
+ phase is interrupted and the client resumes. Now, the Lisp memory
+ is write-protected and the write barrier records modified objects.
+ Incremental traversal is resumed after
+ `gc-cons-incremental-threshold' bytes have been allocated since the
+ interruption of garbage collection. Then, the objects recorded by
+ the write-barrier have to be re-examined by the traversal, i.e. they
+ are re-pushed onto the mark stack and processed again. Once the
+ mark stack is empty, the traversal is done.
+
+ A full incremental collection is slightly slower than a full garbage
+ collection before: There is an overhead for storing pointers into
+ objects when the write barrier is running, and an overhead for
+ repeated traversal of modified objects. However, the new
+ incremental garbage collector reduces client pause times to
+ one-third, so even when a garbage collection is running, XEmacs
+ stays reactive.
+
+
+ Tricolor Marking: White, Black, and Grey Mark Bits
+
+ Garbage collection traverses the graph of reachable objects and
+ colors them. The objects subject to garbage collection are white at
+ the beginning. By the end of the collection, those that will be
+ retained are colored black. When there are no reachable objects left
+ to blacken, the traversal of live data structures is finished. In
+ traditional mark-and-sweep collectors, this black and white coloring
+ is sufficient.
+
+ In an incremental collector, the intermediate state of the traversal
+ is im- portant because of ongoing mutator activity: the mutator
+ cannot be allowed to change things in such way that the collector
+ will fail to find all reachable objects. To understand and prevent
+ such interactions between the mutator and the collector, it is
+ useful to introduce a third color, grey.
+
+ Grey objects have been reached by the traversal, but its descendants
+ may not have been. White objects are changed to grey when they are
+ reached by the traversal. Grey objects mark the current state of the
+ traversal: traversal pro- ceeds by processing the grey objects. The
+ KKCC mark stack holds all the currently grey-colored objects.
+ Processing a grey object means following its outgoing pointers, and
+ coloring it black afterwards.
+
+ Intuitively, the traversal proceeds in a wavefront of grey objects
+ that separates the unreached objects, which are colored white, from
+ the already processed black objects.
+
+ The allocator takes care of storing the mark bits: The mark bits are
+ kept in a tree like structure, for details see mc-alloc.c.
+
+
+ Internal States of the Incremental Garbage Collector
+
+ To keep track of its current state, the collector holds it's current
+ phase in the global `gc_state' variable. A collector phase is one
+ of the following:
+
+ NONE No incremental or full collection is currently running.
+
+ INIT_GC The collector prepares for a new collection, e.g. sets some
+ global variables.
+
+ PUSH_ROOT_SET The collector pushes the root set on the mark stack
+ to start the traversal of live objects.
+
+ MARK The traversal of live objects colors the reachable objects
+ white, grey, or black, according to their lifeness. The mark
+ phase can be interrupted by the incremental collection algorithm:
+ Before the client (i.e. the non collector part of XEmacs) resumes,
+ the write barrier has to be installed so that the collector knows
+ what objects get modified during the collector's pause.
+ Installing a write barrier means protecting pages that only
+ contain black objects and recording write access to these objects.
+ Pages with white or grey objects do not need to be protected,
+ since these pages are due to marking anyways when the collector
+ resumes. Once the collector resumes, it has to re-scan all
+ objects that have been modified during the collector pause and
+ have been caught by the write barrier. The mark phase is done when
+ there are no more grey objects on the heap, i.e. the KKCC mark stack
+ is empty.
+
+ REPUSH_ROOT_SET After the mark phase is done, the collector has to
+ traverse the root set pointers again, since modifications to the
+ objects in the root set can not all be covered by the write barrier
+ (e.g. root set objects that are on the call stack). Therefore, the
+ collector has to traverse the root set again without interruption.
+
+ FINISH_MARK After the mark phase is finished, some objects with
+ special liveness semantics have to be treated separately, e.g.
+ ephemerons and the various flavors of weak objects.
+
+ FINALIZE The collector registers all objects that have finalizers
+ for finalization. Finalizations happens asynchronously sometimes
+ after the collection has finished.
+
+ SWEEP The allocator scans the entire heap and frees all white marked
+ objects. The freed memory is recycled and can be re-used for future
+ allocations. The sweep phase is carried out atomically.
+
+ FINISH_GC The collector cleans up after the garbage collection by
+ resetting some global variables.
+
+
+ Lisp Interface
+
+ The new garbage collector can be accessed directly from Emacs Lisp.
+ Basically, two functions invoke the garbage collector:
+
+ (gc-full) starts a full garbage collection. If an incremental
+ garbage collection is already running, it is finished without
+ further interruption. This function guarantees that unused
+ objects have been freed when it returns.
+
+ (gc-incremental) starts an incremental garbage collection. If an
+ incremental garbage collection is already running, the next cycle
+ of incremental traversal is started. The garbage collection is
+ finished if the traversal completes. Note that this function does
+ not necessarily free any memory. It only guarantees that the
+ traversal of the heap makes progress.
+
+ The old garbage collector uses the function (garbage-collect) to
+ invoke a garbage collection. This function is still in use by some
+ applications that explicitly want to invoke a garbage collection.
+ Since these applications may expect that unused memory has really
+ been freed when (garbage-collect) returns, it maps to (gc-full).
+
+ The new garbage collector is highly customizable during runtime; it
+ can even be switched back to the traditional mark-and-sweep garbage
+ collector: The variable allow-incremental-gc controls whether
+ garbage collections may be interrupted or if they have to be carried
+ out in one atomic action. Setting allow-incremental-gc to nil
+ prevents incremental garbage collection, and the garbage collector
+ then only does full collects, even if (gc-incremental) is called.
+ Non-nil allows incremental garbage collection.
+
+ This way applications can freely decide what garbage collection
+ algorithm is best for the upcoming memory usage. How frequently a
+ garbage collection occurs and how much traversal work is done in one
+ incremental cycle can also be modified during runtime. See
+
+ M-x customize RET alloc RET
+
+ for an overview of all settings.
+
+
+ More Information
+
+ More details can be found in
+
http://crestani.de/xemacs/pdf/thesis-newgc.pdf .
+
+*/
#include <config.h>
#include "lisp.h"
@@ -50,8 +362,14 @@
#include "vdb.h"
+/* Number of bytes of consing since gc before a full gc should happen. */
#define GC_CONS_THRESHOLD 2000000
+
+/* Number of bytes of consing since gc before another cycle of the gc
+ should happen in incremental mode. */
#define GC_CONS_INCREMENTAL_THRESHOLD 200000
+
+/* Number of elements marked in one cycle of incremental GC. */
#define GC_INCREMENTAL_TRAVERSAL_THRESHOLD 100000
/* Number of bytes of consing done since the last GC. */
diff -r 6466bc9ebf15 -r 2cc24c69446c src/mc-alloc.c
--- a/src/mc-alloc.c Wed Jun 23 08:04:18 2010 -0400
+++ b/src/mc-alloc.c Mon Jul 05 18:17:39 2010 +0200
@@ -20,6 +20,227 @@
Boston, MA 02111-1307, USA. */
/* Synched up with: Not in FSF. */
+
+/*
+ The New Allocator
+
+ The ideas and algorithms are based on the allocator of the
+ Boehm-Demers-Weiser conservative garbage collector. See
+
http://www.hpl.hp.com/personal/Hans_ Boehm/gc/index.html.
+
+ The new allocator is enabled when the new garbage collector
+ is enabled (with `--with-newgc'). The implementation of
+ the new garbage collector is in gc.c.
+
+ The new allocator takes care of:
+ - allocating objects in a write-barrier-friendly way
+ - manage object's mark bits
+
+ Three-Level Allocation
+
+ The new allocator efficiently manages the allocation of Lisp
+ objects by minimizing the number of times malloc() and free() are
+ called. The allocation process has three layers of abstraction:
+
+ 1. It allocates memory in very large chunks called heap sections.
+
+ 2. The heap sections are subdivided into pages. The page size is
+ determined by the constant PAGE_SIZE. It holds the size of a page
+ in bytes.
+
+ 3. One page consists of one or more cells. Each cell represents
+ a memory location for an object. The cells on one page all have
+ the same size, thus every page only contains equal-sized
+ objects.
+
+ If an object is bigger than page size, it is allocated on a
+ multi-page. Then there is only one cell on a multi-page (the cell
+ covers the full multi-page). Is an object smaller than 1/2 PAGE_SIZE,
+ a page contains several objects and several cells. There
+ is only one cell on a page for object sizes from 1/2 PAGE_SIZE to
+ PAGE_SIZE (whereas multi-pages always contain 2 only one
+ cell). Only in layer one malloc() and free() are called.
+
+
+ Size Classes and Page Lists
+
+ Meta-information about every page and multi-page is kept in a page
+ header. The page header contains some bookkeeping information like
+ number of used and free cells, and pointers to other page
+ headers. The page headers are linked in a page list.
+
+ Every page list builds a size class. A size class contains all
+ pages (linked via page headers) for objects of the same size. The
+ new allocator does not group objects based on their type, it groups
+ objects based on their sizes.
+
+ Here is an example: A cons contains a lrecord_header, a car and cdr
+ field. Altogether it uses 12 bytes of memory (on 32 bits
+ machines). All conses are allocated on pages with a cell size of 12
+ bytes. All theses pages are kept together in a page list, which
+ represents the size class for 12 bytes objects. But this size class
+ is not exclusively for conses only. Other objects, which are also
+ 12 bytes big (e.g. weak-boxes), are allocated in the same size
+ class and on the same pages.
+
+ The number of size classes is customizable, so is the size step
+ between successive size classes.
+
+
+ Used and Unused Heap
+
+ The memory which is managed by the allocator can be divided in two
+ logical parts:
+
+ The used heap contains pages, on which objects are allocated. These
+ pages are com- pletely or partially occupied. In the used heap, it
+ is important to quickly find a free spot for a new
+ object. Therefore the size classes of the used heap are defined by
+ the size of the cells on the pages. The size classes should match
+ common object sizes, to avoid wasting memory.
+
+ The unused heap only contains completely empty pages. They have
+ never been used or have been freed completely again. In the unused
+ heap, the size of consecutive memory tips the scales. A page is the
+ smallest entity which is asked for. Therefore, the size classes of
+ the unused heap are defined by the number of consecutive pages.
+
+ The parameters for the different size classes can be adjusted
+ independently, see `configurable values' below.
+
+
+ The Allocator's Data Structures
+
+ The struct `mc_allocator_globals holds' all the data structures
+ that the new allocator uses (lists of used and unused pages, mark
+ bits, etc.).
+
+
+ Mapping of Heap Pointers to Page Headers
+
+ For caching benefits, the page headers and mark bits are stored
+ separately from their associated page. During garbage collection
+ (i.e. for marking and freeing objects) it is important to identify
+ the page header which is responsible for a given Lisp object.
+
+ To do this task quickly, I added a two level search tree: the upper
+ 10 bits of the heap pointer are the index of the first level. This
+ entry of the first level links to the second level, where the next
+ 10 bits of the heap pointer are used to identify the page
+ header. The remaining bits point to the object relative to the
+ page.
+
+ On architectures with more than 32 bits pointers, a hash value of
+ the upper bits is used to index into the first level.
+
+
+ Mark Bits
+
+ For caching purposes, the mark bits are no longer kept within the
+ objects, they are kept in a separate bit field.
+
+ Every page header has a field for the mark bits of the objects on
+ the page. If there are less cells on the page than there fit bits
+ in the integral data type EMACS_INT, the mark bits are stored
+ directly in this EMACS_INT.
+
+ Otherwise, the mark bits are written in a separate space, with the
+ page header pointing to this space. This happens to pages with
+ rather small objects: many cells fit on a page, thus many mark bits
+ are needed.
+
+
+ Allocate Memory
+
+ Use
+ void *mc_alloc (size_t size)
+ to request memory from the allocator. This returns a pointer to a
+ newly allocated block of memory of given size.
+
+ This is how the new allocator allocates memory:
+ 1. Determine the size class of the object.
+ 2. Is there already a page in this size class and is there a free
+ cell on this page?
+ * YES
+ 3. Unlink free cell from free list, return address of free cell.
+ DONE.
+ * NO
+ 3. Is there a page in the unused heap?
+ * YES
+ 4. Move unused page to used heap.
+ 5. Initialize page header, free list, and mark bits.
+ 6. Unlink first cell from free list, return address of cell.
+ DONE.
+ * NO
+ 4. Expand the heap, add new memory to unused heap
+ [go back to 3. and proceed with the YES case].
+
+ The allocator puts partially filled pages to the front of the page
+ list, completely filled ones to the end. That guarantees a fast
+ terminating search for free cells. Are there two successive full
+ pages at the front of the page list, the complete size class is
+ full, a new page has to be added.
+
+
+ Expand Heap
+
+ To expand the heap, a big chunk of contiguous memory is allocated
+ using malloc(). These pieces are called heap sections. How big a new
+ heap section is (and thus the growth of the heap) is adjustable: See
+ MIN_HEAP_INCREASE, MAX_HEAP_INCREASE, and HEAP_GROWTH_DIVISOR below.
+
+
+ Free Memory
+
+ One optimization in XEmacs is that locally used Lisp objects are
+ freed manually (the memory is not wasted till the next garbage
+ collection). Therefore the new allocator provides this function:
+ void mc_free (void *ptr)
+ That frees the object pointed to by ptr.
+
+ This function is also used internally during sweep phase of the
+ garbage collection. This is how it works in detail:
+
+ 1. Use pointer to identify page header
+ (use lookup mechanism described above).
+ 2. Mark cell as free and hook it into free list.
+ 3. Is the page completely empty?
+ * YES
+ 4. Unlink page from page list.
+ 5. Remove page header, free list, and mark bits.
+ 6. Move page to unused heap.
+ * NO
+ 4. Move page to front of size class (to speed up allocation
+ of objects).
+
+ If the last object of a page is freed, the empty page is returned to
+ the unused heap. The allocator tries to coalesce adjacent pages, to
+ gain a big piece of contiguous memory. The resulting chunk is hooked
+ into the according size class of the unused heap. If this created a
+ complete heap section, the heap section is returned to the operating
+ system by using free().
+
+
+ Allocator and Garbage Collector
+
+ The new allocator simplifies the interface to the Garbage Collector:
+ * mark live objects: MARK_[WHITE|GREY|BLACK] (ptr)
+ * sweep heap: EMACS_INT mc_sweep (void)
+ * run finalizers: EMACS_INT mc_finalize (void)
+
+
+ Allocator and Dumper
+
+ The new allocator provides special finalization for the portable
+ dumper (to save disk space): EMACS_INT mc_finalize_for_disksave (void)
+
+
+ More Information
+
+ More details can be found in
+
http://crestani.de/xemacs/pdf/mc-alloc.pdf .
+
+*/
#include <config.h>
_______________________________________________
XEmacs-Patches mailing list
XEmacs-Patches(a)xemacs.org
http://lists.xemacs.org/mailman/listinfo/xemacs-patches