This file contains a RealAudio attachment. If you can't understand it, try
http://www.666.com/audio/mail/19980706-1-gsm-8k.wav
see also
http://cvs.xemacs.org/cgi-bin/xecvswebe/xemacs-20/src/alloc.c?rev=1.42
/**********************************************************************/
/* Fixed-size type macros */
/**********************************************************************/
/* For fixed-size types that are commonly used, we malloc() large blocks
of memory at a time and subdivide them into chunks of the correct
size for an object of that type. This is more efficient than
malloc()ing each object separately because we save on malloc() time
and overhead due to the fewer number of malloc()ed blocks, and
also because we don't need any extra pointers within each object
to keep them threaded together for GC purposes. For less common
(and frequently large-size) types, we use lcrecords, which are
malloc()ed individually and chained together through a pointer
in the lcrecord header. lcrecords do not need to be fixed-size
(i.e. two objects of the same type need not have the same size;
however, the size of a particular object cannot vary dynamically).
It is also much easier to create a new lcrecord type because no
additional code needs to be added to alloc.c. Finally, lcrecords
may be more efficient when there are only a small number of them.
The types that are stored in these large blocks (or "frob blocks")
are cons, float, compiled-function, symbol, marker, extent, event,
and string.
Note that strings are special in that they are actually stored in
two parts: a structure containing information about the string, and
the actual data associated with the string. The former structure
(a struct Lisp_String) is a fixed-size structure and is managed the
same way as all the other such types. This structure contains a
pointer to the actual string data, which is stored in structures of
type struct string_chars_block. Each string_chars_block consists
of a pointer to a struct Lisp_String, followed by the data for that
string, followed by another pointer to a struct Lisp_String,
followed by the data for that string, etc. At GC time, the data in
these blocks is compacted by searching sequentially through all the
blocks and compressing out any holes created by unmarked strings.
Strings that are more than a certain size (bigger than the size of
a string_chars_block, although something like half as big might
make more sense) are malloc()ed separately and not stored in
string_chars_blocks. Furthermore, no one string stretches across
two string_chars_blocks.
Vectors are each malloc()ed separately, similar to lcrecords.
In the following discussion, we use conses, but it applies equally
well to the other fixed-size types.
We store cons cells inside of cons_blocks, allocating a new
cons_block with malloc() whenever necessary. Cons cells reclaimed
by GC are put on a free list to be reallocated before allocating
any new cons cells from the latest cons_block. Each cons_block is
just under 2^n - MALLOC_OVERHEAD bytes long, since malloc (at least
the versions in malloc.c and gmalloc.c) really allocates in units
of powers of two and uses 4 bytes for its own overhead.
What GC actually does is to search through all the cons_blocks,
from the most recently allocated to the oldest, and put all
cons cells that are not marked (whether or not they're already
free) on a cons_free_list. The cons_free_list is a stack, and
so the cons cells in the oldest-allocated cons_block end up
at the head of the stack and are the first to be reallocated.
If any cons_block is entirely free, it is freed with free()
and its cons cells removed from the cons_free_list. Because
the cons_free_list ends up basically in memory order, we have
a high locality of reference (assuming a reasonable turnover
of allocating and freeing) and have a reasonable probability
of entirely freeing up cons_blocks that have been more recently
allocated. This stage is called the "sweep stage" of GC, and
is executed after the "mark stage", which involves starting
from all places that are known to point to in-use Lisp objects
(e.g. the obarray, where are all symbols are stored; the
current catches and condition-cases; the backtrace list of
currently executing functions; the gcpro list; etc.) and
recursively marking all objects that are accessible.
At the beginning of the sweep stage, the conses in the cons
blocks are in one of three states: in use and marked, in use
but not marked, and not in use (already freed). Any conses
that are marked have been marked in the mark stage just
executed, because as part of the sweep stage we unmark any
marked objects. The way we tell whether or not a cons cell
is in use is through the FREE_STRUCT_P macro. This basically
looks at the first 4 bytes (or however many bytes a pointer
fits in) to see if all the bits in those bytes are 1. The
resulting value (0xFFFFFFFF) is not a valid pointer and is
not a valid Lisp_Object. All current fixed-size types have
a pointer or Lisp_Object as their first element with the
exception of strings; they have a size value, which can
never be less than zero, and so 0xFFFFFFFF is invalid for
strings as well. Now assuming that a cons cell is in use,
the way we tell whether or not it is marked is to look at
the mark bit of its car (each Lisp_Object has one bit
reserved as a mark bit, in case it's needed). Note that
different types of objects use different fields to indicate
whether the object is marked, but the principle is the same.
Conses on the free_cons_list are threaded through a pointer
stored in the bytes directly after the bytes that are set
to 0xFFFFFFFF (we cannot overwrite these because the cons
is still in a cons_block and needs to remain marked as
not in use for the next time that GC happens). This
implies that all fixed-size types must be at least big
enough to store two pointers, which is indeed the case
for all current fixed-size types.
Some types of objects need additional "finalization" done
when an object is converted from in use to not in use;
this is the purpose of the ADDITIONAL_FREE_type macro.
For example, markers need to be removed from the chain
of markers that is kept in each buffer. This is because
markers in a buffer automatically disappear if the marker
is no longer referenced anywhere (the same does not
apply to extents, however).
WARNING: Things are in an extremely bizarre state when
the ADDITIONAL_FREE_type macros are called, so beware!
When ERROR_CHECK_GC is defined, we do things differently
so as to maximize our chances of catching places where
there is insufficient GCPROing. The thing we want to
avoid is having an object that we're using but didn't
GCPRO get freed by GC and then reallocated while we're
in the process of using it -- this will result in something
seemingly unrelated getting trashed, and is extremely
difficult to track down. If the object gets freed but
not reallocated, we can usually catch this because we
set all bytes of a freed object to 0xDEADBEEF. (The
first four bytes, however, are 0xFFFFFFFF, and the next
four are a pointer used to chain freed objects together;
we play some tricks with this pointer to make it more
bogus, so crashes are more likely to occur right away.)
We want freed objects to stay free as long as possible,
so instead of doing what we do above, we maintain the
free objects in a first-in first-out queue. We also
don't recompute the free list each GC, unlike above;
this ensures that the queue ordering is preserved.
[This means that we are likely to have worse locality
of reference, and that we can never free a frob block
once it's allocated. (Even if we know that all cells
in it are free, there's no easy way to remove all those
cells from the free list because the objects on the
free list are unlikely to be in memory order.)]
Furthermore, we never take objects off the free list
unless there's a large number (usually 1000, but
varies depending on type) of them already on the list.
This way, we ensure that an object that gets freed will
remain free for the next 1000 (or whatever) times that
an object of that type is allocated.
*/
#define SWEEP_FIXED_TYPE_BLOCK(typename, obj_type) \
do { \
struct typename##_block *_frob_current; \
struct typename##_block **_frob_prev; \
int _frob_limit; \
int num_free =3D 0, num_used =3D 0; \
\
typename##_free_list =3D 0; \
\
for (_frob_prev =3D ¤t_##typename##_block, \
_frob_current =3D current_##typename##_block, \
_frob_limit =3D current_##typename##_block_index; \
_frob_current; \
) \
{ \
int _frob_iii; \
int _frob_empty =3D 1; \
obj_type *_frob_old_free_list =3D typename##_free_list; \
\
for (_frob_iii =3D 0; _frob_iii < _frob_limit; _frob_iii++)
=
\
{ \
obj_type *_frob_victim =3D &(_frob_current->block[_frob_iii]);
\=
\
if (FREE_STRUCT_P (_frob_victim)) \
{ \
num_free++; \
PUT_FIXED_TYPE_ON_FREE_LIST (typename, obj_type, _frob_victim); \
} \
else if (!MARKED_##typename##_P (_frob_victim)) \
{ \
num_free++; \
FREE_FIXED_TYPE (typename, obj_type, _frob_victim); \
} \
else \
{ \
_frob_empty =3D 0; \
num_used++; \
UNMARK_##typename (_frob_victim); \
} \
} \
if (!_frob_empty) \
{ \
_frob_prev =3D &(_frob_current->prev);
\
_frob_current =3D _frob_current->prev;
\
} \
else if (_frob_current =3D=3D current_##typename##_block \
&& !_frob_current->prev)
\
{ \
/* No real point in freeing sole allocation block */ \
break; \
} \
else \
{ \
struct typename##_block *_frob_victim_block =3D _frob_current; \=
if (_frob_victim_block =3D=3D current_##typename##_block) \
current_##typename##_block_index \
=3D countof (current_##typename##_block->block); \
_frob_current =3D _frob_current->prev;
\
{ \
*_frob_prev =3D _frob_current; \
xfree (_frob_victim_block); \
/* Restore free list to what it was before victim was swept */ \
typename##_free_list =3D _frob_old_free_list; \
num_free -=3D _frob_limit; \
} \
} \
_frob_limit =3D countof (current_##typename##_block->block); =
\
} \
\
gc_count_num_##typename##_in_use =3D num_used; \
gc_count_num_##typename##_freelist =3D num_free; \
} while (0)
Kenichi Okuyama wrote:
>>>>> "HN" == Hrvoje Niksic
<hniksic(a)srce.hr> writes:
>> I'm intrested in xemacs newest version, especially on part of
>> Garbage Collection. And I would like to know if there is any
>> changes made ever since 19.28. Is it re-written ( with basically
>> same algorithm? ), or is it totally new?
HN> There has never been any such thing as XEmacs 19.28.
Yes, this I know. I mean FSF-emacs 19.28 ( or FSFmacs 19.28 ? is
this how you call it? ).
>>
http://nagahashi-www.isl.titech.ac.jp/~okuyama/html/program/Mule-2.3-allo...
HN> I was unable to access this
I was making mistake.
http://nagahashi-www.isl.titech.ac.jp/~okuyama/html/programs/Mule-2.3-all...
^
I'm sorry for this.
HN> You will have to do your homework and actually look at
HN> XEmacs src/alloc.c.
HN> The latest released version: 20.4
HN> The latest pre-21.0 beta: 21.0-b45/pre2 "Thuringian"
Does these two version's alloc.c differs?
If not, I'll try on 20.4 (because it must be easier to get).
----
Kenichi Okuyama @ Tokyo Research Lab. IBM. Co. Japan.
--
Ben Wing
- If this message is long and typed, someone else typed it for me
- If it has a .rm (RealAudio) attachment, see
http://www.real.com/products/player/downloadrealplayer.html?wp=dl0498
- If this doesn't work and there's a GSM WAV attachment, see
http://xanim.va.pubnix.com/home.html