"J. Kean Johnston" <jkj(a)sco.com> writes:
> I would like to enlist the help of an XEmacs internals guru to get
> me over these hurdles, if you think this would be a worthwhile
> exercise. I already have things set up such that compiling modules
> is trivial (and will work the same way on all platforms that support
> dlopen()).
Note that you can achieve greater portability by using Bill's stuff in
sysdll.c.
I'll try to answer some of your questions.
> a) Getting DOC strings for variables defined correctly. When the .so
> is loaded in, a special function is called which does the moral
> equivalent of DEFSUBR() or DEFVAR_LISP() et al. I can see the
> variables defined but I don't know how to go about making the doc
> string visible.
The doc string is stored as the symbol property
`variable-documentation'. Its value can be either a string or a
character position in the DOC file. XEmacs itself generates the doc
strings before dumping; look for `make-docfile' and friends.
> b) When I DEFSUBR() a function that was declared in the .so, and try
> to execute it, I get a SIGSEGV in Fcommandp when it calls
> SUBRP(fun).
I'm not sure why that happens. Can you trace exactly where the segv
happens?
> There is deep dark stuff I don't understand too well and would like
> to. Such as:
> 1) What is staticpro?
> 2) When do I need to GCPRO and when not? What's the difference between
> GCPRO and NGCPRO?
To understand GCPRO and staticpro, you need to understand how the
garbage collector works. When you type:
(setq a (cons 1 2)),
you have assigned a fresh cons as the value of symbol `a'. This cons
will remain in existence as long as there is any way to access it from
Lisp. If you type:
(makunbound 'a)
...the cons will silently disappear at the next GC, because there is
no possible way for you to use it any more. However, if you had typed
instead:
(setq b a)
(makunbound 'a)
...the cons would not be freed, because you can still access it
through `b'.
How does this work? The GC uses the "mark-and-sweep" method, which
consists of two basic steps:
1) (mark) Beginning with the roots of accessibility, mark all objects
that are in use;
2) (sweep) Free all the objects *not* marked by step #1.
For #1 to work, every GC-able object has a special "mark" bit used
during gc.
In the above example, the symbols `a' and `b' are parts of the global
`obarray', which is one of the roots of accessibility. So, the
collector maps the obarray, recursively marking each object within it
(except for the objects that are already marked, so there's no
inflooping.)
In the first scenario, the marking phase will notice that the "value"
slot of `a' is a Lisp object, and will mark it. This will prevent
(1 . 2) from being garbage-collected during the sweep phase.
When you makunbound `a', its value slot will become empty, nothing
will reference our cons, and it will get swept during gc. Likewise,
when `b' references it, it will get marked, and thus not swept.
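To make the two steps concrete, here is a toy mark-and-sweep collector
in plain C. It is only a sketch of the idea, not XEmacs code: every
name in it (toy_cons, toy_roots, toy_gc and so on) is made up for
illustration, and the two-element root array merely stands in for the
value slots of `a' and `b'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in XEmacs. */
typedef struct toy_cons {
    struct toy_cons *car, *cdr;   /* child objects, or NULL */
    int marked;                   /* the "mark" bit used during gc */
    struct toy_cons *next_alloc;  /* chain of every allocated object */
} toy_cons;

#define TOY_NROOTS 2
static toy_cons *toy_roots[TOY_NROOTS]; /* stand-ins for the value slots of a and b */
static toy_cons *toy_all;               /* every object allocated so far */

toy_cons *toy_alloc(toy_cons *car, toy_cons *cdr)
{
    toy_cons *c = malloc(sizeof *c);
    assert(c != NULL);
    c->car = car;
    c->cdr = cdr;
    c->marked = 0;
    c->next_alloc = toy_all;
    toy_all = c;
    return c;
}

/* Step 1 (mark): mark everything reachable from C.  Objects that are
   already marked are skipped, so cyclic structures terminate. */
void toy_mark(toy_cons *c)
{
    if (c == NULL || c->marked)
        return;
    c->marked = 1;
    toy_mark(c->car);
    toy_mark(c->cdr);
}

/* Step 2 (sweep): free every unmarked object and clear the mark bit
   on the survivors.  Returns the number of objects freed. */
int toy_gc(void)
{
    int freed = 0;
    for (int i = 0; i < TOY_NROOTS; i++)
        toy_mark(toy_roots[i]);

    toy_cons **link = &toy_all;
    while (*link != NULL) {
        toy_cons *c = *link;
        if (c->marked) {
            c->marked = 0;
            link = &c->next_alloc;
        } else {
            *link = c->next_alloc;
            free(c);
            freed++;
        }
    }
    return freed;
}

/* Mirrors the Lisp session above.  Returns 10 * (objects freed by the
   first gc) + (objects freed by the second gc). */
int toy_demo_shared(void)
{
    toy_roots[0] = toy_alloc(NULL, NULL); /* (setq a (cons ...)) */
    toy_roots[1] = toy_roots[0];          /* (setq b a)          */
    toy_roots[0] = NULL;                  /* (makunbound 'a)     */
    int first = toy_gc();                 /* still reachable through b */
    toy_roots[1] = NULL;                  /* (makunbound 'b)     */
    int second = toy_gc();                /* nothing references it now */
    return first * 10 + second;
}
```

Calling toy_demo_shared() runs the collector twice: the first pass
frees nothing because the cons is still reachable through the second
root, and the second pass frees it once both roots are empty.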
Back to GCPRO and staticpro.
staticpro is used by the C code to declare an object as a "root of
accessibility." This is done for all the internal objects that are
invisible from Lisp, and contain Lisp data that must not be gc'ed.
Vobarray is an example of a staticpro'ed object, and so is
Vbuffer_alist in buffer.c.
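The idea of registering a global as a root can be sketched like this.
This is a toy model under my own assumptions, not the real staticpro:
toy_staticpro, toy_Vhidden_table and the other names are hypothetical.
The point it illustrates is that the *address* of the variable is
registered, so whatever the variable holds at gc time gets marked.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in XEmacs. */
typedef struct toy_obj { int marked; } toy_obj;

#define TOY_MAX_ROOTS 16
static toy_obj **toy_static_roots[TOY_MAX_ROOTS]; /* addresses of root variables */
static int toy_nroots;

/* Declare the variable at VAR to be a root of accessibility. */
void toy_staticpro(toy_obj **var)
{
    assert(toy_nroots < TOY_MAX_ROOTS);
    toy_static_roots[toy_nroots++] = var;
}

/* Fragment of a marking phase: mark whatever each registered variable
   currently points to.  Returns the number of objects marked. */
int toy_mark_static_roots(void)
{
    int n = 0;
    for (int i = 0; i < toy_nroots; i++) {
        toy_obj *o = *toy_static_roots[i];
        if (o != NULL && !o->marked) {
            o->marked = 1;
            n++;
        }
    }
    return n;
}

static toy_obj *toy_Vhidden_table; /* internal global, registered below */
static toy_obj *toy_Vforgotten;    /* internal global, never registered */

/* Returns 10 * (mark bit of the registered object)
   + (mark bit of the unregistered one). */
int toy_staticpro_demo(void)
{
    static toy_obj a, b;
    a.marked = b.marked = 0;

    toy_staticpro(&toy_Vhidden_table); /* register first ...             */
    toy_Vhidden_table = &a;            /* ... assign later: still covered */
    toy_Vforgotten = &b;               /* invisible to the collector      */

    toy_mark_static_roots();
    return a.marked * 10 + b.marked;
}
```

In this model the object held by the unregistered variable is never
marked, so a real sweep phase would free it even though the C code
still holds a pointer to it.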
GCPRO is in a way similar, but for stack-allocated data, as opposed to
global data. The Lisp allocation mechanism is so convenient that when
writing C code, you are often tempted to use it internally. For
instance, it is trivial to use Fcons to create a list of Lisp objects.
Another example is when you create temporary Lisp data to pass to a
function that works with Lisp objects.
In that case, your fresh object will not be visible from Lisp, and you
need to protect it from getting collected if gc happens to occur.
This is done using GCPRO. Example:
  void
  myfunction (void)
  {
    Lisp_Object myobject = Fcons (Qnil, Qnil); /* a nice new object */

    /* Store two values in MYOBJECT. */
    munge_munge (myobject);
    /* Now do something with those values, accessing them using XCAR
       (myobject) and XCDR (myobject). */
    ... code ...
  }
The problem with this code is that it will fail in unexpected ways if
GC happens within munge_munge(), because in that case the cons would
get freed and overwritten by something else at a later time. This
kind of error is very hard to track because you often get crashes in
totally unrelated places.
The correct way to write this function is:
  void
  myfunction (void)
  {
    Lisp_Object myobject = Fcons (Qnil, Qnil); /* a nice new object */
    struct gcpro gcpro1;

    GCPRO1 (myobject);
    /* Store two values in MYOBJECT. */
    munge_munge (myobject);
    /* Now do something with those values, accessing them using XCAR
       (myobject) and XCDR (myobject). */
    ... code ...
    UNGCPRO;
  }
Note that you *must* UNGCPRO for every GCPRO before you return from
the function, or dire things will happen. The only case when you
don't have to UNGCPRO is when an error occurs that throws out of the
function. The error-handling mechanism has provisions for restoring
the state of the GCPROs.
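If it helps to picture what goes on, stack protection of this kind can
be modelled as a linked list of small records that live on the C stack
and point at the protected locals. The following is a simplified toy
under that assumption, not the actual XEmacs macros: toy_gcpro,
TOY_GCPRO1, TOY_UNGCPRO and toy_may_gc are all invented names.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in XEmacs. */
typedef struct toy_obj { int marked; } toy_obj;

/* One record per protected local; the record itself lives on the stack. */
struct toy_gcpro {
    struct toy_gcpro *next; /* record that was on top before this one */
    toy_obj **var;          /* address of the protected local variable */
};

static struct toy_gcpro *toy_gcprolist; /* top of the chain */

/* Like the real macros, these expect a local named toy_gcpro1. */
#define TOY_GCPRO1(v)                      \
    do {                                   \
        toy_gcpro1.next = toy_gcprolist;   \
        toy_gcpro1.var = &(v);             \
        toy_gcprolist = &toy_gcpro1;       \
    } while (0)
#define TOY_UNGCPRO (toy_gcprolist = toy_gcpro1.next)

/* Fragment of a marking phase: walk the chain and mark each protected
   local.  Returns the number of objects visited. */
int toy_mark_stack_roots(void)
{
    int n = 0;
    for (struct toy_gcpro *p = toy_gcprolist; p != NULL; p = p->next)
        if (*p->var != NULL) {
            (*p->var)->marked = 1;
            n++;
        }
    return n;
}

/* Stands in for munge_munge(): any call during which gc may happen. */
void toy_may_gc(void) { toy_mark_stack_roots(); }

int toy_chain_is_empty(void) { return toy_gcprolist == NULL; }

/* Returns 1 if the object was seen (marked) by the collector. */
int toy_protected_call(void)
{
    toy_obj storage = { 0 };
    toy_obj *myobject = &storage;
    struct toy_gcpro toy_gcpro1;

    TOY_GCPRO1(myobject);
    toy_may_gc();
    TOY_UNGCPRO;            /* pop our record before returning */
    return storage.marked;
}

/* Same function without protection: the collector never sees it. */
int toy_unprotected_call(void)
{
    toy_obj storage = { 0 };
    toy_obj *myobject = &storage;

    toy_may_gc();
    (void) myobject;
    return storage.marked;
}
```

The model also shows why a missing UNGCPRO is so dangerous: the chain
would keep pointing at a stack frame that no longer exists after the
function returns.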
> 3) Since this is all dynamic, after Emacs has dumped, how do I avoid
> purespace? Do I need to?
I don't understand this. After dumping, you don't use purespace. You
allocate normal memory.
> Can I use xmalloc() for internal data?
Yes, but it's in most cases a bad idea (unless you know what you're
doing), for these reasons:
1) malloc() tends to fragment memory a lot. Use alloca() for
temporary memory wherever you can.
2) Each malloc() needs to be accompanied by a free() lest you create
leaks. This can be tough, because many places in Emacs can throw
(longjmp) under your feet at any time. This means that for almost
every malloc() you need to set up an unwind-protect to free the
memory. Use alloca() for temporary memory whenever you can.
3) Did I mention that alloca() is more favorable than malloc()? :-)
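To illustrate point 2, here is a self-contained sketch of why a
non-local exit makes malloc() awkward, and what an unwind handler buys
you. It uses plain setjmp/longjmp and a hand-rolled cleanup list; the
names (toy_record_free, toy_signal_error and so on) are hypothetical,
and real code would use XEmacs's own unwind-protect machinery instead.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in XEmacs. */
static jmp_buf toy_catch;       /* stand-in for an enclosing error handler */
static void *toy_pending[8];    /* buffers to free if we unwind */
static int toy_npending;
static int toy_freed_on_unwind; /* how many buffers the unwinder freed */

/* Register P to be freed if an error throws past us. */
void toy_record_free(void *p)
{
    assert(toy_npending < 8);
    toy_pending[toy_npending++] = p;
}

/* Run all pending cleanups, most recent first. */
void toy_run_unwinds(void)
{
    while (toy_npending > 0) {
        free(toy_pending[--toy_npending]);
        toy_freed_on_unwind++;
    }
}

/* Signal an error: clean up, then jump out.  Never returns. */
void toy_signal_error(void)
{
    toy_run_unwinds();
    longjmp(toy_catch, 1);
}

void toy_worker(int fail)
{
    char *buf = malloc(64);
    assert(buf != NULL);
    toy_record_free(buf);   /* without this, the throw below would leak buf */

    if (fail)
        toy_signal_error(); /* jumps straight out of this function */

    toy_npending--;         /* normal path: unregister and free by hand */
    free(buf);
}

/* Returns the number of buffers freed by the unwinder. */
int toy_try(int fail)
{
    toy_freed_on_unwind = 0;
    if (setjmp(toy_catch) == 0)
        toy_worker(fail);
    return toy_freed_on_unwind;
}
```

With toy_try(1) the worker never reaches its own free(), and the
buffer is released only because it was registered beforehand. Memory
from alloca() needs none of this, since it disappears with the stack
frame.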
> 4) If there is an existing Lisp or even intern function, can I
> replace its definition with a new one? How? Same thing for
> variables.
Each Lisp symbol has four Lisp-visible slots: name slot, value slot,
function slot, and plist slot. What you call a "Lisp function" is
simply a Lisp_Object stored in the symbol's function slot. Likewise,
what you call a "Lisp variable" is simply a Lisp_Object stored in the
value slot.
So, the answer is: yes, you can replace existing definitions by
overwriting the appropriate symbol slot. In fact, this is exactly
what `defun' and `setq' do in Lisp. The old contents will be gc'ed,
provided nothing else points to them.
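As a rough picture of what "overwriting the function slot" means, here
is a toy symbol with the four slots described above. It is a sketch
only: the struct layout, toy_fset and toy_funcall are invented for
illustration, and the slots hold plain C values rather than
Lisp_Objects.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in XEmacs. */
typedef int (*toy_function)(int);

struct toy_symbol {
    const char *name;      /* name slot */
    int value;             /* value slot: the "variable" */
    toy_function function; /* function slot: the "function" */
    void *plist;           /* plist slot (unused here) */
};

int toy_double(int x) { return 2 * x; }
int toy_square(int x) { return x * x; }

/* Replace the definition by overwriting the function slot.
   Returns the previous contents of the slot. */
toy_function toy_fset(struct toy_symbol *sym, toy_function f)
{
    toy_function old = sym->function;
    sym->function = f;
    return old;
}

/* Calling through the symbol always uses the current slot contents. */
int toy_funcall(const struct toy_symbol *sym, int arg)
{
    assert(sym->function != NULL);
    return sym->function(arg);
}

/* Returns 100 * (result before redefinition) + (result after). */
int toy_redefine_demo(void)
{
    struct toy_symbol sym = { "my-func", 0, toy_double, NULL };

    int before = toy_funcall(&sym, 3); /* uses toy_double */
    toy_fset(&sym, toy_square);        /* the redefinition */
    int after = toy_funcall(&sym, 3);  /* uses toy_square */
    return before * 100 + after;
}
```

Every caller that goes through the symbol picks up the new definition
immediately, because nothing but the slot was changed.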
> Any help at all would be GREATLY appreciated.
I hope this helped.
--
Hrvoje Niksic <hniksic(a)srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
We are all just prisoners here of our own MAKEDEV.