Updated chapter in internals manual

Saturday, 26 February 2000

Techniques for XEmacs Developers
================================

   To make a purified XEmacs, do: `make puremacs'.  To make a
quantified XEmacs, do: `make quantmacs'.

   You simply can't dump Quantified and Purified images (unless using
the portable dumper).  Purify gets confused when XEmacs frees memory in
one process that was allocated in a _different_ process on a different
machine!  Instead, run the undumped `temacs' directly, like so:
     temacs -batch -l loadup.el run-temacs XEMACS-ARGS...

   Before you go through the trouble, are you compiling with all
debugging and error-checking off?  If not, try that first.  Be warned
that while Quantify is directly responsible for quite a few
optimizations which have been made to XEmacs, doing a run which
generates results which can be acted upon is not necessarily a trivial
task.

   Also, if you're still willing to do some runs, make sure you configure
with the `--quantify' flag.  That will keep Quantify from starting to
record data until after the loadup is completed and will shut off
recording right before it shuts down (which generates enough bogus data
to throw most results off).  It also enables three additional elisp
commands: `quantify-start-recording-data',
`quantify-stop-recording-data' and `quantify-clear-data'.

   If you want to make XEmacs faster, target your favorite slow
benchmark, run a profiler like Quantify, `gprof', or `tcov', and figure
out where the cycles are going.  Specific projects:

   * Make the garbage collector faster.  Figure out how to write an
     incremental garbage collector.

   * Write a compiler that takes bytecode and spits out C code.
     Unfortunately, you will then need a C compiler and a more fully
     developed module system.

   * Speed up redisplay.

   * Speed up syntax highlighting.  Maybe moving some of the syntax
     highlighting capabilities into C would make a difference.

   * Implement tail recursion in Emacs Lisp (hard!).

   Unfortunately, Emacs Lisp is slow, and is going to stay slow.
Function calls in elisp are especially expensive.  Iterating over a
long list is going to be 30 times faster when implemented in C than in
elisp.

   Heavily used small code fragments need to be fast.  The traditional
way to implement such code fragments in C is with macros.  But macros
in C are known to be broken.

   Macro arguments that are repeatedly evaluated may suffer from
repeated side effects or suboptimal performance.

   Variable names used in macros may collide with caller's variables,
causing (at least) unwanted compiler warnings.

   In order to solve these problems, and maintain statement semantics,
one should use the `do { ... } while (0)' trick while trying to
reference macro arguments exactly once using local variables.

   Let's take a look at this poor macro definition:

     #define MARK_OBJECT(obj) \
       if (!marked_p (obj)) mark_object (obj), did_mark = 1

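   This macro evaluates its argument twice, and also misbehaves if used
like this, because the `else' silently pairs with the `if' hidden inside
the macro rather than with `if (flag)':

     if (flag) MARK_OBJECT (obj); else do_something();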

   A much better definition is

     #define MARK_OBJECT(obj) do { \
       Lisp_Object mo_obj = (obj); \
       if (!marked_p (mo_obj))     \
         {                         \
           mark_object (mo_obj);   \
           did_mark = 1;           \
         }                         \
     } while (0)

   Notice the elimination of double evaluation by using the local
variable with the obscure name.  Writing safe and efficient macros
requires great care.  One problem with macros cannot be portably worked
around: since a C block has no value, a macro that is used as an
expression rather than as a statement cannot use the techniques just
described to avoid multiple evaluation.
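
   To see both properties of the improved macro at work, here is a
minimal, self-contained sketch.  The `Lisp_Object' typedef, `marked_p',
`mark_object', `next_object' and `mark_if' below are hypothetical
stand-ins written only for this illustration; they are not the real
XEmacs definitions.  Only the macro body is taken from the text above.

```c
/* Stand-ins for the real XEmacs definitions, for illustration only. */
typedef int Lisp_Object;

static int did_mark = 0;
static int marked[8];               /* marked[i] is nonzero once object i is marked */
static int argument_evaluations = 0;

static int marked_p (Lisp_Object obj) { return marked[obj]; }
static void mark_object (Lisp_Object obj) { marked[obj] = 1; }

/* The improved definition from the text. */
#define MARK_OBJECT(obj) do { \
  Lisp_Object mo_obj = (obj); \
  if (!marked_p (mo_obj))     \
    {                         \
      mark_object (mo_obj);   \
      did_mark = 1;           \
    }                         \
} while (0)

/* An argument expression with a visible side effect (hypothetical). */
static Lisp_Object
next_object (void)
{
  argument_evaluations++;
  return 3;
}

/* Returns 1 if the `else' branch ran, 0 if the macro ran. */
static int
mark_if (int flag)
{
  if (flag)
    MARK_OBJECT (next_object ());
  else
    return 1;
  return 0;
}
```

   Calling `mark_if (1)' runs `next_object' exactly once, and the
`else' pairs with `if (flag)' as the caller intended.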

   In most cases where a macro has function semantics, an inline
function is a better implementation technique.  Modern compiler
optimizers tend to inline functions even if they have no `inline'
keyword, and configure magic ensures that the `inline' keyword can be
safely used as an additional compiler hint.  Inline functions used in a
single .c file are easy.  The function must already be defined to be
`static'.  Just add another `inline' keyword to the definition.

     inline static int
     heavily_used_small_function (int arg)
     {
       ...
     }
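
   As a sketch of why an inline function is the safer choice for code
that is used as an expression, compare a macro and a function that
compute the same value.  All names here (`ABS_MACRO', `abs_inline',
`read_value' and the two counting helpers) are hypothetical and exist
only for this illustration.

```c
/* Hypothetical expression macro: its argument is evaluated twice. */
#define ABS_MACRO(x) ((x) < 0 ? -(x) : (x))

/* The same operation as a static inline function: the argument is
   evaluated once, at the call site, before the body runs. */
static inline int
abs_inline (int x)
{
  return x < 0 ? -x : x;
}

static int reads = 0;

/* An argument expression with a visible side effect. */
static int
read_value (void)
{
  reads++;
  return -5;
}

/* Returns how many times the macro evaluated its argument. */
static int
reads_with_macro (void)
{
  reads = 0;
  (void) ABS_MACRO (read_value ());
  return reads;
}

/* Returns how many times the inline function evaluated its argument. */
static int
reads_with_inline (void)
{
  reads = 0;
  (void) abs_inline (read_value ());
  return reads;
}
```

   The macro version calls `read_value' twice (once for the test and
once for the selected branch), while the inline function calls it once.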

   Inline functions in header files are trickier, because we would like
to make the following optimization if the function is _not_ inlined
(for example, because we're compiling for debugging).  We would like the
function to be defined externally exactly once, and each calling
translation unit would create an external reference to the function,
instead of including a definition of the inline function in the object
code of every translation unit that uses it.  This optimization is
currently only available for gcc.  But you don't have to worry about the
trickiness; just define your inline functions in header files using this
pattern:

     INLINE_HEADER int
     i_used_to_be_a_crufty_macro_but_look_at_me_now (int arg);
     INLINE_HEADER int
     i_used_to_be_a_crufty_macro_but_look_at_me_now (int arg)
     {
       ...
     }

   The declaration right before the definition is to prevent warnings
when compiling with `gcc -Wmissing-declarations'.  I consider issuing
this warning for inline functions a gcc bug, but the gcc maintainers
disagree.

   Every header which contains inline functions, either directly by
using `INLINE_HEADER' or indirectly by using `DECLARE_LRECORD', must be
added to the includes of `inline.c' to make the optimization described
above work.  (Optimization note: if all INLINE_HEADER functions are in
fact inlined in all translation units, then the linker can just discard
`inline.o', since it contains only unreferenced code.)
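
   The goal described above (one external definition, with every other
translation unit merely referencing it) can also be illustrated with
the `inline' rules of standard C99.  The sketch below is _not_ how
`INLINE_HEADER' or `inline.c' are implemented; it only shows the same
idea under the assumption of a C99 compiler.  The names `twice' and
`call_twice' are hypothetical, and what would normally be three
separate files is collapsed into one listing.

```c
/* --- would live in a header file --- */
/* In C99, a plain `inline' definition does not by itself emit an
   external symbol for the function. */
inline int
twice (int arg)
{
  return 2 * arg;
}

/* --- would live in exactly one .c file, in the role of inline.c --- */
/* Because this declaration says `extern', this translation unit
   provides the single external definition of the function. */
extern inline int twice (int arg);

/* --- any caller --- */
/* If the call is not inlined (for example, when compiling for
   debugging), it refers to the one external definition. */
int
call_twice (int arg)
{
  return twice (arg);
}
```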

   To get started debugging XEmacs, take a look at the `.gdbinit' and
`.dbxrc' files in the `src' directory.  See the section in the XEmacs
FAQ on How to Debug an XEmacs problem with a debugger.

   After making source code changes, run `make check' to ensure that
you haven't introduced any regressions.  If you want to make XEmacs more
reliable, please improve the test suite in `tests/automated'.

   Did you make sure you didn't introduce any new compiler warnings?

   Before submitting a patch, please try compiling at least once with

     configure --with-mule --with-union-type --error-checking=all

   Here are things to know when you create a new source file:

   * All `.c' files should `#include <config.h>' first.  Almost all
     `.c' files should `#include "lisp.h"' second.

   * Generated header files should be included using the `#include
     <...>' syntax, not the `#include "..."' syntax.  The generated
     headers are:

     `config.h sheap-adjust.h paths.h Emacs.ad.h'

     The basic rule is that you should assume that builds use `--srcdir',
     so the `#include <...>' syntax needs to be used when the
     to-be-included generated file is in a potentially different
     directory _at compile time_.  The non-obvious C rule is that
     `#include "..."' means to search for the included file in the same
     directory as the including file, _not_ in the current directory.

   * Header files should _not_ include `<config.h>' and `"lisp.h"'.  It
     is the responsibility of the `.c' files that use them to do so.

   Here is a checklist of things to do when creating a new lisp object
type named FOO:

  1. create FOO.h

  2. create FOO.c

  3. add definitions of `syms_of_FOO', etc. to `FOO.c'

  4. add declarations of `syms_of_FOO', etc. to `symsinit.h'

  5. add calls to `syms_of_FOO', etc. to `emacs.c'

  6. add definitions of macros like `CHECK_FOO' and `FOOP' to `FOO.h'

  7. add the new type index to `enum lrecord_type'

  8. add a DEFINE_LRECORD_IMPLEMENTATION call to `FOO.c'

  9. add an INIT_LRECORD_IMPLEMENTATION call to `syms_of_FOO' in `FOO.c'
