On Thu, Jan 14, 1999 at 02:03:31PM -0800, J. Kean Johnston wrote:
On Thu, Jan 14, 1999 at 06:23:06AM +0100, Olivier Galibert wrote:
> 1- Modules are slow.
>
> Modules must be compiled as position-independent code in order to be
> loadable dynamically. This is particularly bad on x86 where it hogs
> one register for that and hence forces the compiler to generate much
> less efficient code. I plan to fix that for linux in the future, but
> it will be a 2.3 project, and linux isn't the only OS out there.
With respect, I suggest that you re-examine the ABI, and especially
how trampolines and shared loading work. Linux isn't a reference
platform for PIC code, SVR4 is. If Linux handles trampolines incorrectly,
that's Linux's problem. There is NO reason that it should be slower than
other code.
With respect, it is well known that PIC code is slower than non-PIC
code on most architectures. I think the only exception may be MIPS,
mostly because SGI decided ages ago that on IRIX "all code must be
PIC". Ask your compiler-writer friends.
If this were true, hello.c would be slow, every single X
application (not to mention the X server) would be slow etc, as they
all use shared libraries (libc.so, libX11.so etc).
So? LibX11 _is_ slow, Motif crawls, and gtk+ isn't much better. The
slowdown due to the PIC code is hidden by the bigger slowdown caused
by the X protocol. Sirius company maybe?
Did you know that the huge slowdown of INN 2.1 (or maybe 2.0, I don't
remember exactly) on Linux was due to the default compilation mode
being changed to a shared library?
All of those shared
libraries are compiled in PIC code. Do you REALLY think that PIC is that
bad? Register spilling in a compiler, and the misuse of registers, is
not an "order of magnitude" difference.
On the register-starved architecture that is the x86 and at least with
gcc, it is. Most interesting addressing modes aren't PC-relative either.
Quite often you have to examine
individual instruction cycles to get down to microsecond timing
differences. It's not that big of a deal, and if trampolines are done
properly, there is a one-time overhead on load. Thereafter, the addresses
are bound. Problem solved.
1- You won't be using trampolines but function pointers, because it is
dynamically loaded modules we're talking about, and not shared
libraries. OTOH, direct calls through function pointers are faster.
2- What we're talking about is function call (trampoline or indirect)
vs. _inline_. The 30% slowdown of MULE is mostly due to the fact that
moving to the surrounding characters is more complex than p++ and p--,
and you would want to change that to a _function_call_?
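
To make the inline-versus-call point concrete, here is a minimal sketch of
what "moving to the next character" can look like in a single-byte build and
in a multibyte build. The byte layout is an assumption chosen for
illustration (UTF-8 style continuation bytes), not the actual MULE internal
format, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encoding, for illustration only: bytes of the form 10xxxxxx are
   continuation bytes, every other byte starts a character. */
#define IS_TRAILING_BYTE(b) (((unsigned char)(b) & 0xC0) == 0x80)

/* Single-byte build: the next character is simply p + 1. */
static inline const char *next_char_unibyte(const char *p)
{
    return p + 1;
}

/* Multibyte build: skip the lead byte, then every continuation byte. */
static inline const char *next_char_multibyte(const char *p)
{
    do {
        p++;
    } while (IS_TRAILING_BYTE(*p));
    return p;
}

/* Multibyte build: step back over continuation bytes to the lead byte. */
static inline const char *prev_char_multibyte(const char *p)
{
    do {
        p--;
    } while (IS_TRAILING_BYTE(*p));
    return p;
}

/* Count the characters in a NUL-terminated buffer with the multibyte
   stepper. */
static size_t count_chars(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (*s != '\0') {
        s = next_char_multibyte(s);
        n++;
    }
    return n;
}
```

Even inlined, the multibyte steppers are a loop with a test per byte rather
than a single increment. Routing them through a module's function pointer
would add a call on top of that for every character moved over.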
> The other problem is how to install two different versions of XEmacs
> simultaneously. And no, I do not mean two different version numbers.
> On IRIX I usually have 4 or 5 differently compiled XEmacs executables
> (with/without debugging, with/without mule, with/without gung-ho, and
> my personally hacked versions). All the lisp and lib-src files are
> shared between all of them.
Modules are by their very definition architecture dependent, and as
I said above, the architecture has the XEmacs version as a function of
it. Aside from that, there is still a good chance that modules will load
from version to version, and this will remain true unless there are very
fundamental changes within XEmacs, which is what the internal module
version number is for. There is NO argument that modules will take a little
bit of extra care if you really want to share multiple versions of XEmacs,
but let's face it, the only people that do that in the REAL world are XEmacs
developers, and they can cope with the small overhead easily.
A lot of people are installing both the MULE and non-MULE versions of
XEmacs on the same system. MULE and non-MULE modules won't be
compatible at all.
> 3- Modules and undumping aren't really compatible
>
> Undumping requires loading a lot of lisp. Said lisp uses most of the
> functions that would end up in modules, hence all the modules will
> have to be loaded. Worse, the data and bss sections of these modules
> will have references to the lisp code. Do we want to write a module
> undumper?
Eh? I don't understand this objection at all. Why do you want to undump
something that is dynamic? Please expand on this more so that I can
understand the objection / limitation.
XEmacs isn't an editor written in C. XEmacs is written in emacs lisp.
The C part consists of:
- the core lisp engine
- all the basic tasks involving the environment (accessing files,
drawing to screen...)
- some functions that could have been written in lisp, but would have
been too slow as a result
Part of the build process of XEmacs is called undumping. A
bunch of lisp is loaded which "plugs" into the C level through static
variables (the staticpro'ed ones) and then a new executable is created
which includes said lisp.
If you use modules for parts of the engine, you will need to find some
way to link back said static variables to the lisp code. I wouldn't
even know where to start.
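
As a rough illustration of the kind of link involved, here is a minimal
sketch of a staticpro-style registry: C code records the address of each
static variable that holds a Lisp object. The type, the table, and the
function names are hypothetical stand-ins, not the real XEmacs
implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef void *Lisp_Object;          /* hypothetical stand-in type */

#define MAX_STATICPRO 64
static Lisp_Object *staticpro_table[MAX_STATICPRO];
static size_t staticpro_count = 0;

/* Record the address of a C variable that holds a Lisp object. */
static void staticpro_sketch(Lisp_Object *var)
{
    assert(staticpro_count < MAX_STATICPRO);
    staticpro_table[staticpro_count++] = var;
}

/* A C static that the preloaded lisp "plugs" into. */
static Lisp_Object Vexample_hook;

/* Called once at startup, before any lisp is loaded. */
static void init_example(void)
{
    staticpro_sketch(&Vexample_hook);
}
```

The table stores raw addresses. If `Vexample_hook` lived in a module's data
or bss section instead of the main executable, the recorded address would
only be meaningful while that module is mapped at the same place, which is
the difficulty described above.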
> 4- Bloat is in the eye of the beholder
>
> Sorry to say that, but "I don't buy the argument that everything that
> is unused is swapped either" shows a lack of understanding of modern
> VMs. Unused code or data isn't swapped, it isn't loaded at all. And
> code pages or unmodified data pages aren't swapped but discarded.
Sir, I understand modern VMs very well. I have helped write one of the
most sophisticated and modern ones around (except for MPE - that's still
the most kick-ass OS I have ever seen). Yes, the VM will swap out pages
when it needs to, but there's the rub, do we REALLY want to force the VM
into that state if we don't need to? Netscape does the Wrong Thing by
having such a monolithic binary, and XEmacs is headed the same way. The
entire UNIX philosophy is to keep things small, and grow on demand, not
start huge and discard on demand.
Sir, if your VM is loading the full executable into memory at the start
and then discarding what is unused, you have a problem there. Also, I am
sure that you know the not so subtle difference between "swapping out"
and "discarding", so it would be nicer if you stopped using them
interchangeably. Stop writing as if you don't know what you're talking
about, and I won't consider that it is the case anymore.
For the people out there who don't understand modern VMs very well,
let me outline two scenarios where a user wants to use a DB function
for the first time:
a- monolithic executable:
  - the function is called
  - page fault, the code is loaded in memory. A bit of read-ahead of
    the code is done before and after the function
  - the function runs
b- module
  - the program sees that the function is not in memory
  - the module is found and the function table is loaded (but not the
    function code itself, usually)
  - the function pointers are updated
  - the function is eventually called. Go to a.
Guess which is fastest?
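
The two scenarios can be sketched in C as follows. This is a simulation
under stated assumptions: the "module" is an ordinary function in the same
file, `load_db_module` stands in for whatever would locate the module and
read its function table, and every name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*db_open_fn)(const char *path);

static int module_loads = 0;   /* how many times the "module" was loaded */

/* Stand-in for code that would live in the executable or in the module. */
static int db_open_impl(const char *path)
{
    return path != NULL ? 0 : -1;
}

static int db_open_stub(const char *path);

/* Function table entry; it starts out pointing at the stub. */
static db_open_fn db_open_ptr = db_open_stub;

/* Stand-in for finding the module and loading its function table. */
static db_open_fn load_db_module(void)
{
    module_loads++;
    return db_open_impl;
}

/* First call in the module case: load, update the pointer, then call. */
static int db_open_stub(const char *path)
{
    db_open_ptr = load_db_module();
    return db_open_ptr(path);
}

/* Scenario a: the call goes straight to the code. */
static int db_open_monolithic(const char *path)
{
    return db_open_impl(path);
}

/* Scenario b: every call goes through the function pointer. */
static int db_open_modular(const char *path)
{
    assert(db_open_ptr != NULL);
    return db_open_ptr(path);
}
```

In both cases the function's code still has to be paged in on first use. The
module path adds the lookup, the table load, and an indirect call in front
of that.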
OG.