Mule performance penalty

Wednesday, 9 September 1998

        Hrvoje had trouble believing that the Mule penalty is only 30%.

Obviously it depends on what you're doing.  

There will be places in the sources where switching from non-mule to
mule will change an order-N algorithm to an order-N*N algorithm.  I
fixed such an algorithm in casefiddle.c.  I am sure there are others.
It is quite likely that using a fixed-width internal representation
for characters instead of the current variable-width one is the right
long-term strategy.
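
To illustrate how this kind of order-N to order-N*N change can happen, here is a minimal sketch in Python. It is not XEmacs code. It assumes UTF-8 as a stand-in for a variable-width representation (Mule's internal encoding is different), and the function names are hypothetical. It counts how many bytes are examined when each character is found by index, compared with walking the buffer once.

```python
def char_to_byte_index(buf, char_index):
    """Return (byte offset of character number char_index, bytes examined).

    With a variable-width encoding the offset cannot be computed from
    the index alone, so we scan from the start of the buffer.
    Assumption: buf is valid UTF-8, used here only as an example encoding.
    """
    pos = 0
    steps = 0
    for _ in range(char_index):
        pos += 1
        steps += 1
        # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
        while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
            pos += 1
            steps += 1
    return pos, steps


def visit_by_index(buf, nchars):
    """Look up every character by index: order N*N bytes examined."""
    total = 0
    for i in range(nchars):
        _, steps = char_to_byte_index(buf, i)
        total += steps
    return total


def visit_sequentially(buf):
    """Walk the buffer once, character by character: order N bytes examined."""
    pos = 0
    steps = 0
    while pos < len(buf):
        pos += 1
        steps += 1
        while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
            pos += 1
            steps += 1
    return steps


if __name__ == "__main__":
    text = "\u00e9" * 100            # 100 two-byte characters
    buf = text.encode("utf-8")
    print("by index:    ", visit_by_index(buf, len(text)))
    print("sequentially:", visit_sequentially(buf))
```

With a fixed-width representation the byte offset is simply the index times the character width, so each lookup is constant-time and the indexed loop stays order-N.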

Here are some benchmark results:

Pentium Pro, pgcc -mcpu=pentiumpro -march=pentiumpro -O6
     -fno-risc -fno-peep-spills -fno-omit-frame-pointer -fno-exceptions

           bytecomp    font-lock
---------------------------------------
Latin-1      42.8        40.3
Mule         75.5        43.7
Latin-1-M    31.4        40.5
Mule-M       38.4        41.6

Ultrasparc, Sun cc -fast -xO5

           bytecomp    font-lock
---------------------------------------
Latin-1      31.0        30.9
Mule         51.8        32.5
Latin-1-M    21.7        31.2
Mule-M       25.9        34.3

Ultrasparc, egcs -O3 -mcpu=ultrasparc

           bytecomp    font-lock
---------------------------------------
Latin-1      31.7        26.5
Mule         55.5        28.0
Latin-1-M    22.0        25.7
Mule-M       25.7        27.5

The bytecomp benchmark is byte-compiling simple.el 20 times.
The font-lock benchmark is running font-lock-fontify-buffer on redisplay.c 3 times.
The -M suffix means `hacked by Martin'.
Smaller numbers are better.
Only relative values count.
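
Since only relative values count, one way to read the tables is as the Mule time relative to the Latin-1 time for each configuration. The following is a small sketch in Python using the numbers from the tables above; the names RESULTS and penalty_percent are illustrative only.

```python
# Timings copied from the tables above, as (bytecomp, font-lock).
RESULTS = {
    "Pentium Pro, pgcc": {
        "Latin-1": (42.8, 40.3), "Mule": (75.5, 43.7),
        "Latin-1-M": (31.4, 40.5), "Mule-M": (38.4, 41.6),
    },
    "Ultrasparc, Sun cc": {
        "Latin-1": (31.0, 30.9), "Mule": (51.8, 32.5),
        "Latin-1-M": (21.7, 31.2), "Mule-M": (25.9, 34.3),
    },
    "Ultrasparc, egcs": {
        "Latin-1": (31.7, 26.5), "Mule": (55.5, 28.0),
        "Latin-1-M": (22.0, 25.7), "Mule-M": (25.7, 27.5),
    },
}


def penalty_percent(mule, latin1):
    """Extra time taken by the Mule build, as a percentage of the Latin-1 time."""
    return (mule / latin1 - 1.0) * 100.0


if __name__ == "__main__":
    for machine, rows in RESULTS.items():
        for i, bench in enumerate(("bytecomp", "font-lock")):
            stock = penalty_percent(rows["Mule"][i], rows["Latin-1"][i])
            hacked = penalty_percent(rows["Mule-M"][i], rows["Latin-1-M"][i])
            print(f"{machine:20s} {bench:10s} "
                  f"stock {stock:5.1f}%   -M {hacked:5.1f}%")
```

For the Pentium Pro bytecomp column, for example, this gives a penalty of about 76% for the stock builds and about 22% for the -M builds, while the stock font-lock penalty on the same machine is about 8%.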

So there may be a significant Mule performance penalty, or maybe not,
depending on what you use XEmacs for.

There are some interesting conclusions from the above:

- Sun cc wins on some benchmarks, egcs on others.

- The changes I have made to speed up bytecomp actually slow down
  font-lock a little, but only when using Sun cc (??).  It is
  fortuitous that this accidental change worked this way, since we
  ought to be targeting x86 and egcs as the architecture and compiler
  that more of our users will be using.

- Benchmarking is hard.

I've been concentrating lately on speeding up the bytecomp benchmark,
since it is one of the purest tests of the Lisp engine.

Martin
