I have been looking back at the recent discussions about the
slowness of font-lock and syntax properties, and I
didn't see any conclusion being reached. If one was, I missed it,
and this will hopefully still be a decent summary.
The evidence:
[Eparnaud]
1. Load the TraverseSchema.java file from Xerces that Stefan
Eparnaud posted to the list in xemacs-vanilla
2. M-x font-lock-fontify-buffer
3. M-x goto-char 413514
4. C-j
This reindents a function argument spilled over to the next line.
On my box (with some minor improvements mentioned below):
lookup-syntax-properties == t : 13.5 seconds
lookup-syntax-properties == nil : 1.4 seconds
[James]
Jerry James reports slow font-locking of a file of his. From the
profile he posted:
4 sys_re_match_2 <cycle 2> [1860]
6 scan_words <cycle 2> [1773]
22536 sys_re_search_2 <cycle 2> [162]
39182 find_end_of_comment <cycle 2> [179]
101547 scan_sexps_forward <cycle 2> [147]
14448488 re_match_2_internal <cycle 2> [8]
36.3 0.59 5.69 14611763 update_syntax_cache <cycle 2> [5]
0.04 2.78 819566/819566 Fprevious_extent_change [11]
0.05 2.67 819566/819566 Fnext_extent_change [12]
0.09 0.00 819566/819616 Fsyntax_table_p [67]
0.06 0.00 2431641/2441538 make_buffer [74]
819566 Fget_char_property <cycle 2> [64]
update_syntax_cache is called a whopping 14 million times from
re_match_2_internal. Presumably it is backtracking a regexp with
some syntax-class matching in it.
update_syntax_cache basically:
- Computes the Qsyntax_table property using Fget_char_property
- Estimates the boundaries of the region where this table is valid
using Fprevious_extent_change and Fnext_extent_change
(note that this is a vast underestimate; most of the time the
true valid region is the whole buffer, (point-min) to (point-max)).
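For concreteness, the logic amounts to something like the following. This is a toy Python model of the C code, not the actual implementation; extents are modelled as simple (start, end, properties) triples, and the function names mirror the C ones only loosely:

```python
# Toy model of update_syntax_cache. Positions are 0-based, ends exclusive.
# An "extent" here is just a (start, end, props) triple.

def get_char_property(extents, pos, prop):
    """Model of Fget_char_property: value of `prop` at `pos`."""
    for start, end, props in extents:
        if start <= pos < end and prop in props:
            return props[prop]
    return None

def extent_changes(extents, buffer_end):
    """All positions where some extent starts or ends."""
    changes = {0, buffer_end}
    for start, end, _ in extents:
        changes.update((start, end))
    return sorted(changes)

def update_syntax_cache(extents, pos, buffer_end):
    """Recompute the cached value and its validity window.  The window is
    bounded by the nearest extent changes on either side, which vastly
    underestimates the true validity region: ANY extent boundary narrows
    it, even one carrying no syntax-table property at all."""
    value = get_char_property(extents, pos, 'syntax-table')
    changes = extent_changes(extents, buffer_end)
    prev = max(c for c in changes if c <= pos)   # Fprevious_extent_change
    nxt = min(c for c in changes if c > pos)     # Fnext_extent_change
    return {'value': value, 'start': prev, 'end': nxt}
```

For example, with an unrelated font-lock extent over (10, 20) in a 1000-character buffer, a lookup at position 15 gets the narrow window (10, 20) even though the syntax-table property is nil over the whole buffer.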
I have got a patch that cleans things up a bit:
- removes some duplication from the UPDATE_SYNTAX_CACHE macros
- checks for the valid region in the macros to avoid the function call
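The effect of that macro-level check can be modelled like this (again a Python sketch; the class and its fixed-width window are my invention, standing in for the inline range test the macros would do before calling the C function):

```python
# Sketch of the UPDATE_SYNTAX_CACHE fast path: only fall into the
# (expensive) update function when pos leaves the cached window.

class SyntaxCache:
    def __init__(self):
        self.start = self.end = -1   # empty window: first lookup misses
        self.value = None
        self.updates = 0             # how often the slow path ran

    def _update(self, pos):
        # Stand-in for update_syntax_cache(); here the new validity
        # window is simply a fixed-width run starting at pos.
        self.updates += 1
        self.start, self.end = pos, pos + 100
        self.value = 'word'          # dummy syntax class

    def lookup(self, pos):
        # The macro's inline check: no function call inside the window.
        if not (self.start <= pos < self.end):
            self._update(pos)
        return self.value
```

With this check, scanning 1000 consecutive positions triggers only 10 slow-path updates instead of 1000 function calls.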
Some other optimizations that could be made:
- Use the lower-level functions directly, avoiding needless
make_buffer calls (I have already removed most of those), etc.
- Most users either deal in Byteind's (regex.c) or have them readily
available (everybody else uses BUF_FETCH_CHAR close by). So it
would avoid a lot of useless conversion if the cache boundaries
were Byteind's.
However, that is all just tweaking; to speed this up, more
fundamental changes are needed. Any ideas?
We could:
1. make it a real cache, i.e. keep an array of (consecutive) positions
and their syntax table property values.
2. Look into speeding up the extent-manipulation routines, maybe
providing a more specialized version for the syntax cache, caching some
of the soe data etc., the same as is done for redisplay.
3. If I recall correctly, syntax-property extents cannot overlap;
therefore the boundaries of the extent where we found the current
property value are good values for the cache boundaries, and much more
likely to be bigger (possibly even the whole buffer). It would be a
pity because it would make this not a real extent property.
4. Maybe it is worth searching more aggressively for the real
boundary with a specialized extent walker. That could be slightly
slower per call, but if it could reduce the 900000 calls to the update
routines to just a few by making the cache work better, it would pay
off quickly.
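Idea 1, for instance, could look roughly like this (a Python sketch under assumed data structures; the real thing would be C, and one extent walk would fill the whole run rather than the per-position lookup used here for brevity):

```python
# Sketch of idea 1: cache syntax-table values for a run of consecutive
# positions, refilling the array once per run instead of consulting the
# extent machinery once per position.

class RunCache:
    def __init__(self, lookup, run_length=512):
        self.lookup = lookup          # the expensive per-position lookup
        self.run_length = run_length
        self.base = -1
        self.values = []
        self.fills = 0                # how many times we refilled

    def value_at(self, pos):
        if not (self.base <= pos < self.base + len(self.values)):
            # Miss: refill the array starting at pos.
            self.fills += 1
            self.base = pos
            self.values = [self.lookup(p)
                           for p in range(pos, pos + self.run_length)]
        return self.values[pos - self.base]
```

A forward scan over 1024 positions then refills only twice, so even an expensive fill amortizes well for the sequential access pattern regexp matching produces.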
Note that a simple
(defun walk-extents ()
  (interactive)
  (let ((pos (point))
        (no 0))
    (while (not (eq pos (point-max)))
      (setq pos (next-extent-change pos))
      (incf no))
    (message "%s" no)))
walks TraverseSchema.java's 22000 extents in just 5 ms!
Jan