Pardon the stream-of-consciousness post.
-- Hrvoje Niksic <hniksic(a)arsdigita.com> spake thusly:
Matt Tucker <tuck(a)whistlingfish.net> writes:
> Could someone detail the problem for me? I think I must have missed
> the discussion leading up to this.
There was no preceding discussion; Stephen simply mentioned it, and
everyone knows what it's about anyway. OK, just kidding. :-)
The problem is pretty much described in the subject: an open
parenthesis in the first column makes the XEmacs syntax analyzer think
that it's the beginning of a defun. This fools the indentation
functions, font-lock, etc.
The results are disastrous; take this Tcl code:
doc_body_append "<form method=GET action=foo>
[export_form_vars foo bar baz]
Please write your name: <input type=text name=person_name>
"
Because the square bracket that precedes `export_form_vars' sits in
the first column, font-lock will think that a defun (?!) begins here,
and the rest of
the file will be colored as a string, except for the actual strings,
which will be colored as code. Lovely to behold.
I'll leave the explanation of the internals to someone who understands
them (Jan?), which is why I asked the question anyway.
I haven't researched this fully, but I suspect this bug is in the way
font-lock.c operates. However, in addition to porting the syntax-table
code from GNU Emacs, I also ported
`font-lock-fontify-syntactically-region' to use `parse-partial-sexp'
instead of `syntactically-sectionize' (the font-lock.c method of
syntaxification).
So if I'm correct, this bug didn't exist on GNU Emacs because they
don't have font-lock.c, and this bug is now fixed because we're not
using it. I copied your sample code twice into an empty tcl-mode buffer and
did a `font-lock-fontify-buffer', and it fontified correctly.
I also tried the following in the scratch buffer:
---
(defun emacs-lisp-byte-compile ()
  "Byte compile
(the file containing the current buffer."
  (interactive)
  (if buffer-file-name
      ;; XEmacs change. Force buffer save first
      (progn
        (save-buffer)
        (byte-compile-file buffer-file-name))
    (error "The buffer must be saved in a file first.")))
---
And the '(' I stuck in the docstring didn't seem to screw up
fontification. However, indenting still seems to be somewhat hosed
(`indent-sexp' works properly except for the comment, but
`indent-for-tab-command' doesn't), and fontification doesn't happen
properly if a line is changed, although a fontify-buffer will fix it.
So the bug appears to be half-fixed, and the main problem appears to be
`beginning-of-defun', which `syntactically-sectionize' uses to set up
its caches, and which is incorrectly reporting the embedded parenthesis
as the start of a defun.
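To make the failure concrete, here is roughly what the column-zero
heuristic boils down to. This is only a sketch under my own
assumptions, not the actual `beginning-of-defun' source, and
`my-naive-defun-starts' is a made-up name:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-naive-defun-starts ()
  "Return the positions of every open paren found in column zero.
No check is made for strings or comments, which is the heart of the bug."
  (save-excursion
    (goto-char (point-min))
    (let ((starts nil))
      ;; "^\\s(" matches any character with open-paren syntax at the
      ;; beginning of a line.
      (while (re-search-forward "^\\s(" nil t)
        (setq starts (cons (match-beginning 0) starts)))
      (nreverse starts))))
```

Run on the sample defun above, a search like this reports two
positions: the real `(defun' and the paren inside the docstring.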
For what it's worth, `beginning-of-defun' is broken on GNU Emacs too,
and indenting is broken there in the same way, but their fontification
is not. I suspect the reason fontification works is that they use a
different method of refontifying which hides the problem.
I can research this a bit further and try to fix it. I suspect that
there are at least a couple of solutions. One of them would be to apply
custom syntax tables to all strings to mark opening parentheses as
punctuation, but that seems like overkill. It might also be possible to
fix `beginning-of-defun' to use `parse-partial-sexp' instead of just
doing a regex search.
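A minimal sketch of that second idea, assuming it is acceptable to
parse from the top of the buffer each time. The function name is
hypothetical and this is not a patch against the real
`beginning-of-defun':

```elisp
;; Hypothetical replacement, for illustration only.
(defun my-beginning-of-defun-checked ()
  "Move back to the nearest column-zero open paren that is real code.
Return t if one was found, nil otherwise."
  (interactive)
  (let ((found nil))
    (while (and (not found)
                (re-search-backward "^\\s(" nil 'move))
      ;; Parse from the start of the buffer up to the candidate.
      ;; Element 3 of the state is non-nil inside a string,
      ;; element 4 is non-nil inside a comment.
      (let ((state (save-excursion
                     (parse-partial-sexp (point-min) (point)))))
        (if (not (or (nth 3 state) (nth 4 state)))
            (setq found t))))
    found))
```

This rescans from the top of the buffer for every candidate, so it
would be too slow for real use without some kind of cache.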
I've also been wanting to redo lazy-lock-mode to bring it more in line
with GNU Emacs', which is more sophisticated and has more elegant
behavior. I realize there are some issues with XEmacs not having the
same hooks that GNU Emacs' lazy-lock uses, but I think those issues can
be resolved. I also think that if proper behavior can be achieved, it
might be worth rolling the lazy-lock functionality into font-lock-mode,
as long as there's no opposition to it.