Starting with 21.4.3, XEmacs has been sort of livelocking on Irix 6.5
(the process is running, but X events just disappear into the ether).
At first it happened every few weeks, but recently, i.e. with 21.4.8
and 21.4.9, I've been hitting it pretty reliably several times a day,
usually when I'm doing something that involves reading from a file or
a pipe.

So I decided to dig a bit to see what's going on. The symptoms are
that all of a sudden all frames stop responding to X events. If there
is a TTY frame, it stops responding to the keyboard as well. dbx
always showed the process in select, so I was blaming select, or
rather a stuffed fdset, for my problems. After some more
investigation, I found that select actually works just fine (I can
intercept select from libc on Irix since it's a weak symbol); it's
XEmacs that is doing something funny. To reproduce the problem, I
wrote a bit of lisp which calls Darrel's xcscope in a tight loop:
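
    (setq l '("Select" "nohide_traverse" "vnode_t" "exportinto" "mountinfo"))

    ;; The defun and while lines were missing from the posted snippet;
    ;; they are reconstructed here from the paren structure, and the
    ;; function name is a placeholder.
    (defun cscope-livelock-test (symbol)
      (interactive (list (cscope-prompt-for-symbol "Find")))
      (let ((ll l))
        (setq b (current-buffer))
        ;; Walk the symbol list, starting one cscope search per entry.
        (while ll
          (setq symbol (car ll))
          (cscope-call (format "Finding symbol: %s" symbol)
                       (list "-1" symbol))
          (setq ll (cdr ll)))))
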
By evaluating that, I can get to a livelock on an X display in a
matter of minutes (sometimes seconds, depending on how lucky I am).
Yet at the same time my select intercept still runs: if I move the
mouse over a blanked frame, click on the frame or expose it, the
event arrives, select returns, and then select is called again to
wait for the next event.
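
For anyone who wants to watch select the same way, here is a minimal
sketch of the kind of tracing described above. It is not the intercept
I actually used: instead of overriding the weak libc symbol, it is a
plainly named wrapper (traced_select, with a traced_select_calls
counter, both names made up for illustration) that a caller would
invoke in place of select, so it should build on any POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical counter: how many times the wrapper has been entered. */
static unsigned long traced_select_calls = 0;

/* Count descriptors set in `set` among 0..nfds-1; NULL counts as 0. */
static int count_fds(int nfds, fd_set *set)
{
    int fd, n = 0;

    if (set == NULL)
        return 0;
    for (fd = 0; fd < nfds; fd++)
        if (FD_ISSET(fd, set))
            n++;
    return n;
}

/* Illustrative wrapper: log each call, then forward to the real select. */
int traced_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds,
                  struct timeval *timeout)
{
    int watched = count_fds(nfds, rfds); /* read fds asked for, before the call */
    int ready, saved_errno;

    traced_select_calls++;
    ready = select(nfds, rfds, wfds, efds, timeout);
    saved_errno = errno;                 /* fprintf may clobber errno */

    fprintf(stderr, "select #%lu: nfds=%d read fds watched=%d ready=%d\n",
            traced_select_calls, nfds, watched, ready);

    errno = saved_errno;
    return ready;
}
```

If the trace keeps showing select returning with ready descriptors
while the frames stay dead, the events are being lost after select,
which is what I see here.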

Nothing like this happens in a TTY-only run, which suggests that
there is memory corruption somewhere in the X layer.
There is secondary evidence for this theory of memory corruption in
X: I cannot start XEmacs on X if I compile with GNU malloc, whereas
compiling with the standard malloc allows the X session to start.

Since I know next to nothing about the XEmacs core, I need guidance
in tracking it down. I can help, test, make suggestions etc. in
whatever form I can.