Michael Toomim <toomim(a)cs.berkeley.edu> writes:
This is bad for me. Just because an X client loses connection to a
server on which it is hosting a frame/window, does that mean that
the client has to totally pork?
Apparently the X/Xt libraries make it very hard to handle that situation
sanely. When the connection to the X server gets lost, our callback
function `x_IO_error_handler' is called. According to a comment in
the source, "we should not return from this function, or Xlib might
just decide to exit()."
So what XEmacs does instead is longjmp to top level, after first
marking the device as "being deleted" (i.e. unusable). Any code that
can run between that point and the point when the device is actually
deleted is supposed to check DEVICE_X_BEING_DELETED().
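
To make the pattern concrete, here is a minimal sketch in plain C. It
is not XEmacs code and does not link against Xlib: `struct sim_device',
`SIM_DEVICE_BEING_DELETED', `sim_io_error_handler' and `run_command' are
all hypothetical stand-ins, and the lost connection is simulated by a
flag. In real code the handler would be installed with Xlib's
XSetIOErrorHandler().

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for an X device; not the real XEmacs struct. */
struct sim_device {
    const char *name;
    int being_deleted;  /* set as soon as the connection is known dead */
    int deleted;        /* set once the device has really been torn down */
};

/* Stand-in for the DEVICE_X_BEING_DELETED() check. */
#define SIM_DEVICE_BEING_DELETED(d) ((d)->being_deleted)

static jmp_buf top_level;  /* hypothetical top-level jump target */

/* Simulated I/O error handler: flag the device, then never return. */
static void sim_io_error_handler(struct sim_device *d)
{
    d->being_deleted = 1;
    longjmp(top_level, 1);
}

/* Any code that touches the device checks the flag first. */
static int redisplay_device(struct sim_device *d)
{
    if (SIM_DEVICE_BEING_DELETED(d))
        return -1;  /* refuse to talk to a dead connection */
    return 0;
}

/* Run one command. Returns 0 on a normal run, 1 if an I/O error
   threw us back to top level and the device was deleted there. */
int run_command(struct sim_device *d, int connection_lost)
{
    if (setjmp(top_level) != 0) {
        /* Back at top level: the handler must have flagged the device. */
        assert(SIM_DEVICE_BEING_DELETED(d));
        d->deleted = 1;
        return 1;
    }
    if (connection_lost)
        sim_io_error_handler(d);  /* does not return */
    return redisplay_device(d);
}
```

The weak spot is visible even in the sketch: every function that can
run before `deleted' is set needs its own check, and one missed check
is enough to touch the dead connection.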
I added such checks in several places I thought were suspicious,
which resolved the problem on my setup at the time. But others
kept reporting crashes, and I was unable to fix them.
Maybe you can figure out -- in a debugger -- exactly where and why
XEmacs dies following the loss of connection between one of its X
devices and the X server.
This is idle speculation, but I suspect the problem could be that it
is not sufficient to "enqueue a magic event" to get rid of the faulty
device. The "next event" action might read from the dead device and
die as a result. Perhaps a more aggressive action needs to be taken --
say, set a variable "dead_devices_exist_p" to 1, which makes the event
loop check the state of the devices *before* trying to read any
events from them.
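
A rough sketch of what I mean, again in plain C with no Xlib and no
real XEmacs internals. Only the name "dead_devices_exist_p" comes from
the suggestion above; `struct loop_device', `note_device_dead',
`delete_dead_devices' and `next_event' are hypothetical, and "reading
an event" is simulated by decrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device record for the sketch. */
struct loop_device {
    int id;
    int connection_dead;  /* set by the I/O error path */
    int deleted;          /* set once the event loop has dropped it */
    int pending_events;   /* simulated event queue length */
};

/* Nonzero while at least one dead device has not been cleaned up. */
static int dead_devices_exist_p;

/* Called from the (simulated) I/O error path. */
void note_device_dead(struct loop_device *d)
{
    d->connection_dead = 1;
    dead_devices_exist_p = 1;
}

/* Drop every dead device, then clear the global flag. */
static void delete_dead_devices(struct loop_device *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].connection_dead)
            devs[i].deleted = 1;
    dead_devices_exist_p = 0;
}

/* Return the id of the device an event was read from, or -1 if no
   live device has anything pending. */
int next_event(struct loop_device *devs, size_t n)
{
    assert(devs != NULL);

    /* Check device state *before* reading from any device. */
    if (dead_devices_exist_p)
        delete_dead_devices(devs, n);

    for (size_t i = 0; i < n; i++) {
        if (devs[i].deleted)
            continue;  /* never read from a deleted device */
        if (devs[i].pending_events > 0) {
            devs[i].pending_events--;
            return devs[i].id;
        }
    }
    return -1;
}
```

The point of the flag is that cleanup no longer depends on an event
being delivered: the loop purges dead devices on entry, so the read
path never sees them.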