From: Jan Vroonhof <vroonhof(a)math.ethz.ch>
Date: 05 Nov 1998 13:50:46 +0100
User-Agent: Gnus/5.07004 (Pterodactyl Gnus v0.40) XEmacs/21.0 (Pyrenean-pre6)
First of all, I'm really sorry I can't deal with this right now, but
my IP connection has been buggy for about a week. I hope to get it
back ASAP. Please DON'T attempt to fix this in official source
without me - thanks!
I have been at the XEmacs-with-gpm problem again (with a debugger
and a -g build of libgpm this time) and I found the problem. It is a
bug in the way libgpm tries to handle relaying to the old signal
handler.
I debugged against 1.13 (that's what I have at home) but the code
did not change in 1.14.
Note that even a simple program of the type
int main(int argc, char *argv[]) {
    /* setup conn struct */
    Gpm_Open(conn, 0);
    puts("Here we go..");
    killpg(0, SIGTSTP);
    puts("Back again");
    Gpm_Close();
}
will hang when run on a Linux console (and not on an Xterm).
Alternatively, just run the 'mev' example program on a Linux
console and send it a SIGTSTP and it will also hang.
No it won't, not on my system (ix86, kernel 2.0.35, libc 5.4.44). Do
you use glibc? That may be the root of the trouble. Otherwise I'd
suspect the way you have compiled libgpm (maybe you added -D_USE_BSD?)
The problem is here in (liblow.c):
> #if (defined(SIGTSTP)) /* itz: support for SIGTSTP */
First comment. Compare this handler with the way the SIGWINCH is
relayed just above.
> /* Old SIGTSTP handler. */
> static __sighandler_t gpm_saved_suspend_hook;
> static void gpm_suspend_hook (int signum)
Note. Signal handlers installed with 'signal' are called with the
signal that generated them blocked. It is then unblocked on exiting
the signal handler.
No. Under ix86 Linux up to and including 2.0 (don't know about 2.1,
but I won't be fixing gpm to be compatible with a development kernel),
the signal syscall has AT&T Unix semantics - no masking, no queueing,
handler reset to default on delivery. The kernel code for sys_signal
in linux/kernel/signal.c adds SA_NOMASK to the flags, check yourself.
> {
> Gpm_Connect gpm_connect;
> sigset_t old_sigset;
> sigset_t new_sigset;
> int success;
>
> sigemptyset (&new_sigset);
> sigaddset (&new_sigset, SIGTSTP);
> sigprocmask (SIG_BLOCK, &new_sigset, &old_sigset);
Not necessary; the signal is already blocked.
Wrong, see above.
> /* Open a completely transparent gpm connection */
> [code snipped]
>
> /* take the default action, whatever it is (probably a stop :) */
> sigprocmask (SIG_SETMASK, &old_sigset, 0);
> signal (SIGTSTP, gpm_saved_suspend_hook);
Reinstall old signal handler.
> kill (getpid (), SIGTSTP);
Send the signal to ourselves. The idea here is that the old signal
handler will now get called. However, as the signal is currently
blocked, it will get queued.
Wrong, see above.
[*] Suppose the old SIGTSTP handler was called. How are we sure we
get back here? What if the old handler longjmp's to somewhere?
That is true, it would screw us. But it would be a very unusual thing
to do, because idiomatic usage of TSTP is to prepare for the sleep
first by saving state, second by fixing terminal settings if necessary
(mostly return to cooked mode).
> } /*if*/
> }
Here the signal handler exits. SIGTSTP is unblocked and any queued
signals are delivered, including the one we sent ourselves. Bingo! We
now have a signal loop. The process will hang, sucking up 100% of
CPU.
See above.
> #endif /* SIGTSTP */
This code was obviously written for a system where signal handlers
were not called with the signal blocked. Did Linux change its
behaviour sometime in the past?
How to fix
1. Maybe explicitly unblocking the signal between reinstalling the
old handler and sending the signal works. [Last minute
addition. Yes this must work, it is the way Emacs relays
signals to their default handler].
OK, this should work even if your system's signal semantics are
different for some weird reason. I'll add it as soon as I can get
back. (Using sigaction with SA_NOMASK will have the same effect and
will make the intent a bit clearer).
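Fix [1] can be sketched roughly as follows. This is not the libgpm code: the handler takes whatever signal number it was installed for, so that it can be tried with a harmless signal, and the names relay_hook, install_relay_hook, saved_hook and count_handler are assumptions made for the example. count_handler stands in for whatever handler the application had installed earlier.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static void (*saved_hook)(int) = SIG_DFL;    /* handler found at install time */
static volatile sig_atomic_t hook_runs = 0;  /* times relay_hook was entered */
static volatile sig_atomic_t saved_runs = 0; /* times count_handler was entered */

/* Stand-in for an application handler that was installed before ours. */
static void count_handler(int signum)
{
    (void)signum;
    saved_runs++;
}

static void relay_hook(int signum)
{
    sigset_t one;

    hook_runs++;
    /* Library work to do before the old handler runs would go here. */

    signal(signum, saved_hook);            /* reinstall the old handler */
    sigemptyset(&one);
    sigaddset(&one, signum);
    sigprocmask(SIG_UNBLOCK, &one, NULL);  /* the explicit unblock of fix [1] */
    kill(getpid(), signum);                /* old handler or default action runs */

    /* Back again: library work after resumption, then re-arm ourselves. */
    signal(signum, relay_hook);
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int install_relay_hook(int signum)
{
    void (*old)(int) = signal(signum, relay_hook);

    if (old == SIG_ERR)
        return -1;
    saved_hook = old;
    return 0;
}
```

Because the signal is unblocked before kill(), the relayed signal is delivered inside relay_hook rather than queued until it returns, so the loop described above cannot form whichever signal() semantics are in effect.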
2. The SIGWINCH handler just above it simply calls the old signal
handler directly. That obviously works. May I suggest, however,
that a more complete prototype be used for the signal handler.
You'd need special cases for SIG_DFL and SIG_IGN. Ugly.
3. The signal handler is installed using a more general interface
than the BSD signal, which allows setting the handler type such
that the signal is NOT blocked by default. This however leads
to race conditions.
What's this BSD stuff?? Are we talking Linux or not? See above about
gpm compilation.
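For reference, the "more general interface" of [3] is sigaction. A minimal sketch, assuming the portable flag name SA_NODEFER (on Linux, SA_NOMASK is an older synonym for it); install_nodefer and nodefer_handler are names invented for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if the signal was masked while it ran, else 0. */
static volatile sig_atomic_t blocked_in_handler = -1;

static void nodefer_handler(int signum)
{
    sigset_t current;

    /* Query the mask in effect while the handler runs. */
    if (sigprocmask(SIG_BLOCK, NULL, &current) == 0)
        blocked_in_handler = sigismember(&current, signum);
}

/* Install the handler so that signum is NOT blocked during its execution. */
int install_nodefer(int signum)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = nodefer_handler;
    sigemptyset(&sa.sa_mask);     /* block nothing extra */
    sa.sa_flags = SA_NODEFER;     /* do not block signum itself either */
    return sigaction(signum, &sa, NULL);
}
```

With this flag the handler can be re-entered by a second delivery of the same signal while the first is still running, which is the kind of race item [3] warns about.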
5. Alternatively one can take the dual interface to 4. There is
a hook that says the process will take care of signal handling
itself, and libgpm will provide functions gpm_before_suspend and
gpm_after_suspend that the program must call.
This defeats the purpose of dealing with SIGTSTP in libgpm at all.
To behave correctly under suspension and resumption, ANY gpm client
MUST shadow its connection in its SIGTSTP handler, or else it will
interfere with other clients that run while it's sleeping. Originally
there was no mention of SIGTSTP in libgpm, I added it there when I
noticed that I had to replicate this same code in every client. Your
proposal [5] would turn the clock back to this situation (unless
combined with one of the other solutions).
I would like a combination of 1 and 5. With 5 being low priority so
it can wait till there actually is a client that needs it.
OK, agreed, I'll do [1] ASAP.
--
Ian T Zimmerman <itz(a)rahul.net>
I came to the conclusion that what was wrong about the guillotine
was that the condemned man had no chance at all, absolutely none.
Albert Camus, _The Outsider_