>>>> "Stephen" == Stephen J Turnbull <stephen(a)xemacs.org> writes:
>>>> "Jan" == Jan Rychter <jan(a)rychter.com> writes:
Jan> ... at which points XEmacs becomes totally unresponsive and stays
Jan> in that state forever (as far as I can tell). There is no other
Jan> network activity occurring.
Stephen> If this is XEmacs hanging, and not the system call failing to
Stephen> timeout, then I would say it's Gnus breakage (inflooping on an
Stephen> operation known to have failed), not XEmacs itself. OTOH,
Stephen> your resolver setup is apparently broken. It is supposed to
Stephen> read resolv.conf on every call AFAIK---changes to resolv.conf
Stephen> should take effect immediately. Anyway, they do for me.
Hmm, interesting. I had the impression that resolv.conf is only read at
libc initialization time, and never again.
Of course I do agree that my resolver setup is broken if resolv.conf
contains bogus nameservers.
Also, to answer Glynn Clements -- I've checked, and 'ping' and 'telnet'
do not hang forever with a bogus resolv.conf. Both time out. XEmacs is
the only application I've seen on my system that just hangs forever.
Stephen> One thing you can do is to use a caching nameserver locally.
Stephen> You don't need bind, just something small like pdnsd.
Yes, I should probably do that. I remember I killed that setup when I
was trying to get reasonable battery life out of my laptop. I was hunting
around for applications doing gratuitous I/O without any need to do so
(rant: it seems many programmers live in the huge-noisy-always-on
machine world, and assume that they can and should read and write to the
filesystem anytime they want), and I seem to remember that the caching
nameserver was one of the applications writing to the hard drive quite
often.
I can of course try that, but perhaps this bug (if it is indeed a bug)
is worth fixing anyway. BTW, I am not entirely sure if a caching
nameserver will solve the problem -- if I'm not connected to the
network, XEmacs might get into the same state by never timing out.
Jan> What I don't understand is why the lookup doesn't just
Jan> fail. XEmacs shouldn't hang forever, no matter what happens.
Stephen> If Gnus gets into a tight loop, there's not much we can do
Stephen> about it. Maybe we can add some checks for QUIT so at least
Stephen> you can C-g out of it.
[...]
Hmm. What do you mean by a "tight loop"? Gnus seems to do it this way
(code snipped from inside a let* in nntp-open-connection from nntp.el):
  (condition-case ()
      (let ((coding-system-for-read nntp-coding-system-for-read)
            (coding-system-for-write nntp-coding-system-for-write))
        (funcall nntp-open-connection-function pbuffer))
    (error nil)
    (quit
     (message "Quit opening connection")
     (nntp-kill-buffer pbuffer)
     (signal 'quit nil)
     nil))
where nntp-open-connection-function is:
  (defun nntp-open-network-stream (buffer)
    (open-network-stream "nntpd" buffer nntp-address nntp-port-number))
and the top of my backtrace was indeed:
  open-network-stream-internal("nntpd" #<buffer " *server news-server.san.rr.com nntp *nntpd**"> "news-server.san.rr.com" "nntp" nil)
  # bind (coding-system-for-read coding-system-for-write cs-r cs-w protocol service host buffer name)
  open-network-stream("nntpd" #<buffer " *server news-server.san.rr.com nntp *nntpd**"> "news-server.san.rr.com" "nntp")
  # bind (buffer)
  nntp-open-network-stream(#<buffer " *server news-server.san.rr.com nntp *nntpd**">)
I have to say I don't see any loops here -- but I'm probably missing
something...
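To make concrete what I mean, here is a minimal sketch of the same
condition-case shape in isolation. The name my-try-open and its OPEN-FN
argument are made up for illustration and are not part of Gnus or
nntp.el. As far as I can tell, the opener is called exactly once, so the
only ways out are a normal return, an error, or a quit:

```elisp
;; Hypothetical helper, NOT part of Gnus: it only mirrors the
;; condition-case shape of the nntp.el fragment quoted above, with the
;; opener passed in as OPEN-FN instead of nntp-open-connection-function.
(defun my-try-open (open-fn)
  "Call OPEN-FN once; return its value, or nil if it signals an error."
  (condition-case ()
      (funcall open-fn)
    ;; Any ordinary error is swallowed and the whole form yields nil.
    (error nil)
    ;; C-g lands here: report it, then re-signal quit to the caller.
    (quit
     (message "Quit opening connection")
     (signal 'quit nil))))

;; Opener returns normally: its value is passed through.
(my-try-open (lambda () 'connected))

;; Opener signals an error: the error handler runs and the result is nil.
(my-try-open (lambda () (error "No route to host")))
```

If the opener blocks inside a primitive and neither returns nor signals,
then neither handler gets a chance to run, which (if my reading is right)
would look like a hang rather than a loop in Lisp code.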
many thanks for your help,
--Jan