Henry S. Thompson writes:
Kyle Jones <kyle_jones(a)wonderworks.com> writes:
> Henry S. Thompson writes:
> > I'm concerned that there's been no follow-up to either my [1] or a
> > similar [2] report of crashes resulting from stack overflows
> > underneath re_search_2. It would be nice if these could be fixed
> > before we declare 21.4 to be stable.
> >
> > Thanks
> >
> > ht
> >
> > [1] http://list-archive.xemacs.org/xemacs-beta/200206/msg00005.html
> > [2] http://list-archive.xemacs.org/xemacs-beta/200206/msg00191.html
>
> This looks like a stack overflow, something that is easily
> triggered by the regular expression search code and its use
> of alloca() to grab large blocks of memory off the stack. If
> you can increase your stacksize under Cygwin, try doubling it
> and see if it makes the problem go away. XEmacs could be
> made to do this for the user, or XEmacs could be compiled
> with REGEX_MALLOC by default in environments that have an
> insufficiently large default stack size.
Indeed it is a stack overflow -- the point is that up through 21.4.3,
this stack overflow was caught and an abort raised, which is also what
still happens with 21.4.8 under e.g. Solaris. I don't mind the abort,
and I don't need or want to change my stacksize; what I mind is the crash.
When you see the error "Stack overflow in regexp matcher", it
refers to a number of errors that might occur in the regex code,
one of which is the overflow of the matcher's failure stack. It
doesn't refer to the operating system controlled stack, which is
what I think is being exhausted by alloca() calls.
Your Solaris build most likely was not using alloca() for the
failure stack. If your XEmacs was compiled with the relocating
allocator enabled, then that allocator rather than alloca() is
used for the failure stack. The relocating allocator allocates
off the heap rather than the stack so you won't see a stack
overflow from its use. The relocating allocator is enabled by
default for most operating systems. But under Cygwin the
relocating allocator is disabled by default, so when you moved
from Solaris to Cygwin, you likely started using alloca() and so
your previously benign internal stack overflow became a hard OS
level stack overflow which crashed the editor.
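To illustrate why a heap-backed failure stack fails softly, here is a
minimal sketch of a growable stack with a hard cap. It is not the
XEmacs regex code: the type fail_stack, the functions, and both size
constants are hypothetical names chosen for the example. When the cap
is reached the push returns an error that the caller can report,
instead of the process touching memory past the end of the OS stack.

```c
#include <stdlib.h>

/* Hypothetical sizes, chosen only for illustration. */
#define FAIL_STACK_INIT 8
#define FAIL_STACK_MAX  1024

typedef struct {
    int    *items;   /* heap storage, never alloca() */
    size_t  len;     /* entries in use */
    size_t  cap;     /* entries allocated */
} fail_stack;

/* Returns 0 on success, -1 if the initial allocation fails. */
int fail_stack_init(fail_stack *s)
{
    s->items = malloc(FAIL_STACK_INIT * sizeof *s->items);
    s->len = 0;
    s->cap = s->items ? FAIL_STACK_INIT : 0;
    return s->items ? 0 : -1;
}

/* Returns 0 on success, -1 when the cap is hit or memory runs out.
   The -1 case is the "benign" overflow: the caller can signal an
   error and carry on. */
int fail_stack_push(fail_stack *s, int item)
{
    if (s->len == s->cap) {
        size_t ncap = s->cap * 2;
        int *p;

        if (s->cap >= FAIL_STACK_MAX)
            return -1;
        if (ncap > FAIL_STACK_MAX)
            ncap = FAIL_STACK_MAX;
        p = realloc(s->items, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        s->items = p;
        s->cap = ncap;
    }
    s->items[s->len++] = item;
    return 0;
}

/* Pushes up to n entries; returns how many pushes succeeded. */
size_t fail_stack_fill(fail_stack *s, size_t n)
{
    size_t pushed = 0;
    while (pushed < n && fail_stack_push(s, (int)pushed) == 0)
        pushed++;
    return pushed;
}

void fail_stack_free(fail_stack *s)
{
    free(s->items);
    s->items = NULL;
    s->len = s->cap = 0;
}
```

With alloca() there is no equivalent of the -1 return: the allocation
either fits or the process faults.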
There's no way that I know of to determine whether an alloca()
call is going to succeed or generate a SIGSEGV. So in order to
handle OS level stack overflows XEmacs will have to wrap alloca()
calls somehow, probably with a signal handler that longjmp()'s
back to some known safe place and then signals an error.
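A rough sketch of that idea on a POSIX system follows. Everything in
it is an assumption for illustration (install_overflow_guard,
run_guarded, burn_stack and the 64 KiB alternate stack are all made-up
names and sizes), and it is not taken from XEmacs. The handler has to
run on an alternate stack set up with sigaltstack(), because the
normal stack is exactly what has run out. burn_stack() stands in for
a large alloca() by recursing with a 4 KiB frame.

```c
#define _XOPEN_SOURCE 700
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none of this is XEmacs code. */
static char alt_stack[64 * 1024];     /* the handler runs on this stack */
static sigjmp_buf recover_point;      /* the "known safe place" */
static volatile sig_atomic_t guard_armed;

static void on_fault(int sig)
{
    if (guard_armed) {
        guard_armed = 0;
        siglongjmp(recover_point, 1);
    }
    /* Fault outside a guarded call: fall back to the default action. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int install_overflow_guard(void)
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sa.sa_flags = SA_ONSTACK;         /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
#ifdef SIGBUS
    /* Some systems report stack exhaustion as SIGBUS instead. */
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
#endif
    return 0;
}

/* Runs fn(arg); returns 0 if it finished, -1 if it faulted. */
int run_guarded(void (*fn)(size_t), size_t arg)
{
    /* Second argument 1: save the signal mask so the signal is
       unblocked again after jumping out of the handler. */
    if (sigsetjmp(recover_point, 1) != 0)
        return -1;
    guard_armed = 1;
    fn(arg);
    guard_armed = 0;
    return 0;
}

/* Demo workload: uses roughly levels * 4 KiB of OS stack. */
void burn_stack(size_t levels)
{
    volatile char pad[4096];
    pad[0] = 1;
    if (levels > 0)
        burn_stack(levels - 1);
    pad[0] = 2;   /* work after the call, so it is not a tail call */
}
```

A caller would do something like run_guarded(burn_stack, n) and signal
a Lisp error when it returns -1. Two caveats: the handler as written
cannot tell a stack overflow from any other SIGSEGV inside the guarded
call, and jumping out skips whatever cleanup the interrupted code
would have done, so real use would need more care than this.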
Is it really reasonable to ship something as 'stable' which is
vulnerable to stack overflow in this way?
Arguably it is, if the problem is rare. This is not a buffer
overrun sort of problem with the attendant security
vulnerability. What happens is XEmacs asks for more stack space
than the process is allowed to use, and the kernel's response is
to send a SIGSEGV. I think XEmacs should handle it if it can,
since crashes are obviously to be avoided.
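For what it's worth, the "double the stack size for the user" idea
mentioned earlier can be sketched with getrlimit()/setrlimit() on
systems that have them. The function name double_stack_limit is
hypothetical, and whether raising the soft limit at run time actually
enlarges the usable stack is platform dependent, so treat this as an
assumption to test rather than a fix. I have not tried it under
Cygwin.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft stack limit to twice its
   current value, capped at the hard limit.
   Returns 0 on success (or if the limit is already unlimited),
   -1 if the limit could not be read or changed. */
int double_stack_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;

    /* Compare against half the hard limit so the doubling cannot
       overflow or exceed what the process is allowed to request. */
    if (rl.rlim_cur > rl.rlim_max / 2)
        rl.rlim_cur = rl.rlim_max;
    else
        rl.rlim_cur *= 2;

    return setrlimit(RLIMIT_STACK, &rl) == 0 ? 0 : -1;
}
```

This only moves the point at which the crash happens; a large enough
regexp failure stack would still run past the new limit.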