On Sat, 10 Jul 1999, Darryl Okahata wrote:
> Justin Vallon <vallon(a)mindspring.com> wrote:
> > HPUX 10.20. Are you sure that the regexp engine is running out of process
> > stack, and not running out of some sort of space in a 'RegExpState
> > stack[256]' variable?
> With HP-UX, I always increase the regexp stack size (from 2000 to
> 20000). VM gets regexp stack overflows otherwise. Sample patch
> attached (relative to 21.1.4).
> [ Note: I've also built my kernel to bump up the HP-UX max process
> stack size from 2MB to 16MB. I'm not sure this is necessary if you
> use this patch, but the comment above the variable implies that this
> might be. ]
Short version:
Does this happen on other platforms? VM or compile would be likely
candidates for large regexps or large patterns.
I think the stack size (regexp's internal fail_stack) can be increased on
all platforms from 2000 to 20000 at the cost of 18000*4 = 72000 bytes, or
about 70k.
Long version:
Well, my stack ulimit shows 80MB, so that's good.
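For anyone who wants to check the process stack limit from inside a
program rather than from the shell, here is a minimal sketch. It assumes
a POSIX system that provides getrlimit() and RLIMIT_STACK; the function
name stack_soft_limit is made up for this example and is not part of
XEmacs or regex.c.

```c
#include <sys/resource.h>

/* Hypothetical helper (not from XEmacs): fetch the soft stack limit of
   the current process via POSIX getrlimit().
   Returns 0 on success, -1 on failure.  On success, *unlimited is 1 if
   the limit is RLIM_INFINITY (then *bytes is not meaningful), else 0,
   and *bytes holds the limit in bytes.  */
int stack_soft_limit(unsigned long long *bytes, int *unlimited)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    *unlimited = (rl.rlim_cur == RLIM_INFINITY);
    *bytes = (unsigned long long) rl.rlim_cur;
    return 0;
}
```

A caller would print *bytes (or "unlimited") and compare it with what
ulimit reports. A generous limit here suggests the overflow is in the
regexp code's own fail_stack and not in the process stack.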
I did some light reading of regex.c. Here is the full context of the
patch:
[21.1.2/src/regex.c:1112]
/* Roughly the maximum number of failure points on the stack. Would be
   exactly that if always used MAX_FAILURE_SPACE each time we failed.
   This is a variable only so users of regex can assign to it; we never
   change it ourselves. */
#if defined (MATCH_MAY_ALLOCATE)
/* 4400 was enough to cause a crash on Alpha OSF/1,
   whose default stack limit is 2mb. */
int re_max_failures = 20000;
#else
int re_max_failures = 2000;
#endif
Since the default Darryl is changing is 2000, that would lead me to
believe that MATCH_MAY_ALLOCATE is not defined (though I don't understand
why, given line 1095). This agrees with the comment at line 1072, which
says that (X)Emacs #undefs MATCH_MAY_ALLOCATE: input can arrive in a
signal handler, which could then call regexp's match, and the matcher
should not malloc in that situation.
Also:
[:1795]
#ifndef MATCH_MAY_ALLOCATE
/* If we cannot allocate large objects within re_match_2_internal,
   we make the fail stack and register vectors global.
   The fail stack, we grow to the maximum size when a regexp
   is compiled.
   The register vectors, we adjust in size each time we
   compile a regexp, according to the number of registers it needs. */
Things may be different for different platforms. It appears that regexp
can use alloca as well, if certain #defines are provided. It's unclear
here whether this is ever the case for XEmacs.
Also, I believe that it is the internal regexp stack (fail_stack) that is
overflowing. The fail_stack seems to record backtracking points in case
further matching fails.
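To illustrate what I mean by a stack of backtracking points with a fixed
maximum size, here is a toy sketch. It is not the code from regex.c: the
names toy_fail_stack, toy_elt_t, toy_init, toy_push, toy_pop and toy_free
are invented for this example, and the real fail_stack stores more per
failure point than a single pointer. It only shows the general idea of
allocating the whole stack up front and reporting overflow once the
limit is reached.

```c
#include <stdlib.h>

/* Hypothetical stand-in for regex.c's fail_stack_elt_t. */
typedef union {
    unsigned char *pointer;
    int integer;
} toy_elt_t;

typedef struct {
    toy_elt_t *items;  /* storage allocated once, up front */
    size_t avail;      /* number of slots currently in use */
    size_t size;       /* total number of slots allocated */
} toy_fail_stack;

/* Allocate the stack at its maximum size, so that pushing during a
   match never needs to allocate.  Returns 0 on success, -1 on failure. */
int toy_init(toy_fail_stack *fs, size_t max_elts)
{
    fs->items = malloc(max_elts * sizeof(toy_elt_t));
    if (fs->items == NULL)
        return -1;
    fs->avail = 0;
    fs->size = max_elts;
    return 0;
}

/* Record a position to resume from if later matching fails.
   Returns -1 when the stack is full, i.e. a "stack overflow". */
int toy_push(toy_fail_stack *fs, unsigned char *resume_at)
{
    if (fs->avail >= fs->size)
        return -1;
    fs->items[fs->avail++].pointer = resume_at;
    return 0;
}

/* Backtrack: return the most recent resume position, or NULL if no
   backtracking points are left. */
unsigned char *toy_pop(toy_fail_stack *fs)
{
    if (fs->avail == 0)
        return NULL;
    return fs->items[--fs->avail].pointer;
}

/* Release the storage allocated by toy_init. */
void toy_free(toy_fail_stack *fs)
{
    free(fs->items);
    fs->items = NULL;
    fs->avail = fs->size = 0;
}
```

With a limit of 2 slots, the third toy_push fails even though the
process stack is nowhere near its limit, which is the kind of overflow I
suspect is happening here.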
So, the patch is basically: If !defined(MATCH_MAY_ALLOCATE) && __hpux,
increase the amount of space regex_compile allocates up-front for the
worst case (max allowed) stack size (via malloc not alloca). Discounting
config.h features, every platform that hits this line would probably
benefit from increasing the stack size.
Each stack element takes sizeof(fail_stack_elt_t) == sizeof(union
{uchar*, int}) == 4 bytes (with 4-byte pointers).
Increasing the stack from 2000 to 20000 would take an additional
18000*4 = 72000 bytes, or about 70k.
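The estimate can be checked with a few lines of C. This is only a sketch
under the same assumption as above, namely one stack element per unit of
re_max_failures. If a failure point occupies several elements, the total
scales accordingly. The names cost_elt_t and extra_fail_stack_bytes are
made up for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for regex.c's fail_stack_elt_t. */
typedef union {
    unsigned char *pointer;
    int integer;
} cost_elt_t;

/* Extra bytes needed to grow a preallocated stack from old_elts to
   new_elts elements.  Returns 0 if the stack does not grow. */
size_t extra_fail_stack_bytes(size_t old_elts, size_t new_elts)
{
    if (new_elts <= old_elts)
        return 0;
    return (new_elts - old_elts) * sizeof(cost_elt_t);
}
```

Calling extra_fail_stack_bytes(2000, 20000) gives 18000 times the element
size. The element size is 4 bytes where pointers are 4 bytes, and larger
on platforms with 8-byte pointers such as Alpha.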
Interesting part of the original patch from Darryl Okahata
<darrylo(a)sr.hp.com>:
  #else
+ # if defined(__hpux)
+ int re_max_failures = 20000;
+ # else
  int re_max_failures = 2000;
+ # endif
  #endif
-Justin
vallon(a)mindspring.com