Ben Wing <ben(a)xemacs.org> writes:
Daniel Pittman wrote:
>Ben Wing <ben(a)xemacs.org> writes:
>>Daniel Pittman wrote:
>>>This happened while running Gnus, and looks like it is a logic error in
>>>the handling of the timeout list - the code in Fdisable_timeout tries to
>>>synchronously free the Lisp object that was removed, presumably because
>>>it is an internal object. (This happens in event_stream_disable_wakeup.)
>>>
>>>This isn't terribly compatible with the finalise method of the image
>>>instance, though, which is most likely called during GC.
[...]
>I will test again with the new allocator shortly, to see if this
>resolves another issue I have run into there where the page finalize
>routine would fail because of pre-freed items, or would walk off into
>random numbers...
[...]
>you should fix this so that it passes correct types. here you're
>mixing ints and void *'s; i'm surprised it compiles at all. if you
>could, please do something like this:
It was my cheap, two minute test patch on x86, where that sort of nasty
trick does work.
>-- change disable_glyph_animated_timeout to take a void *, like a
>post_gc_action handler.
>-- cast the timeout value to (void *) when calling
>register_post_gc_action().
>-- also fix up glyphs-msw.c and glyphs-gtk.c in the same fashion (don't
>worry if you can't test)
>-- put in a comment somewhere, e.g., in disable_glyph_animated_timeout(),
>explaining that it can't be called in a finalize method, for the reasons
>you already gave.
>-- when you resubmit the patch, flag it as [RECOMMEND 21.4] since 21.4
>has the same problem.
I will see what I can do about that. I had hoped to work up a valid
patch, but work has kept me too busy over the last few days. I will see
how I go in the near future -- and it will be something decent. :)
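For readers following the thread, the shape of the requested change can be sketched in a small stand-alone model. This is only an illustration under stated assumptions, not the XEmacs patch: the queue, gc_in_progress, finalize_image_instance() and collect_garbage() are hypothetical stand-ins, and register_post_gc_action() here is a toy with the handler shape described above (a function taking a void *). The integer timeout id is passed through intptr_t so that ints and pointers are not mixed directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_POST_GC_ACTIONS 16

typedef void (*post_gc_fn)(void *);

/* Toy queue of actions deferred until GC has finished (hypothetical). */
static struct { post_gc_fn fn; void *arg; } post_gc_queue[MAX_POST_GC_ACTIONS];
static size_t post_gc_count = 0;

static int gc_in_progress = 0;     /* set while the toy collector runs */
static int timeouts_disabled = 0;  /* how many timeouts were disabled */
static int last_disabled_id = -1;  /* id passed to the last handler call */

/* Stand-in for register_post_gc_action(): remember fn/arg for later. */
static void register_post_gc_action(post_gc_fn fn, void *arg)
{
    assert(post_gc_count < MAX_POST_GC_ACTIONS);
    post_gc_queue[post_gc_count].fn = fn;
    post_gc_queue[post_gc_count].arg = arg;
    post_gc_count++;
}

/* Takes a void * like any post-GC handler. It frees resources
   synchronously, so it must NOT be called from a finalize method
   (i.e. while GC is running); the assert models that rule. */
static void disable_glyph_animated_timeout(void *arg)
{
    int id = (int)(intptr_t)arg;   /* recover the integer timeout id */
    assert(!gc_in_progress);
    last_disabled_id = id;
    timeouts_disabled++;
}

/* Finalizer: runs during GC, so it only defers the real work. */
static void finalize_image_instance(int timeout_id)
{
    register_post_gc_action(disable_glyph_animated_timeout,
                            (void *)(intptr_t)timeout_id);
}

/* Toy collector: finalize dead objects, then run deferred actions. */
static void collect_garbage(const int *dead_timeout_ids, size_t n)
{
    size_t i;

    gc_in_progress = 1;
    for (i = 0; i < n; i++)
        finalize_image_instance(dead_timeout_ids[i]);
    gc_in_progress = 0;

    for (i = 0; i < post_gc_count; i++)
        post_gc_queue[i].fn(post_gc_queue[i].arg);
    post_gc_count = 0;
}
```

Calling collect_garbage() with an array of two timeout ids runs both finalizers while gc_in_progress is set, and only afterwards invokes disable_glyph_animated_timeout() once per id.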
>the page finalize routine would fail because of pre-freed items, or
>would walk off into random numbers...
OK: In testing on the old allocator/GC I found this bug, which was
caught by an assertion in the allocator that we were not in GC while
freeing memory.
With the new GC/allocator I was getting random crashes. These were
typically in the finalize phase, and were caused by the finalize method
pointing to somewhere random in memory -- outside the code space
entirely.
These happened while using Gnus, with roughly the same pattern as the
crash I found.
On one occasion the core had hit an "already freed" assertion, and on
one occasion the crash happened in the same way, but in the sweep phase
of GC.
In every case, the fault was caused by the GC routines jumping into a
random location.
I wondered if that was caused by the same bug, but causing a deeper
corruption.
As it turns out, using my cheap fix didn't correct the problem under the
new GC, and when I tracked it down a couple of times it was:
* using incremental GC
* in the finalize phase
* freeing what looked like a buffer local symbol
* jumping to random memory because the finalize method was junk
The other fields of the buffer local symbol were invalid, including the
name, so I suspect it was already freed; perhaps someone who
understands it can instrument it to do more rigorous testing on
free/non-free?
The random location, for what it is worth, didn't look like the memory
poisoning that I believe the new allocator/GC do. :l
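One way the free/non-free instrumentation asked for above might look, as a rough sketch only: each object carries a tag that is checked before its finalize pointer is trusted, and freeing poisons the object. Everything here (struct object, the tag values, POISON_BYTE, the function names) is hypothetical and does not describe the layout or poison pattern of the real allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAG_LIVE    0x4C495645u  /* arbitrary marker for a live object */
#define TAG_FREED   0x46524545u  /* arbitrary marker for a freed object */
#define POISON_BYTE 0xDB         /* arbitrary fill pattern for freed memory */

struct object {
    unsigned int tag;                   /* TAG_LIVE or TAG_FREED */
    void (*finalize)(struct object *);  /* method called when collected */
    char payload[32];
};

static int finalizers_run = 0;
static int stale_objects_seen = 0;

/* Trivial finalizer that just counts its invocations. */
static void count_finalize(struct object *obj)
{
    (void)obj;
    finalizers_run++;
}

static void object_init(struct object *obj)
{
    obj->tag = TAG_LIVE;
    obj->finalize = count_finalize;
    memset(obj->payload, 0, sizeof obj->payload);
}

/* Poison the whole object, then mark it freed. The assert catches a
   double free; the poison makes stale fields easy to spot in a core. */
static void object_free(struct object *obj)
{
    assert(obj->tag == TAG_LIVE);
    memset(obj, POISON_BYTE, sizeof *obj);
    obj->tag = TAG_FREED;
}

/* Call the finalizer only if the object is still live. Returns 1 if
   the finalizer ran, 0 if the object was stale and was skipped. */
static int finalize_checked(struct object *obj)
{
    if (obj->tag != TAG_LIVE) {
        stale_objects_seen++;
        return 0;
    }
    obj->finalize(obj);
    return 1;
}
```

With a check like this, a finalize pass that reaches an already-freed object records it instead of jumping through a junk function pointer; a debugging build could abort at that point rather than just counting.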
I submitted another bug report on this, but I have not seen it yet;
something seems to mislike my bug reports on the xemacs.org server.
The build details are the same as my report on the timeout issue, save
that the incremental GC and new allocator are enabled, so you can assume
the details are the same. No other system changes in that time.
Daniel