Martin Buchholz <martin(a)xemacs.org> writes:
> >>>>> "Hrvoje" == Hrvoje Niksic <hniksic(a)srce.hr> writes:
>
> Hrvoje> Martin Buchholz <martin(a)xemacs.org> writes:
> >> For best optimization, you would store the arglist directly inside the
> >> opaque object with the instructions for better locality of reference.
> >> Same with the constant vector.
>
> Hrvoje> Maybe. Maybe that's not really needed. For starters, I
> Hrvoje> would like to see if the parameter looping (via LIST_LOOP_3)
> Hrvoje> takes a measurable amount of time for the average funcall
> Hrvoje> case. If yes, *any* optimization will do good. If not, we
> Hrvoje> needn't bother with locality.
>
> LIST_LOOP_3 itself is pretty cheap.
Hmm, yes. It bothers my esthetic soul to know that every time a
function is called its arglist is parsed in vain search of
`&optional', `&rest', and stuff. It somehow doesn't "feel right".
Obviously, the yearnings of my soul needn't have anything to do with the
actual speed of the code. :-)
> Hrvoje> How much of the rest of what this URL describes have you
> Hrvoje> implemented? The specbind() stuff in the source looks
> Hrvoje> pretty much like the stuff Ben says we should have. The
> Hrvoje> same goes for the massaged byte-code stuff, as well as for
> Hrvoje> inlined unbind_to().
>
> The QUIT macro is no longer called.
> The arglist is only checked for proper-list-ness once.
> The other optimizations are not implemented.
Looking at the code, I cannot believe that the specbind() optimization
is not implemented. Also, I cannot believe that the unbind_to()
optimization is not implemented.
> Perhaps, Hrvoje, you'd like to?
Why not?
> Keep in mind that the average function has one or two arguments.
> There is a danger that introducing too much machinery for argument
> parsing will actually be counter-productive. For example, we could
> store an arglist-parser function pointer in the compiled-function
> object, but the function call overhead is likely to be as high as
> parsing the arglist in the first place.
Not if the arglist is parsed only once, perhaps in
optimize_compiled_function(). All the other times, we have one
Lisp_Object pointer and two or three integers, and just use them. We
can even inline the call to Flist(), or whatever.
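
To make the "parse once, then just use a few integers" idea concrete,
here is a rough standalone sketch. It is not XEmacs code: the struct
arg_summary and the functions summarize_arglist() and nargs_ok() are
hypothetical names, and plain C strings stand in for the Lisp symbols
of a real arglist. The real thing would presumably hang the summary
off the compiled-function object and fill it in once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cached summary of an arglist; not an XEmacs structure. */
struct arg_summary {
    int required;   /* number of mandatory arguments */
    int optional;   /* number of &optional arguments */
    int has_rest;   /* nonzero if a &rest parameter is present */
};

/* Walk the parameter names once and record the counts.
   Returns 0 on success, -1 on a malformed list. */
int
summarize_arglist(const char *const *params, size_t n,
                  struct arg_summary *out)
{
    enum { REQUIRED, OPTIONAL, REST, DONE } state = REQUIRED;
    size_t i;

    assert(out != NULL);
    out->required = out->optional = out->has_rest = 0;

    for (i = 0; i < n; i++) {
        if (strcmp(params[i], "&optional") == 0) {
            if (state != REQUIRED) return -1;
            state = OPTIONAL;
        } else if (strcmp(params[i], "&rest") == 0) {
            if (state == REST || state == DONE) return -1;
            state = REST;
        } else if (state == REQUIRED) {
            out->required++;
        } else if (state == OPTIONAL) {
            out->optional++;
        } else if (state == REST) {
            out->has_rest = 1;      /* the variable naming the rest list */
            state = DONE;
        } else {
            return -1;              /* extra parameter after the &rest variable */
        }
    }
    return state == REST ? -1 : 0;  /* &rest must be followed by a name */
}

/* Per-call check: integer comparisons only, no keyword search. */
int
nargs_ok(const struct arg_summary *s, int nargs)
{
    if (nargs < s->required) return 0;
    return s->has_rest || nargs <= s->required + s->optional;
}
```

With something like this, the search for `&optional' and `&rest'
happens once per function rather than once per call, and the per-call
path only compares the argument count against the cached integers.
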
> Another win from reorganizing the arglist is simply saving on cons
> cells. (...)
Another good point, yes.
> Hrvoje> BTW, how do you obtain gprof output for a single benchmark?
>
> You build with -pg, run temacs, not xemacs, and run the benchmark in a
> loop a bazillion times.
Ugly ugly ugly... WIBN if gprof supported something like
gprof_start_recording() and gprof_stop_recording(), analogous to what
quantify has?
On Linux, where the source to gprof and libc are available, this might
even be implementable! Yay!