Re: Speed difference in Gnus

Tuesday, 18 October 2005

        ...
>>>> "Ben" == Ben Wing <ben(a)xemacs.org&gt;
writes: 
    Ben> Stephen J. Turnbull wrote:

...
> It's really ugly to have a buffer property that's
actually a
> text property just because performance sucks when an extent
> covers the whole buffer. 
I assume the above is what you're talking about?

    Ben> what do you mean?  as i said before, i have mostly finished
    Ben> code that will make text properties efficient by implementing
    Ben> them entirely in non-overlapping intervals, as fsf does.

What I mean is that there are a number of things that we do as buffer
attributes or buffer-local variables (language environment, coded
character set, major mode for starters) that conceptually are text
properties.  Think about it: wouldn't it be cool if instead of needing
a hack like mmm, we simply attached LISP-major-mode to an extent
covering the buffer, string-major-mode to strings, and
comment-major-mode to comments?  Wouldn't it be cool if we could
attach xml-mode to individual strings in xml.el or use html-mode for
strings by default in blog-mode.el?

    Ben> in general this is a very tough problem, and using simpler
    Ben> solutions when possible (e.g. non-overlapping text
    Ben> properties).

Whatever it is it has to be part of the implementation, not a
constraint on user code.  (I'm pretty sure you agree, but it's not
clear to me from that sentence.)

    Ben> it might be possible to simulate extents in general using
    Ben> non-overlapping intervals, where multiple intervals could
    Ben> represent a single extent and each interval contains parts of
    Ben> all extents crossing over them.  this interval would be
    Ben> guaranteed efficient and not too hard to implement, i think
    Ben> -- in fact it's the same algorithm i'd be using for
    Ben> text-properties, and could be extended to extents in general.

Something like this?  A linked list P of (buffer position, extent
list) pairs, with an intervals defined as [P[n].pos,P[n+1].pos) for
each n.  An extent is a pair of pointers into P defining which
fragments are covered by the extent plus a property list.  P[n].ext is
the list of extents covering that interval.  Then for random access,
finding n is O(lg n), and finding the property is O(|P[n].ext|), which
is the best you can get, I guess.  (For many operations, in particular
redisplay, which walk P in order, finding n is O(1), right?)  P itself
only needs to be updated when text is inserted/deleted in the buffer,
and when an extent is created/deleted.  |P| <= |# boundary points of
all extents|, so that is no worse than what we currently have, and we
never have to update the extents themselves.

    Ben> but first we need some more profiling data.

Why?  Basically in your explanation of how the soe works, you are
saying that large extents (not just extents covering the whole buffer,
but any extent that overlaps many other extents) are very expensive
because they make _all_ the overlapped extents inefficient.  So that
means that code like

(let ((header (make-extent (message-beginning) (header-end)))
      (body (make-extent (header-end) (message-end))))
  (font-lock-fontify-buffer))

necessarily sucks.  If I'm not totally missing the point, I have to
ask: Do you consider that restriction acceptable?

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

Re: Speed difference in Gnus