packages with binaries

Julian Bradfield

Saturday, 21 January 2012

9:28 p.m.

I'm trying to update the VM package, and as this is my first ever interaction with the package system, it's a bit daunting.

First question: what am I supposed to do with auxiliary programs (qp-decode, base64-decode etc) that VM likes to have? Since XEmacs has built-in base64 functions, I guess the base64 programs are superfluous anyway; and perhaps qp-*code are superfluous because any qp-encoded message part is likely to be small anyway. But if I did want them, what would I do with them?

Secondly, is there a way to check out the package infrastructure without getting the whole of the package sources? (This is also my first interaction with Mercurial, since these days every project uses a different VCS...)

_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://lists.xemacs.org/mailman/listinfo/xemacs-beta

Stephen J. Turnbull

Sunday, 22 January

5:51 a.m.

Julian Bradfield writes:

...

First question: what am I supposed to do with auxiliary programs (qp-decode, base64-decode etc) that VM likes to have?

Propose to XEmacs Beta that they be added to core. This loses, though, because 21.4 won't get them. The problem with distributing in the package is that binary packages are currently platform-independent; it's not possible to distribute binaries, and on the platform where it matters most (Windows), the user will very likely not have the right compiler (or all too often any compiler at all).

...

Since XEmacs has built-in base64 functions, I guess the base64 programs are superfluous anyway; and perhaps qp-*code are superfluous because any qp-encoded message part is likely to be small anyway.

Yes, and yes.
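
To illustrate why the helper programs are dispensable: both transfer encodings are simple enough to decode in-process. The following is a minimal sketch in Python rather than VM's Emacs Lisp, and `decode_part` is a hypothetical name chosen for this example; it only shows the idea of decoding a MIME part without spawning qp-decode or base64-decode.

```python
import base64
import quopri


def decode_part(payload: bytes, encoding: str) -> bytes:
    """Decode a MIME body part in-process, without external helper programs."""
    encoding = encoding.lower()
    if encoding == "base64":
        return base64.b64decode(payload)
    if encoding == "quoted-printable":
        return quopri.decodestring(payload)
    # 7bit, 8bit and binary parts need no decoding.
    return payload
```

An external decoder mainly pays off for very large parts; for the small quoted-printable parts mentioned above, an in-process call like this avoids the cost of starting a subprocess.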

...

Secondly, is there a way to check out the package infrastructure without getting the whole of the package sources?

Probably not. But why try to avoid it? You need 50% or so of the packages to build VM anyway (I think "dired" is new since the version in our packages):

REQUIRES = xemacs-base mail-lib gnus pgg ecrypto eterm sh-script net-utils ps-print os-utils bbdb fsf-compat dired

(Gnus is the big pig here.)

Note that using the package builder target in upstream VM is not acceptable; the Makefile has to use our infrastructure. Otherwise there is no guarantee that it will even build in Norbert's tree.

BTW, are you really sure you want to do this? Kyle's code was idiosyncratic (to say the least), but recent VM is a bletcherous mess. :-(

Julian Bradfield

Sunday, 22 January

3:18 p.m.

Stephen writes:

...

BTW, are you really sure you want to do this? Kyle's code was idiosyncratic (to say the least), but recent VM is a bletcherous mess. :-(

I think it's becoming less so as Uday fits bits of refactoring into the frenetic feature addition. More importantly, it's less buggy. The version currently in our repository still has mail-corrupting coding system bugs.

(If you think vm is bletcherous, you should see my add-ons ;-)

Stephen J. Turnbull

Sunday, 22 January

4:01 p.m.

Julian Bradfield writes:

...

I think it's becoming less [bletcherous] as Uday fits bits of refactoring into the frenetic feature addition.

Well, I'm glad you think so, since it looks like you're gonna be on point for this foray into the jungle. I'll do what I can to help, but I can't promise all that much.

Uday S Reddy

Wednesday, 1 February

1:56 a.m.

Stephen J. Turnbull writes:

...

BTW, are you really sure you want to do this? Kyle's code was idiosyncratic (to say the least), but recent VM is a bletcherous mess. :-(

That is a bit harsh!

As a theorist of software, I always make a crucial distinction between internal structure and external functionality. VM may not have the world's most beautiful code, but it definitely wins in its functionality. Some other tools that I won't name have pretty-looking code, but their usability is, well..., bletcherous.

Moreover, I think VM has a certain logical coherence which makes it quite solid. Despite the fact it is going through a second maintainer now, almost nothing we do breaks it. That is quite phenomenal if you ask me. In the last couple of years, we added IMAP support (which was already there, but sort of half-done) and external messages (which was a fundamental architectural extension), without getting into a single major issue. Nobody lost an email message. Nobody had a mail folder corrupted. VM just continues to run without batting an eyelid. This is a far cry from what we read on our competitors' newsgroups.

The beauty of code isn't skin-deep. It is in the logical soundness of the architecture which can stand the stresses of the times. VM definitely has it, and certainly wins my admiration for it.

Cheers,
Uday

Stephen J. Turnbull

Wednesday, 1 February

9:06 a.m.

Uday S Reddy writes:

...

Stephen J. Turnbull writes:
> BTW, are you really sure you want to do this? Kyle's code was idiosyncratic (to say the least), but recent VM is a bletcherous mess. :-(

...

That is a bit harsh!

Not really.

...

As a theorist of software, I always make a crucial distinction between internal structure and external functionality.

I'm not talking about functionality (but since you mention it: for my taste, recent VM has added too much functionality, at least as default). I'm talking about the internal structure. No matter how much functionality you load into a program, you can't make it easier to maintain and modify. That requires refactoring.

Julian Bradfield

Wednesday, 1 February

9:13 a.m.

Stephen writes:

...

default). I'm talking about the internal structure. No matter how much functionality you load into a program, you can't make it easier to maintain and modify. That requires refactoring.

It seems to me there is quite a bit of refactoring going on. I have mixed feelings about this: firstly, it breaks my add-ons that rely on internal structure (well, there was no documented API, was there?); secondly, Uday likes Common Lisp, which is an alien idiom to me, so it makes everything hard to read.

Stephen J. Turnbull

Wednesday, 1 February

1 p.m.

Julian Bradfield writes:

...

Stephen writes:
> default). I'm talking about the internal structure. No matter how much functionality you load into a program, you can't make it easier to maintain and modify. That requires refactoring.

It seems to me there is quite a bit of refactoring going on.

That could be; I have specialized interests, maybe important stuff is happening elsewhere. The parts that give me trouble in use (threading, window management, and summary refresh) don't seem to be getting refactored. Stuff gets rewritten, but it doesn't get more understandable or (what really matters) less coupled. Features pop in and out (the various approaches to sorting by date, for example). In particular, vm-thread.el internal functions are called in several other files, and the summary seems to get fiddled with everywhere, which strikes me as wrong.

...

I have mixed feelings about this: firstly, it breaks my add-ons that rely on internal structure (well, there was no documented api, was there?),

There wasn't a documented API, but vm-vars.el and vm-message.el served something of a documentation role.

...

secondly Uday likes Common Lisp, which is an alien idiom to me, so makes everything hard to read.

*That* is all low-level implementation, which doesn't bother me at all. :-)

Uday S Reddy

Wednesday, 1 February

7:59 p.m.

Stephen J. Turnbull writes:

...

I'm not talking about functionality (but since you mention it: for my taste, recent VM has added too much functionality, at least as default).

That is good to know. We get demands from users for new functionality, but never complaints about excess functionality. So, if you can give me some samples of what you would consider excessive in the recent VM, that would be helpful.

...

I'm talking about the internal structure. No matter how much functionality you load into a program, you can't make it easier to maintain and modify. That requires refactoring.

I think there is a more subtle point I am making. There is a distinction I can make along various dimensions, such as:

internal        vs  external
textual         vs  behavioral
syntactic       vs  semantic
static          vs  dynamic
representation  vs  computation

Traditional software engineering seems too focused on the first column in these dichotomies (what I called "skin-deep" in my earlier message), whereas I think what matters more is the second column.

You can see the syntactic structure as soon as you look at a piece of code. To understand its semantic structure, you need to understand its behavior and see what happens when you tweak it. Pretty looking pieces of code can fall apart at the simplest of tweaks. Having good syntactic structure doesn't mean that the code is logically coherent. VM, on the other hand, has logical coherence that continues to stand up when we tweak it. In the long run, that is what makes the code easy to maintain and modify.

(Of course, nobody would complain if it had good syntactic structure too. But that is not the key criterion, as I see things now.)

Cheers,
Uday

Stephen J. Turnbull

Thursday, 2 February

10:45 a.m.

Uday S Reddy writes:

...

Stephen J. Turnbull writes:
> I'm not talking about functionality (but since you mention it: for my taste, recent VM has added too much functionality, at least as default).

That is good to know. We get demands from users for new functionality, but never complaints about excess functionality. So, if you can give me some samples of what you would consider excessive in the recent VM, that would be helpful.

Attachment thumbnails in the MIME buttons and the message history buffer. The former I guess you can't avoid these days, but there should be a way to turn them off -- they're slow and buggy (the bugs may be in the external helper programs used, but PDF files are frequently misparsed as PNG for some reason, and it blows everything up). The latter doesn't seem at all useful to me, and having it pop up by default was a huge mistake.

...

You can see the syntactic structure as soon as you look at a piece of code.

Sure, but Lisp, even Common Lisp, is very syntax-poor as a language. What ends up happening is that developers end up creating their own syntax with defmacro or define-syntax depending on your preferred method of suicide-by-spaghetti. So, I'm not interested in syntactic structure of individual functions (which tends to be pretty simple anyway); I'm interested in the call graph. If the call graph is not semantic, what is?

...

To understand its semantic structure, you need to understand its behavior and see what happens when you tweak it.

I don't want to "tweak and see what happens". I want to design, and see the code do what I designed it to do. Clear structure in the surrounding code makes that a lot more likely to happen (although you're right, it's always possible to write code that will fall apart, and I've done it often enough myself even with the help of an excellent existing infrastructure to fit my code into).

...

Having good syntactic structure doesn't mean that the code is logically coherent. VM, on the other hand, has logical coherence that continues to stand up when we tweak it.

I have no idea what you mean by "logical coherence" when you refer to VM's code, except that people who work a lot on VM are able to add new features without breaking old ones. But looking at the code itself, I see things like:

$ grep -l vm-th lisp/*.el
lisp/vm-avirtual.el
lisp/vm-delete.el
lisp/vm-digest.el
lisp/vm-edit.el
lisp/vm-folder.el
lisp/vm-macro.el
lisp/vm-mark.el
lisp/vm-message.el
lisp/vm-motion.el
lisp/vm-mouse.el
lisp/vm-page.el
lisp/vm-ps-print.el
lisp/vm-sort.el
lisp/vm-summary.el
lisp/vm-thread.el
lisp/vm-vars.el
lisp/vm-virtual.el
lisp/vm.el
$

Code related to message threading is scattered all over the code base. That doesn't strike me as logical or coherent. AFAICS, the vm-thread library should get a list of messages, and use that to generate and maintain a tree structure that vm-summary can present to the user. What are the other 16 files doing in that list?

I can't imagine that this is the only example of excessive coupling in the codebase, although I admit threading is what I've been working on most.
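
The grep above is a crude coupling measure, and it can be reproduced in a few lines. The following Python sketch is an illustration only; `files_mentioning` is a hypothetical helper, and like `grep -l` it matches plain substrings, so it gives an upper bound on the files that touch threading rather than a real call graph.

```python
from pathlib import Path
from typing import List


def files_mentioning(directory: str, needle: str, pattern: str = "*.el") -> List[str]:
    """Return the names of files in `directory` that contain `needle`, like `grep -l`."""
    hits = []
    for path in sorted(Path(directory).glob(pattern)):
        # Old mail code may not be valid UTF-8, so replace undecodable bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            hits.append(path.name)
    return hits
```

Under these assumptions, `files_mentioning("lisp", "vm-th")` would list the same files as the shell command. A substring match cannot tell a call into the library's internals from a use of an intended interface, which is the distinction the disagreement here turns on.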

Julian Bradfield

Thursday, 2 February

11:48 a.m.

On 2012-02-02, Stephen J. Turnbull <stephen(a)xemacs.org> wrote:

...

Attachment thumbnails in the MIME buttons and the message history buffer. The former I guess you can't avoid these days, but there should be a way to turn them off -- they're slow and buggy (the bugs

I had no idea what you were talking about, but then I found (setq vm-mime-thumbnail-max-geometry nil) in my .vm. Is that what you're looking for?

...

up). The latter doesn't seem at all useful to me, and having it pop up by default was a huge mistake.

I'd never heard of it, never seen it, and there's nothing in my .vm to disable it. So why do I not see it, and you do?

(Not really sure why this is on xemacs-beta, rather than the vm list!)

Stephen J. Turnbull

Thursday, 2 February

4:26 p.m.

Julian Bradfield writes:

...

I had no idea what you were talking about, but then I found (setq vm-mime-thumbnail-max-geometry nil) in my .vm. Is that what you're looking for?

Probably, although I was looking in the VM custom group. Silly me. :-(

...

I'd never heard of it, never seen it, and there's nothing in my .vm to disable it. So why do I not see it, and you do? (Not really sure why this is on xemacs-beta, rather than the vm list!)

The answer to both is most likely "I use 21.5 and VM trunk, and clearly nobody else does, so is it worth annoying the VM developers with this?" (Not to mention that Uday chose to reply here.)

Julian Bradfield

Thursday, 2 February

4:46 p.m.

On 2012-02-02, Stephen J. Turnbull <turnbull(a)sk.tsukuba.ac.jp> wrote:

...

The answer to both is most likely "I use 21.5 and VM trunk, and clearly nobody else does, so is it worth annoying the VM developers with this?"

But definitely. If you don't annoy the VM developers, the annoyances will get into a release, and then the other XEmacs VM users (if there is indeed more than one other) will have to complain about them.

Using VM trunk under XEmacs is courageous, in the Yes, Minister sense, I think... How often do you update?

Stephen J. Turnbull

Friday, 3 February

3:44 a.m.

Julian Bradfield writes:

...

But definitely. If you don't annoy the VM developers, the annoyances will get into a release,

Of course. The question du jour is, "but when do I start being a PITA?"

...

and then the other XEmacs VM users (if there is indeed more than one other) will have to complain about them.

If there were any, I think I would have heard about them before. The thing is, some of the issues (motion and display -- I wouldn't be surprised if the message history buffer is one of them) I see are due to Mike Sperber screwing up window configurations[1], and other core changes, whose impact is hard to understand (partly because the XEmacs side changes are subtle, and partly ... er, well, I've ranted enough, you can see where this is going :-P). I don't want to complain, have the VM people apply a superficial fix, and then yank the rug out from under them because it's really a core issue after all.

...

Using VM trunk under XEmacs is courageous, in the Yes, Minister sense, I think...

That's a new idiom on me, although I can sort of guess what it means.

...

How often do you update?

About two or three times a month, more when I've got the time.

Footnotes:
[1] I'm sure he'll say VM depends on implementation-dependent internals, but there you go. :-)

Julian Bradfield

Friday, 3 February

7:33 a.m.

On 2012-02-03, Stephen J. Turnbull <stephen(a)xemacs.org> wrote:

...

thing is, some of the issues (motion and display -- I wouldn't be surprised if the message history buffer is one of them) I see are due to Mike Sperber screwing up window configurations[1], and other core changes, whose impact is hard to understand (partly because the XEmacs side changes are subtle, and partly ... er, well, I've ranted enough, you can see where this is going :-P). I don't want to complain, have the VM people apply a superficial fix, and then yank the rug out from under them because it's really a core issue after all.

Ah. Yes. In fact, 21.5's inability to combine with VM was essentially the reason I stopped using 21.5, and therefore also the only reason I spent months unicodifying 21.4! A bit disproportionate, perhaps... Mike did post a patch here for one of the major problems five years ago, but it never got into the vm upstream source. I mentioned it in a bug track recently, but I'll check and see whether it's been included now.

...

> Using VM trunk under XEmacs is courageous, in the Yes, Minister sense, I think...

That's a new idiom on me, although I can sort of guess what it means.

"Yes, Minister" was a 1980s British comedy series. Well worth getting if you enjoy satire of the body politic. "Courageous" is the word used by the civil servant to the minister to mean "will lose you the next election". It's now an established idiom in Britain.

Uday S Reddy

Thursday, 2 February

10:51 p.m.

On 2/2/2012 11:48 AM, Julian Bradfield wrote:

...

I'd never heard of it, never seen it, and there's nothing in my .vm to disable it. So why do I not see it, and you do?

Well, actually, I hadn't heard of it either. After Stephen mentioned it, I did a find-tag and noticed that there is a vm-message-summary.el file contributed by Kevin Rodgers in 2003 and added to the VM distribution in 2005. But it doesn't get loaded unless one `requires' it.

Cheers,
Uday

Uday S Reddy

Thursday, 2 February

9:19 p.m.

Stephen J. Turnbull writes:

...

Yeah, both good examples!

Some of the mess is because, when Kyle Jones became inactive, there was a bit of a free-for-all in the VM community. Rob Fenk had the attitude that the more the better. Recall that Rob had a separate distribution of VM, called "vmrf", which eventually became VM when Kyle handed control over to Rob. So, there are a lot of these "contributed" features that I wouldn't have put into VM if I had anything to do with it at the time.

After I took over, I established the defining principle that "VM" means what is described in the VM manual and all the other stuff is a collection of "add-ons". The current codebase describes each file in the VM distribution as either being "part of VM" or "add-on for VM". The message-history stuff is an add-on, and you have to explicitly plug it in by adding invocations in your .emacs or .vm file, which I suppose you did! The thumbnails are a part of core VM, but I have added a way to disable them. (Set `vm-mime-thumbnail-max-geometry' to nil.)

In any case, I am aware of these kinds of issues and do some cleaning up as and when I get around to them.

...

I think I am using "syntactic" more broadly, as everything to do with the program text. That includes the division of the code into functions and modules etc. So, all modularity is "syntactic" in this sense. However, any information hiding that is obeyed at run-time is "semantic", irrespective of whether it is represented in the program structure.

What I mean by "semantic" is the trace of all the primitive operations that get done during an execution. The call graph is close to this, but it is still dependent on the program structure because the functions that get "called" are syntactic. If the functions were badly chosen, then the call graph doesn't tell us much that is useful.

A piece of software is good semantically if it is doing the right sequence of operations in the right order. It is modular if the information flow in the trace can be easily discerned and is natural. If the system is good in this way, we can refactor the code for better modularity, and everything will continue to work fine. But if the software had bad semantics to start with, I believe that no amount of refactoring will fix it.

If you will indulge me, I can suggest some visual imagery. Think of all (the instances of) primitive operations as grains of sand. The code is the mason that assembles these grains into a nice castle that the user will enjoy using and the maintainers will enjoy tweaking. The beauty of the castle is an entirely separate issue from the beauty of the mason. If the mason is good-looking, I wouldn't complain, but if the mason is unkempt, it wouldn't bother me much either. I believe that the traditional software engineering confuses the mason for the castle. Whether it has any idea of the castle at all is questionable. (The Distributed Systems people seem to understand this though. They often talk about the trace semantics, and totally ignore the syntax. The Programming Methodology people are quite the opposite.)

...

I don't want to "tweak and see what happens". I want to design, and see the code do what I designed it to do.

I am not saying you should tweak it to see what happens. But the acid test of your design is whether it can be tweaked in the future without making it fall apart. "Seeing the code do what you designed it to do" is a short-term goal, but that is not good enough, is it? Long-lived software needs to be maintainable.

...

Clear structure in the surrounding code makes that a lot more likely to happen (although you're right, it's always possible to write code that will fall apart, and I've done it often enough myself even with the help of an excellent existing infrastructure to fit my code into).

Yes, "likely" is the operative word here. It is not guaranteed. Neither is the opposite guaranteed. In other words, even if there is no clear structure in the code, it is possible to have clear structure in the semantics. Of course, your average Joe Programmer may not be able to achieve it. But Kyle Jones could.

...

I have no idea what you mean by "logical coherence" when you refer to VM's code, except that people who work a lot on VM are able to add new features without breaking old ones. But looking at the code itself, I see things like: ... Code related to message threading is scattered all over the code base. That doesn't strike me as logical or coherent. AFAICS, the vm-thread library should get a list of messages, and use that to generate and maintain a tree structure that vm-summary can present to the user. What are the other 16 files doing in that list?

vm-thread.el builds the threads database for a folder, which can then be used by all other modules. For instance, vm-motion.el uses it to go to a parent message. vm-delete.el uses it to delete a message along with its thread-subtree and so on. But the internal structure of the threads database is exclusively handled inside vm-thread.el, which is the way it should be.
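
The arrangement described here, one module owning the data structure while others go through a narrow interface, can be sketched as follows. This is a hypothetical Python illustration, not VM's Emacs Lisp: the class `ThreadDB` and its method names are invented for the example and do not correspond to functions in vm-thread.el.

```python
from typing import Dict, List, Optional


class ThreadDB:
    """Hypothetical threads database; callers use only the public methods."""

    def __init__(self) -> None:
        # Internal representation: private to this class.
        self._parent: Dict[str, Optional[str]] = {}
        self._children: Dict[str, List[str]] = {}

    def add(self, msg_id: str, parent_id: Optional[str] = None) -> None:
        """Record a message and, if known, the message it replies to."""
        self._parent[msg_id] = parent_id
        self._children.setdefault(msg_id, [])
        if parent_id is not None:
            self._children.setdefault(parent_id, []).append(msg_id)

    def parent_of(self, msg_id: str) -> Optional[str]:
        """What a motion command needs: the message this one replies to."""
        return self._parent.get(msg_id)

    def subtree_of(self, msg_id: str) -> List[str]:
        """What a delete command needs: the message plus all its descendants."""
        result = [msg_id]
        for child in self._children.get(msg_id, []):
            result.extend(self.subtree_of(child))
        return result
```

With an interface like this, a motion command would call only `parent_of` and a delete command only `subtree_of`. Many files would still reference the threading module, but none would depend on how the tree is stored.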

...

I can't imagine that this is the only example of excessive coupling in the codebase, although I admit threading is what I've been working on most.

Oh, that is good to know. The threading code had lots of subtle bugs, which took me ages to fix. In fact, I now regard the threads database as a challenge problem for programming theorists like me. I hope you now have at least a reasonably correct code base to work with.

Cheers,
Uday

Stephen J. Turnbull

Friday, 3 February

5:06 a.m.

Uday S Reddy writes:

...

as either being "part of VM" or "add-on for VM". The message-history stuff is an add-on, and you have to explicitly plug it in by adding invocations in your .emacs or .vm file, which I suppose you did!

I did have a customization of vm-message-history-max to 1 (custom wouldn't accept nil or 0, and now it doesn't recognize that variable!), but that was *after* the stupid thing started popping up and covering the whole screen. I've removed that, we'll see if it goes away now. (Anyway, a size configuration should not enable a major feature like that!)

...

The thumbnails are a part of core VM, but I have added a way to disable them. (Set `vm-mime-thumbnail-max-geometry' to nil.)

That's hardly discoverable. :-(

...

A piece of software is good semantically if it is doing the right sequence of operations in the right order.

Sure, but if you can't get a pretty good reading on that from reviewing the program source, "open source" is meaningless.

...

But if the software had bad semantics to start with, I believe that no amount of refactoring will fix it.

By definition, refactoring (and in general good structure) cannot fix bugs. However, it can expose bugs to review *before* they manifest in use, or make diagnosis and repair far easier. Malaria nets are far more effective than slapping mosquitos when they bite you!

...

If you will indulge me, I can suggest some visual imagery. Think of all (the instances of) primitive operations as grains of sand. The code is the mason that assembles these grains into a nice castle that the user will enjoy using and the maintainers will enjoy tweaking. The beauty of the castle is an entirely separate issue from the beauty of the mason. If the mason is good-looking, I wouldn't complain, but if the mason is unkempt, it wouldn't bother me much either.

I'll call your metaphor and raise you 10. Look up "Urayasu Japan liquefaction" and see some "beautiful" pictures of what happens when you let the grains of sand take care of themselves. Here's a cute one of the sewers: http://www.npr.org/2011/04/07/135181474/in-japan-shaken-soil-turned-soft-...

...

I am not saying you should tweak it to see what happens. But the acid test of your design is whether it can be tweaked in the future without making it fall apart.

Sure, but good structure and clear style can help you assess whether that is likely or not.

...

"Seeing the code do what you designed it to do" is a short-term goal, but that is not good enough, is it? Long-lived software needs to be maintainable.

Hey, that's *my* line! :-)

...

> I can't imagine that this is the only example of excessive coupling in the codebase, although I admit threading is what I've been working on most.

Oh, that is good to know. The threading code had lots of subtle bugs, which took me ages to fix. In fact, I now regard the threads database as a challenge problem for programming theorists like me.

By the way, one reason this seems so hard is because VM treats threading as a kind of sorting, and you're tweaking that. But thread structure is non-linear: it's a tree structure, and the usual order in summaries is generated by traversing the tree in pre-order. Lisp is quite fast at doing that kind of thing -- it's not at all clear to me that functions like deleting or editing a message need to know anything about threading (just regenerate the summary from scratch, which you're most likely going to do anyway), and there is a well-known algorithm (RFC 5256) for generating the tree which is robust to occasionally-bogus data.
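
The pre-order traversal mentioned above can be sketched in a few lines. This is a Python illustration under stated assumptions, not code from VM or jwz-thread.el: `summary_order` is a hypothetical name, and the tree is assumed to be given as a mapping from each message ID to the list of its replies.

```python
from typing import Dict, Iterator, List, Tuple


def summary_order(children: Dict[str, List[str]],
                  roots: List[str]) -> Iterator[Tuple[int, str]]:
    """Yield (depth, message_id) pairs in pre-order: each parent before its replies."""
    # An explicit stack avoids recursion limits on very deep threads.
    stack = [(0, root) for root in reversed(roots)]
    while stack:
        depth, msg = stack.pop()
        yield depth, msg
        # Push replies in reverse so the first reply is visited first.
        for child in reversed(children.get(msg, [])):
            stack.append((depth + 1, child))
```

The depth in each pair is what a summary would use for indentation. Because the whole order falls out of one pass over the tree, regenerating the summary after a deletion is just a matter of running the traversal again.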

...

I hope you now have at least a reasonably correct code base to work with.

Yes, I do. It's called jwz-thread.el, after the algorithm's original author :-), and it implements RFC-5256-standard threading. It's less than half the size of vm-thread.el, and about two-thirds of that is documentation. It's noticeably faster than vm-thread.el in building the database. Best of all, it only took about 4 hours to code, and another two to find and fix a few stupid mistakes, since Jamie did all the hard work.

Making this actually work in VM is another question; I've spent about 20 hours on that, and I'm not close yet. Too much coupling....
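
For readers unfamiliar with references-based threading, the core linking step can be sketched as below. This is a deliberately simplified Python illustration, not jwz-thread.el and not a complete implementation of RFC 5256 or of Jamie Zawinski's algorithm: it only links messages along their References chains and omits later steps such as pruning placeholders for missing messages. The names `link_by_references` and `_is_ancestor` are hypothetical.

```python
from typing import Dict, List, Optional


def _is_ancestor(parent: Dict[str, Optional[str]], candidate: str, node: Optional[str]) -> bool:
    """Return True if `candidate` is `node` itself or one of its ancestors."""
    while node is not None:
        if node == candidate:
            return True
        node = parent.get(node)
    return False


def link_by_references(messages: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Map each message ID to its parent ID, or None for a thread root.

    `messages` maps a Message-ID to its References list, oldest first.
    IDs seen only in References get placeholder entries, so a thread
    stays connected even when an intermediate message is missing.
    """
    parent: Dict[str, Optional[str]] = {}
    for msg_id, refs in messages.items():
        chain = list(refs) + [msg_id]
        for seen in chain:
            parent.setdefault(seen, None)
        for older, newer in zip(chain, chain[1:]):
            # Keep the first parent found, and never create a loop:
            # this is what makes the linking tolerant of bogus headers.
            if parent[newer] is None and not _is_ancestor(parent, newer, older):
                parent[newer] = older
    return parent
```

In this sketch a pair of messages that each claim the other as an ancestor ends up as a single root with one child instead of a cycle, which is one example of the robustness to occasionally-bogus data.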

Uday S Reddy

Friday, 3 February

10:30 a.m.

Stephen J. Turnbull writes:

...

> The thumbnails are a part of core VM, but I have added a way to disable them. (Set `vm-mime-thumbnail-max-geometry' to nil.)

That's hardly discoverable. :-(

Agreed. Please feel free to report such issues on viewmail-bugs or viewmail-info mailing lists!

...

Sure, but if you can't get a pretty good reading on that from reviewing the program source, "open source" is meaningless.

I think VM was designed to be just free software. "Open source" is a newer concept. I would agree that VM doesn't meet the standards of open source. Perhaps, one day it will, if we keep rewriting things. That is a good goal!

...

The trunk version of VM you are using does not treat threading as a form of sorting any more. I didn't know about RFC 5256, but I ran into Jamie Zawinski's web page on the threading algorithm and used its ideas. But, VM's algorithm was pretty similar to it already. VM predates Netscape, does it not? And, Jamie was a VM user. I don't know if he still is.

The current status of threading in VM is that it is treated as a form of grouping, and all VM operations (deleting, saving, printing and what have you) can operate on threads. `vm-enable-thread-operations' enables it. So, the threads database is maintained at all times, and it is updated consistently during additions and deletions. There is also a host of "cached" information (the visible lineage of each message, the subtree rooted at each message etc.) which needs to be updated consistently.

All kinds of problems cropped up in the corner cases (dealing with duplicate copies of messages, or subject line changing in the middle of a thread etc.) and each of them took me ages to fix. I came to the point of throwing away the old code and rewriting it from scratch several times. But I was hopeful that the problems were minor and I was converging on the right code. I think I did. But, still, I hope to rewrite it from scratch some day and think about why it proved to be so hard.

Cheers,
Uday

Stephen J. Turnbull

Friday, 3 February Fri, 3 Feb

2:31 p.m.

Uday S Reddy writes:

...

> Sure, but if you can't get a pretty good reading on that from reviewing the program source, "open source" is meaningless.

I think VM was designed to be just free software. "Open source" is a newer concept. I would agree that VM doesn't meet the standards of open source. Perhaps, one day it will, if we keep rewriting things. That is a good goal!

It's probably true that Kyle didn't care if anyone else could read his code as long as he could. Nevertheless, I don't think that's good practice, and I think it's worth worrying about "superficial" stuff like syntax, nice and consistent indentation, and all the stuff that the gurus (James Martin, Michael Jackson, et al) worry about and teach CS undergrads.

...

The trunk version of VM you are using does not treat threading as a form of sorting any more.

Eh? I may be a few weeks behind, but in the version I synched to most recently the ordering of the message list is still done in the basic sort function. That just ain't right.

...

But, VM's algorithm was pretty similar to it already.

Not in several important respects, although I'll grant that the data structures are similar.

...

VM predates Netscape, does it not? And, Jamie was a VM user. I don't know if he still is.

Not since he started working for Netscape; he was their mail guru and dogfooded.

...

All kinds of problems cropped up in the corner cases (dealing with duplicate copies of messages, or subject line changing in the middle of a thread etc.)

That latter shouldn't matter. If it does, the algorithm is not like Jamie's.

...

and each of them took me ages to fix. I came to the point of throwing away the old code and rewriting it from scratch several times. But I was hopeful that the problems were minor and I was converging on the right code. I think I did. But, still, I hope to rewrite it from scratch some day and think about why it proved to be so hard.

Well, you'd better get cracking, because I need to show my version to Bill Janssen at Pycon in March (he's implementing it for the Python email module :-).

Uday S Reddy

Friday, 3 February Fri, 3 Feb

10:41 p.m.

Stephen J. Turnbull writes:

...

Ok, it looks like we have agreement on that issue :-)

...

Eh? I may be a few weeks behind, but in the version I synched to most recently the ordering of the message list is still done in the basic sort function. That just ain't right.

Quoting from the current VM Manual, Sec. 10.1: "Unlike in previous versions of VM, threading is not a form of sorting. You can sort threads by the usual sort keys and the sort order will apply to at least the root messages of threads. Sorting threads by subject, for instance, can be a quick way to find threads with similar subject lines. Sorting them by date would sort them chronologically according to when the threads were initiated. Sorting them by activity is a variant of the chronological order where the dates of latest activity are given prominence instead of the dates of the initial messages."

How the threads affect the sort order is of course handled in the sorting function. There are some mathematical compositions for sort-orders, such as the lexicographic combination etc. Once you plug the right sort-order into the sorting function, it will produce a threaded sort combined with various other sort keys. But building and maintaining the thread-tree is entirely separate from sorting. That is done in the vm-thread.el module.

...

> But, VM's algorithm was pretty similar to it already.

Not in several important respects, although I'll grant that the data structures are similar.

I am not sure that the data structures are entirely similar, but the principles certainly are. When I first looked at VM's code, it was quite impenetrable. So, I went around googling, found Jamie's page, understood what the issues were, and then came back and read Kyle's code. It had pretty much the same ideas. The details differ. Jamie is only concerned with building the thread tree. He doesn't care to maintain it. On the other hand, in VM, we maintain it and do insertions/deletions as messages arrive or leave. For instance, Jamie says, if there is an empty container, you can "nuke it". We can't. A message that needs to go into the empty container might arrive at any time. Instead, what we do is to calculate the visible lineage of each message separately (called "thread-list") and cache it. If a new message arrives to fill a hole, then the cached info is erased and recalculated. (There is a section on threading in the "VM Internals" chapter of the VM Manual. Please see that for full details.) One of the reasons I took pains to fix the problems with the threading code is that I would like to extend it as per the modern needs: merge broken threads, detach subthreads that go off-topic, and worry about identifying duplicate copies that come with different message id's (thanks to our Mailman, gmane and gnu.org friends). I don't know if you will be able to do all these things if you use Jamie's build-and-discard approach to the threads database. I do buy the fact that we have a lot of cycles available these days and can afford to be lazy in coding.

...

> and each of them took me ages to fix. I came to the point of throwing away the old code and rewriting it from scratch several times. But I was hopeful that the problems were minor and I was converging on the right code. I think I did. But, still, I hope to rewrite it from scratch some day and think about why it proved to be so hard.

Well, you'd better get cracking, because I need to show my version to Bill Janssen at Pycon in March (he's implementing it for the Python email module :-).

Oh, if I do understand why it proved to be so hard, it will probably go into a journal or perhaps it will turn into a bigger project!

Cheers, Uday

Stephen J. Turnbull

Saturday, 4 February Sat, 4 Feb

2:13 p.m.

Uday S Reddy writes:

...

Stephen J. Turnbull writes:

...

Ok, it looks like we have agreement on that issue :-)

> Eh? I may be a few weeks behind, but in the version I synched to most recently the ordering of the message list is still done in the basic sort function. That just ain't right.

Quoting from the current VM Manual, Sec. 10.1: "Unlike in previous versions of VM, threading is not a form of sorting.

I agree that VM is better than Netscape 4.0 in the UI, since it has a separate toggle for threading, rather than listing it as a kind of sorting. But the implementation contradicts what the docs say:

...

How the threads affect the sort order is of course handled in the sorting function.

There's no "of course" about it. It's much simpler if you sort the sibling sets in the tree rather than trying to use thread information in the sort. It's not clear to me that it's all that much slower in practice (vm-sort is O(N log N), jwz-thread is O(N log M) where N is messages in folder and M is number of siblings per parent). In fact, it turns out that VM's sorting function (as of the version I have) special cases the thread key to ensure it trumps the other keys. So not only is the sorting function formally the way (user-visible) threads are actually computed, but this approach greatly complicates the code that actually does the work. IIRC this complexity is introduced in several places in the main sort function. By contrast, my code doesn't need a vm-sort function at all (although of course it needs all the comparator functions); it just needs Emacs's `sort', and a function that tries each of a list of comparisons in order until it gets a not equal result.
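The "function that tries each of a list of comparisons in order" can be sketched as follows. This is only an illustration in Python, not code from VM or from either poster; the names `chain`, `by_subject` and `by_date` and the sample messages are hypothetical.

```python
from functools import cmp_to_key

def chain(*comparators):
    """Build a comparator that tries each comparator in order and
    returns the first result that is not "equal" (non-zero)."""
    def compare(a, b):
        for cmp in comparators:
            result = cmp(a, b)
            if result != 0:
                return result
        return 0  # every comparator considered the two items equal
    return compare

# Hypothetical comparators over dict-shaped messages.
def by_subject(a, b):
    return (a["subject"] > b["subject"]) - (a["subject"] < b["subject"])

def by_date(a, b):
    return (a["date"] > b["date"]) - (a["date"] < b["date"])

messages = [
    {"subject": "b", "date": 2},
    {"subject": "a", "date": 3},
    {"subject": "a", "date": 1},
]

# The generic library sort is enough once the comparators are chained.
ordered = sorted(messages, key=cmp_to_key(chain(by_subject, by_date)))
```

With a helper like this, no special-purpose sort routine is needed; the list of sort keys simply becomes the argument list of `chain`.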

...

pretty much the same ideas. The details differ.

As far as I can see, there are a couple of ideas missing. In particular, the empty container at root concept. ISTR something else, but it was a while back.

...

Jamie is only concerned with building the thread tree. He doesn't care to maintain it.

Well, if it's fast enough (and my experiments suggest it will be), you don't need to maintain it, just rebuild it as needed.

...

I don't know if you will be able to do all these things if you use Jamie's build-and-discard approach to the threads database.

There are a number of reasons why I don't think discarding the containers is a good idea in VM. First, the thread tree is not folder-local, it's universal (ie, across users and nodes on the Internet!) Once I get the basic algorithm working, I plan to look at computing "views" per-folder from a global database for the VM session. (Maybe this isn't a great idea globally for a VM session, but virtual folders mean a per-real-folder database doesn't fly either. If you use V A and the like a lot, you're going to be building a lot of databases unless you reuse existing thread information.) Also, note that Jamie's algorithm works fine for *efficiently* adding information from new messages to the database (ie, the tree). Finally, you're quite right, that (limited!) use of operations on trees like reparenting (sub)trees, and adding or removing nodes can be very useful to help the user present the threads. One that you didn't mention here but is in the VM code is one that bugs me a lot, namely multiple instances of the same message. The database of containers can group these for you in a flexible way, the summary list can't.

...

Oh, if I do understand why it proved to be so hard, it will probably go into a journal or perhaps it will turn into a bigger project!

Message threading in principle isn't hard, because the RFC 1036/5322 definitions of References and In-Reply-To guarantee a tree if all MUAs implement those fields conformantly, and the information provided gives the appropriate links. Handling various error conditions can be a little tricky conceptually, but in practice they're rare, so don't need to be implemented efficiently. The only real issue is guaranteeing no loops in the face of arbitrary input, and Jamie's algorithm does that for you already. Even implementing the more advanced operations is going to be nearly trivial, I think, as long as sorting is subordinate to threading, not the other way around.

If you want a problem that baffles me, at least, try merging the per-file history trees generated by CVS into a project-wide tree a la git or bzr *without* tree metadata like branch names, but using only per-commit metadata like timestamp and commit log.
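On the point about guaranteeing no loops for arbitrary input: one common way to get that guarantee is to refuse any parent/child link that would make a container its own ancestor. The Python sketch below illustrates that idea only. It is a simplification, not Jamie's full algorithm and not VM's code, and `Container`, `link` and `build_forest` are hypothetical names.

```python
class Container:
    """One node of the thread forest; it may exist before its message arrives."""
    def __init__(self, msg_id):
        self.msg_id = msg_id
        self.parent = None
        self.children = []

def is_ancestor_or_self(candidate, node):
    """Walk up from node; True if candidate is node or one of its ancestors."""
    while node is not None:
        if node is candidate:
            return True
        node = node.parent
    return False

def link(parent, child):
    """Attach child under parent unless child already has a parent or the
    link would create a loop. Returns True if the link was made."""
    if child.parent is not None or is_ancestor_or_self(child, parent):
        return False
    child.parent = parent
    parent.children.append(child)
    return True

def build_forest(references):
    """references maps a message-id to the ids it references, oldest first."""
    table = {}
    def container(msg_id):
        if msg_id not in table:
            table[msg_id] = Container(msg_id)
        return table[msg_id]
    for msg_id, refs in references.items():
        chain = [container(r) for r in refs] + [container(msg_id)]
        for parent, child in zip(chain, chain[1:]):
            link(parent, child)
    return [c for c in table.values() if c.parent is None]

# Two malformed messages that each claim to be a reply to the other.
roots = build_forest({"a": ["b"], "b": ["a"]})
```

Because the second, contradictory link is rejected, the result is still a tree: "b" ends up as the root with "a" as its child.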

Uday S Reddy

Sunday, 5 February Sun, 5 Feb

11:48 a.m.

Stephen J. Turnbull writes:

...

I agree that VM is better than Netscape 4.0 in the UI, since it has a separate toggle for threading, rather than listing it as a kind of sorting. But the implementation contradicts what the docs say:

No, I don't think so. You can turn on threading and sort the messages by various sorting orders. Unless there is a bug in the implementation, they should all work. The following bug-report provides some background as to why this was important: https://bugs.launchpad.net/vm/+bug/498323 In Kyle Jones's VM, there was one built-in sort order for threads (what we call 'activity' order these days), and that is all you got. People tried to persuade him to make it flexible but he didn't have any interest in it. I myself found activity order confusing when dealing with large threads and wished for the 'date' order. Some people prefer to use the 'reversed-date' order and presumably 'reversed-activity' order. So, these are the 4 important orders people want to use in practice. In future, we might have 'delivery-date' and its reverse, and God knows what else.

...

> How the threads affect the sort order is of course handled in the sorting function.

There's no "of course" about it. It's much simpler if you sort the sibling sets in the tree rather than trying to use thread information in the sort. It's not clear to me that it's all that much slower in practice (vm-sort is O(N log N), jwz-thread is O(N log M) where N is messages in folder and M is number of siblings per parent).

So, you would sort the roots of all threads by the requested sort order, then sort the children of each root, and then their children and so on, recursively? That was in fact the first idea that occurred to me. But then I figured out a way to incorporate everything in just one sort applied to all messages (I still do remember how people used to sort punched cards in the old days!) and it became straightforward.

The trick is based on the following property: The sort order of two messages (m1, m2) in a threaded sort reduces to the sort order of their oldest different ancestors (p1, p2). That is, m1 precedes m2 iff p1 precedes p2. If m1 and m2 belong to different threads, then their oldest different ancestors would be their thread roots. Then the property is clear. If they belong to the same thread, then you can take the thread minus its root as a message list, and the same reasoning applies within the pared down message list.

So, vm-sort-compare-thread, given m1 and m2, returns the corresponding p1 and p2, and the remaining sort-keys will order p1 and p2 in an appropriate way. I think I would call this clean and elegant, rather than "complex". If I was working in a higher-order programming language, I would make its type to be vm-sort-compare-thread : sort-order -> sort-order which captures more clearly what is going on.
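The "oldest different ancestors" property can be illustrated with a small Python sketch of a comparator transformer in the spirit of the sort-order -> sort-order type. It is not the code of vm-sort-compare-thread. The `parent` and `date` tables and the names `lineage`, `thread_compare` and `by_date` are hypothetical, and the sketch assumes the base order never reports two distinct messages as equal.

```python
from functools import cmp_to_key

# Hypothetical folder: message -> parent (None for thread roots), and dates.
parent = {"a": None, "b": "a", "c": "a", "d": "b", "x": None, "y": "x"}
date = {"a": 10, "b": 12, "c": 11, "d": 13, "x": 5, "y": 20}

def lineage(m):
    """Return the chain of ancestors from the thread root down to m."""
    chain = []
    while m is not None:
        chain.append(m)
        m = parent[m]
    return chain[::-1]

def thread_compare(base_cmp):
    """Turn a plain sort order into a threaded one (sort-order -> sort-order)."""
    def compare(m1, m2):
        l1, l2 = lineage(m1), lineage(m2)
        for p1, p2 in zip(l1, l2):
            if p1 != p2:
                # p1, p2 are the oldest different ancestors of m1, m2.
                return base_cmp(p1, p2)
        # One message is an ancestor of the other: the ancestor goes first.
        return len(l1) - len(l2)
    return compare

def by_date(p1, p2):
    return (date[p1] > date[p2]) - (date[p1] < date[p2])

# A single sort over all messages yields the threaded order.
ordered = sorted(parent, key=cmp_to_key(thread_compare(by_date)))
```

Here the thread rooted at "x" comes first because "x" is older than "a", and within the "a" thread the reply "c" precedes "b" and its child "d".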

...

As far as I can see, there are a couple of ideas missing. In particular, the empty container at root concept. ISTR something else, but it was a while back.

I don't recall all the details either. As I said, I used Jamie's web page to understand what VM was doing. But I made no attempt to modify VM's algorithm to correspond to Jamie's. One difference I remember vividly is that Kyle Jones was keeping cycles in the thread-tree and detecting them when traversing the tree. This made things unstable. One time the cycle might get resolved one way and another time another way. So, I used Jamie's recommendation not to keep cycles in the thread tree at all, and that made things cleaner and clearer.

...

I am not sure I understand how the Internet comes into the picture. VM's thread trees are indeed folder-local. Virtual folders have their own thread trees. I see that it would be feasible to keep a single thread tree for an entire VM session. That would eliminate the issues of having to update the thread trees of virtual folders when an underlying folder changes. However, you would need to address the problem of reclaiming the thread tree nodes of the folders that have been quit, basically rolling out your own memory manager for the thread trees. So, that doesn't come for free!

...

Once I get the basic algorithm working, I plan to look at computing "views" per-folder from a global database for the VM session. (Maybe this isn't a great idea globally for a VM session, but virtual folders mean a per-real-folder database doesn't fly either. If you use V A and the like a lot, you're going to be building a lot of databases unless you reuse existing thread information.)

Indeed. But, as you observed earlier, building the thread-tree is not all that expensive. So, doing it as part of virtual folder creation is quite acceptable. What VM does avoid is the rebuilding of the thread tree when you get new mail or expunge a folder or just re-sort the messages. It would be feasible to rebuild the thread trees in those cases. I don't have strong objections to that if the thread-tree building is fast enough. One worry is that the cycles in the thread tree might get resolved differently each time, and that might confuse the user.

...

Finally, you're quite right, that (limited!) use of operations on trees like reparenting (sub)trees, and adding or removing nodes can be very useful to help the user present the threads. One that you didn't mention here but is in the VM code is one that bugs me a lot, namely multiple instances of the same message. The database of containers can group these for you in a flexible way, the summary list can't.

Yes, indeed, multiple instances are definitely a serious issue. I referred to them as "duplicate copies" of messages earlier. They did give me significant trouble because I didn't worry about them initially. But, later, I discovered that it was important to keep track of which copy among the duplicate copies is regarded as the canonical copy in the thread tree. The duplicate copies share the same message-id, but we cannot assume that they are identical. In the simplest case, one of them might have the 'deleted' attribute set and another one not. The duplicate copies have different UID's on the IMAP server. Sometimes, they even have different References headers because the mailing-list admins munge them!

...

> Oh, if I do understand why it proved to be so hard, it will probably go into a journal or perhaps it will turn into a bigger project!

Message threading in principle isn't hard, because the RFC 1036/5322 definitions of References and In-Reply-To guarantee a tree if all MUAs implement those fields conformantly, and the information provided gives the appropriate links.

Coming up with the algorithms wasn't hard, especially because they were there already from Kyle Jones. But smoothing out the problems and fixing the bugs was hard. I do have some idea why that was. When I wrote out the invariants for the data structure, they turned out to be longer than a screenful. So, at each point in the (internal) code, I had to know which of the invariants were true and which were not. And, that is quite a lot of intellectual labor!

So, the research question for me is how to layer the data structure so that the invariants are modularized and the intellectual labor in working with the code is reduced. I am also interested in how much the existing verification tools (such as Code Contracts from Microsoft) can help in checking these invariants, and whether new verification technology would be needed.

Cheers, Uday

Stephen J. Turnbull

Sunday, 5 February Sun, 5 Feb

3:23 p.m.

Uday S Reddy writes:

...

So, you would sort the roots of all threads by the requested sort order, then sort the children of each root, and then their children and so on, recursively?

Nope. Recursively, yes, but depth-first.
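A depth-first version of "sort the sibling sets" might look like the Python sketch below: each sibling set is sorted on its own, and every node is emitted immediately before its own sorted descendants. This is an illustration under assumed names (`Container`, `flatten`), not code from either implementation.

```python
class Container:
    """A thread-tree node with a sort key (here a date) and child nodes."""
    def __init__(self, msg_id, date, children=None):
        self.msg_id = msg_id
        self.date = date
        self.children = children or []

def flatten(siblings, key):
    """Sort one sibling set, then emit each node followed by its
    recursively sorted descendants (depth-first, pre-order)."""
    out = []
    for node in sorted(siblings, key=key):
        out.append(node)
        out.extend(flatten(node.children, key))
    return out

roots = [
    Container("r2", 5, [Container("r2b", 9), Container("r2a", 6)]),
    Container("r1", 1, [Container("r1a", 7, [Container("r1a1", 8)])]),
]

# The summary order falls out of the tree walk; no thread key is needed
# inside the comparison itself.
summary = [c.msg_id for c in flatten(roots, key=lambda c: c.date)]
```

Each call only ever sorts one set of siblings, which is where the O(N log M) estimate quoted earlier in the thread comes from.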

...

I figured out a way to incorporate everything in just one sort applied to all messages (I still do remember how people used to sort punched cards in the old days!)

Sure, there was an article about it in Martin Gardner's "Mathematical Recreations" column in Scientific American in about 1967. :-)

...

and it became straightforward.

Yes and no. Sorting is indeed straightforward, except for the fact that you have to define a separate function which reorders the sort keys (specifically, to always put the thread "key" first), and you now have to call a threading function in the sort code.

...

I think I would call this clean and elegant, rather than "complex".

I guess we'll have to differ on that.

...

I am not sure I understand how the Internet comes into the picture.

It's the largest relevant referent for "universe" in this context. :-)

...

However, you would need to address the problem of reclaiming the thread tree nodes of the folders that have been quit, basically rolling out your own memory manager for the thread trees. So, that doesn't come for free!

Why do you have to reclaim anything? As I said in my previous post, the thread forest is universal, although the view VM has is missing a lot of nodes. See next point.

...

Indeed. But, as you observed earlier, building the thread-tree is not all that expensive. So, doing it as part of virtual folder creation is quite acceptable.

Except that you sacrifice information, specifically about the messages that aren't in the virtual folder. If you build from scratch, those messages' ids will correspond to empty containers, although they are available to VM in real folders. If you keep a global database, when you quit a real folder, the implementation could note on the container which folder(s) the message was found in, and fetch it when you say "go to parent" in a sparse virtual folder even if the required real folder isn't in the session any more. I think that would be useful. This information could even be kept in an external database (don't tell Jamie I said that, he'd rise up out of his coffin before sunset and throttle me! or at least call me "Wormhead" :-) so that you'd know about all messages you ever threaded.

...

One worry is that the cycles in the thread tree might get resolved differently each time, and that might confuse the user.

Do you really find cycles in the thread graph that often? (The point is not that you don't have to deal with them, of course, you do. The point is that if they're rare, user confusion will be even more so, since not all cycles would confuse the user if broken differently.)

...

When I wrote out the invariants for the data structure, they turned out to be longer than a screenful.

Hm. I have a couple of checks for invariants on the thread forest (eg, a non-root node must be a member of its parent's children, and each of a node's children must have it as the parent), and on the root set (every member must have no ancestors with messages available in a real folder in the session). But that's about it. I've seen the various conditions you've been adding in comments in vm-thread.el, but they're very low-level (and in many cases I couldn't figure out why they were imposed with less than a minute's thought for each -- since there are several dozen, really understanding all that would take hours, I suspect).
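The two parent/children invariants mentioned above are simple enough to check mechanically. The following Python sketch shows one way to do it over a hypothetical dict-based forest. The name `check_forest` and the data layout are assumptions, and the root-set invariant is left out because it depends on which folders are open in the session.

```python
def check_forest(nodes):
    """nodes maps id -> {"parent": id or None, "children": [ids]}.
    Returns a list of violations; an empty list means both invariants hold."""
    problems = []
    for node_id, node in nodes.items():
        parent_id = node["parent"]
        # Invariant 1: a non-root node is listed among its parent's children.
        if parent_id is not None and node_id not in nodes[parent_id]["children"]:
            problems.append(f"{node_id} is missing from the children of {parent_id}")
        # Invariant 2: every child names this node as its parent.
        for child_id in node["children"]:
            if nodes[child_id]["parent"] != node_id:
                problems.append(f"{child_id} does not name {node_id} as its parent")
    return problems

good = {
    "root": {"parent": None, "children": ["reply"]},
    "reply": {"parent": "root", "children": []},
}
broken = {
    "root": {"parent": None, "children": []},  # forgot to register "reply"
    "reply": {"parent": "root", "children": []},
}
```

Running such a check after every insertion or deletion in a test suite is one cheap way to catch the kind of corner-case bugs discussed in this thread.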

Uday S Reddy

Sunday, 5 February Sun, 5 Feb

4:51 p.m.

Stephen J. Turnbull writes:

...

I do add the thread "key" first if threading is turned on, but there is no reordering. In typical use of VM, threading would not be among the global sort keys. And, as you know, I don't lose sleep over calling functions. (I believe in structured programming and information hiding, but the rest of software engineering, I regard as a bunch of dogma to be taken with a large grain of salt. But, still, you have evoked my interest in coupling issues and I will continue to think about them.)

...

Why do you have to reclaim anything? As I said in my previous post, the thread forest is universal, although the view VM has is missing a lot of nodes. See next point.

In typical use of VM, we are using a bunch of current folders, and occasionally we need to go search for something in the archived mail. If the thread database grows as soon as you load archived folders, never to be reclaimed, then that might degrade performance. As you might have seen, memory leakage is a serious concern in the FSF Emacs world right now.

...

Yes, that seems like a useful extension. I look forward to playing with it when you are done!

...

I don't know how common they are either. Here is an example from my test suite (debbugs.gnu.org being the culprit):

Uday S Reddy

Monday, 6 February Mon, 6 Feb

9:42 a.m.

Uday S Reddy writes:

...

Another thing I might add. A solution for fixing broken threads is actually to treat more messages as duplicates of each other. Mailing lists, newsgroups and bug trackers etc. change the message-id's of messages, discarding the original message-id's. The threading code would need to recognize the duplicates that arise in this way and identify messages even if their message-id's differ. Keeping track of the canonical message in each container will prove to be even more important with this than it has been!

Cheers, Uday

