Simon Josefsson <jas(a)extundo.com> writes:
> RFC (2)822 articles should not contain iso-8859-15 characters.
> Perhaps mail-extr is supposed to operate on raw articles, not MIME
> decoded ones?
Ah, you're probably right. Now I'm not sure what's supposed to be
happening, since BBDB gets passed the header after it's been decoded,
but it uses mail-extract-address-components (mail-extr.el) or
rfc822-addresses (rfc822.el) [both part of mail-lib]. So, where's the
bug? There's code in there already to handle latin-1 chars:
(let* ((latin1-ss (string (make-char 'latin-iso8859-1 223)))
       (latin9-ss (string (make-char 'latin-iso8859-15 1759)))
       (latin1-addr (concat "Joe Te" latin1-ss "t <joe.test@foo.org>"))
       (latin9-addr (concat "Joe Te" latin9-ss "t <joe.test@foo.org>")))
  (concat "Works: <" (car (mail-extract-address-components latin1-addr))
          ">, Broken: <" (car (mail-extract-address-components latin9-addr))
          ">"))
=> "Works: <Joe Teßt>, Broken: <Joe Te>"
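To make the raw-versus-decoded distinction concrete, here is a rough sketch. It assumes GNU Emacs with rfc2047.el (from Gnus) loadable; the header value and the sketch-* names are made up for illustration and are not part of BBDB or mail-extr.el. On the wire the name travels as an RFC 2047 encoded word, so the raw header is pure ASCII, and the non-ASCII character only appears after decoding.

```elisp
;; Sketch, assuming GNU Emacs with rfc2047.el (part of Gnus) available.
(require 'rfc2047)

;; Hypothetical raw header value: the non-ASCII letter travels as an
;; RFC 2047 encoded word, so the header itself is pure ASCII.
(defvar sketch-raw-from "=?iso-8859-15?Q?Joe_Te=DFt?= <joe.test@foo.org>")

;; What a caller sees after MIME decoding: a string with a non-ASCII char.
(defvar sketch-decoded-from (rfc2047-decode-string sketch-raw-from))

(defun sketch-ascii-only-p (string)
  "Return non-nil if STRING contains only ASCII characters."
  (not (string-match "[^[:ascii:]]" string)))

(sketch-ascii-only-p sketch-raw-from)      ; non-nil: raw form is ASCII
(sketch-ascii-only-p sketch-decoded-from)  ; nil: decoded form is not
```

If mail-extr really is meant to see only the first kind of string, that would explain why it was never taught about anything beyond latin-1.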
> mail-extr.el is in need of FSF syncing. Perhaps that would be the
> first step? It is a large undertaking though.
Indeed. Since there isn't support (although there is mention of it in
regex.h) for POSIX char classes, this will be even more work. Adding
support for them first would be a good thing, imho. Or perhaps we
could rely on the fact that syntax tables are defined for each part of
the address during parsing and just use \sw as the match character?
I'm not sure.
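A small sketch of the \sw idea, for what it's worth. It assumes a recent GNU Emacs (shy groups, with-syntax-table, and a standard syntax table in which ß has word syntax); sketch-leading-word is a hypothetical helper, not anything in mail-extr.el. It compares an ASCII-only character range against a syntax-class match on a name containing a non-ASCII letter.

```elisp
;; Sketch only: ASCII range versus the syntax-class match \sw.
(defun sketch-leading-word (regexp string)
  "Return the text REGEXP matches at the start of STRING, or nil.
Hypothetical helper, not part of mail-extr.el."
  (let ((case-fold-search nil))
    ;; string-match consults the current syntax table for \sw, so pin
    ;; it to the standard one for the duration of the match.
    (with-syntax-table (standard-syntax-table)
      (when (string-match (concat "\\`\\(?:" regexp "\\)") string)
        (match-string 0 string)))))

(sketch-leading-word "[A-Za-z]+" "Teßt")  ; => "Te"   (stops at ß)
(sketch-leading-word "\\sw+" "Teßt")      ; => "Teßt" (ß has word syntax)
```

The result depends on whichever syntax table is current, so this only helps if mail-extr's own tables give word syntax to the extra characters at the point where the name is matched.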
--
Josh Huber