Re: [Bug: 21.4.12] mail-extr.el broken for ISO-8859-2

Wednesday, 25 June 2003

        ...
>>>> "Stephen" == Stephen J Turnbull <stephen(a)xemacs.org> writes:
>>>> "Jan" == Jan Rychter <jan(a)rychter.com> writes:

 Jan> And indeed, the Emacs version of mail-extr.el seems to use
 Jan> [:alnum:] and [:alpha:]:

 Stephen> Which is the right thing to do.  But our regexp engine doesn't
 Stephen> currently support those.  It's not clear to me what the right
 Stephen> way to do that is, given the brokenness of the POSIX standard
 Stephen> with respect to multilingual text.  I guess the best bet is
 Stephen> simply to use Unicode's idea of whether something is a word
 Stephen> component or not.

 Stephen> Volunteers?  Note that the implementation will be very
 Stephen> different in 21.4 (which has no good access to Unicode
 Stephen> tables) and 21.5.

Well, it seems there were no volunteers :-( (and as I wrote, I am not
able to fix it myself).

 Jan> Now, opinions are divided on whether
 Jan> mail-extract-address-components should really get multilingual
 Jan> text or not.  In any case, the current solution is rather broken.

 Stephen> mail-extr.el is basically a collection of hacks and kludges
 Stephen> anyway; you may as well just add the relevant characters to
 Stephen> those regexps.  That will work in 21.4 as well as 21.5.  Or
 Stephen> (as a heuristic), just use [^\000-\255] (you'd have to do that
 Stephen> as an alternative rather than a character class).  I bet that
 Stephen> will work well enough in practice.

Well, I can continue to kludge around this by adding the relevant
characters myself, but...

The current situation basically means that mail-extr is broken for
anyone who doesn't use ASCII + ISO-8859-1. It breaks BBDB and Supercite
for me. I'd consider it quite serious breakage.

--J.
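
[Illustration, not part of the original message.] The two ideas quoted
above, a Unicode-aware test for word components and the "anything above
code 255" heuristic, can be sketched roughly as follows. This is Python,
not Emacs Lisp, and it does not use the XEmacs regexp engine; the names
HEURISTIC_WORD, is_word_char_unicode and heuristic_words are hypothetical
and exist only for this sketch. It also assumes Unicode code points,
whereas XEmacs 21.4 uses its own internal character codes, so the
boundary at 255 is only an analogy.

```python
import re

# Assumption: letters already handled for "ASCII + ISO-8859-1" are listed
# explicitly (ASCII alphanumerics plus the Latin-1 letter ranges).
LATIN1_WORD = r"[A-Za-z0-9\u00c0-\u00d6\u00d8-\u00f6\u00f8-\u00ff]"

# Heuristic from the thread: additionally accept any character whose code
# lies above 255. It is written as a separate alternative, mirroring the
# remark that it has to be an alternative rather than part of the class.
HEURISTIC_WORD = re.compile(r"(?:" + LATIN1_WORD + r"|[^\x00-\xff])+")


def is_word_char_unicode(ch):
    """Unicode-aware test, similar in spirit to [:alnum:]."""
    return ch.isalnum()


def heuristic_words(text):
    """Return the runs of characters the heuristic treats as words."""
    return HEURISTIC_WORD.findall(text)
```

The heuristic is deliberately crude: it also accepts non-letter
characters above 255, which is the trade-off implied by "will work well
enough in practice".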
