Re: Failures in "test charset-in-* functions" in mule-tests.el

Sunday, 24 December 2017

 On the twenty-fourth of December, Stephen J. Turnbull wrote: 

...
 So after dealing with the test all characters issue[1] I noticed that
 charsets-in-region and charsets-in-string were failing.  I suspect the
 reason is that in the Unicode build the precedence lists are not tuned
 right.  I guess a real fix would need to be some intelligent approach
 to modifying the precedence list according to the charsets used at
 read time or input (although that won't work if the input is
 Unicode).  The other problem is that "intelligent" really means
 constructing some kind of precedence graph, and it could easily be
 impossible (eg if both Chinese and Japanese were used in the same
 file, you'd need a language tag to disambiguate).

 I guess the first thing I'd try is ensuring that ISO charsets come
 before the windows-xxx and IBM CPxxx versions.

 Any other suggestions? 

We could disable the charsets-in-region tests on the Unicode builds. Or we
could define the order we expect for each language environment. To be
honest, though, with unicode-internal the output of charsets-in-region isn’t
something the user is going to care about, so I lean more towards the
former.

...
 Footnotes: 
 [1]  By the way, why does the Unicode build have a 2^30 repertoire?
 ISO 10646 has a 2^31 repertoire IIRC (maybe 2^32?), but Unicode only
 has 2^21 (precisely, 2^20+2^16). 

Ben’s choice. We do need more than 2^21, for our invalid sequence
characters, but we certainly don’t need the full 2^30.
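
As a quick check of the arithmetic in the footnote, here is a minimal Python
sketch. It is not XEmacs code and the variable names are only illustrative.
It assumes the standard Unicode layout of 17 planes of 2^16 code points each
(U+0000 through U+10FFFF) and takes the 2^30 figure from the question above.

```python
# Unicode code space: 17 planes of 2^16 code points each (assumed layout).
PLANES = 17
PLANE_SIZE = 1 << 16

unicode_size = PLANES * PLANE_SIZE

# "precisely, 2^20+2^16": 16 supplementary planes plus the BMP.
assert unicode_size == (1 << 20) + (1 << 16) == 0x110000

# Number of bits needed to hold the largest code point, U+10FFFF.
bits_needed = (unicode_size - 1).bit_length()

# Size of the internal repertoire mentioned in the footnote.
internal_size = 1 << 30

# Code points left over for non-Unicode uses such as invalid-sequence
# characters (how many of these are actually needed is not stated here).
spare = internal_size - unicode_size

print(unicode_size, bits_needed, spare)
```

So the Unicode range fits in 21 bits without filling them, and anything beyond
it, such as the invalid sequence characters, needs at least one more bit.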

-- 
‘As I sat looking up at the Guinness ad, I could never figure out /
How your man stayed up on the surfboard after forty pints of stout’
(C. Moore)
