On Thu, Apr 30, 1998 at 01:41:39PM +0900, Stephen J. Turnbull wrote:
>>>>> "Hrvoje" == Hrvoje Niksic <hniksic(a)srce.hr> writes:
Hrvoje> 31-bit integers are nice, but they are hardly that
Hrvoje> important to a normal user.
But when Olivier gets his UCS-4 text manipulation stuff working....
Although extremely few users will want to use UCS-4 for itself, the
least buggy quick path to a wide-char Mule in multilingual contexts
uses UCS-4 in the implementation. UCS-4 characters, by a strange
coincidence, just barely fit into a 31-bit integer.
Errr, didn't we agree that 0-FFFFFF (i.e., 24 bits) was enough?
With maximal tagbits, characters get 24 bits; with minimal tagbits,
30 bits (not 31, that's only for integers).
The 6 added bits may be useful later for ucs-2000 though. I don't know
(nor do they seem to know themselves :-) how far they want to go.
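For anyone following the bit counts, here is a rough sketch of the arithmetic. It assumes 32-bit words and treats each payload as a plain unsigned bit field; the constant and function names are made up for illustration and are not XEmacs identifiers. The widths are the ones quoted in this thread, and the UCS-4 upper bound is the 31-bit code space mentioned above.

```python
# Payload widths in bits, as quoted in this thread (32-bit words assumed).
INT_BITS_MINIMAL_TAGBITS = 31   # integers with minimal tagbits
CHAR_BITS_MINIMAL_TAGBITS = 30  # characters with minimal tagbits
CHAR_BITS_MAXIMAL_TAGBITS = 24  # characters with maximal tagbits

# Largest code point in a 31-bit UCS-4 code space.
UCS4_MAX = 0x7FFFFFFF


def fits(value, bits):
    """Return True if the non-negative value fits in `bits` unsigned bits."""
    return 0 <= value < (1 << bits)


def extra_bits():
    """Bits gained for characters when moving from maximal to minimal tagbits."""
    return CHAR_BITS_MINIMAL_TAGBITS - CHAR_BITS_MAXIMAL_TAGBITS
```

Under these assumptions, the 0-FFFFFF range fits a 24-bit character, while the top of the UCS-4 range fits only the 31-bit integer payload and not the 30-bit character one.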
Unifying the Han ideographs through UCS-2/Unicode is possible, but
will surely introduce new coding-system I/O bugs. The UCS-4 approach
will allow us to change the current low-level buffer implementation
with minimal impact on higher-level Mule code by preserving the
coding-system information at the character level as the current Mule
implementation does.
Sorry but I don't parse that.
I think there's a pretty good argument here for pushing at least for
Mule builds to get minimal tagbits by default, maybe not for 21.x, but
soon.
I'm not sure about that. Remember that what I'm writing is still
vaporware and should be considered as such as long as I don't give out
patches. I don't want to see implementation choices made solely on the
basis of things that may exist someday.
OG.