Re: Plea for minimal tagbits

Thursday, 30 April 1998

        ...
>>>> "OG" == Olivier Galibert <galibert(a)pobox.com> writes:
On Thu, Apr 30, 1998 at 01:41:39PM +0900, Stephen J. Turnbull wrote:

...
> Although extremely few users will want to use UCS-4 for itself,
> the least buggy quick path to a wide-char Mule in multilingual
> contexts uses UCS-4 in the implementation.  UCS-4 characters,
> by a strange coincidence, just barely fit into a 31-bit
> integer. 
    OG> Errr, didn't we agree that 0-FFFFFF (aka, 24 bits) was enough
    OG> ? Maximal tagbits have 24bits characters, minimal tagbits
    OG> 30bits ones (and not 31, that's only for integers).

OK, I retract that, then.  I forgot about "Ebola".  I was thinking
that we could avoid a bunch of masking operations by using 31-bit
characters.  But even if the tag is only one bit, we're going to have
to mask it off.
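
To make the bit counts above concrete, here is a minimal sketch in C,
assuming a 32-bit word in which characters carry a two-bit tag in the
low bits, leaving 30 bits of payload.  The tag value, the type name
lisp_word, and the helpers make_char, is_char, and char_code are
hypothetical names for illustration, not taken from the XEmacs sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical): 32-bit word, 2 low tag bits for chars. */
#define CHAR_TAG_BITS 2u
#define CHAR_TAG      0x2u                           /* low bits "10" */
#define CHAR_TAG_MASK ((1u << CHAR_TAG_BITS) - 1u)   /* 0x3 */
#define CHAR_MAX_CODE ((UINT32_C(1) << 30) - 1u)     /* largest 30-bit code */

typedef uint32_t lisp_word;

/* Pack a character code into a tagged word; the code must fit in 30 bits. */
static lisp_word make_char(uint32_t code)
{
    assert(code <= CHAR_MAX_CODE);
    return (code << CHAR_TAG_BITS) | CHAR_TAG;
}

/* Test the tag bits: this is the masking operation under discussion. */
static int is_char(lisp_word w)
{
    return (w & CHAR_TAG_MASK) == CHAR_TAG;
}

/* Recover the code by shifting the tag bits away. */
static uint32_t char_code(lisp_word w)
{
    return w >> CHAR_TAG_BITS;
}
```

Under this assumed layout a 24-bit code such as 0xFFFFFF round-trips
through make_char and char_code with room to spare.  The shift in
char_code is needed however narrow the tag is, which is why a wider
payload would not save the masking step.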

...
> Unifying the Han ideographs through UCS-2/Unicode is possible,
> but will surely introduce new coding-system I/O bugs. 
    OG> Sorry but I don't parse that.

The point is that putting the external coding system tag on an extent
property (as I have suggested) requires the higher-level Mule code to
test the extent property.  Since the extent is different from the
character, the access operation introduces a potential for confusion
and bugs.  Doing it efficiently suggests that functions that operate
on regions or strings should remember the relevant extents.
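
As a rough illustration of "remembering the relevant extents", the
sketch below caches the extent that answered the previous lookup, so a
function walking a region does not search again for every character.
The types tag_extent and tag_cursor and the function tag_at are
hypothetical and stand in for whatever the real extent interface
provides; this is not the XEmacs extent API.

```c
#include <stddef.h>

/* Hypothetical extent carrying an external coding system tag. */
struct tag_extent {
    size_t start;       /* first position covered (inclusive) */
    size_t end;         /* one past the last position covered */
    int    coding_tag;  /* tag for every character in [start, end) */
};

/* Hypothetical cursor that remembers the last extent that matched. */
struct tag_cursor {
    const struct tag_extent *extents;  /* assumed non-overlapping */
    size_t count;
    size_t last;                       /* index of the last matching extent */
};

/* Return the tag covering pos, or -1 if no extent covers it. */
static int tag_at(struct tag_cursor *c, size_t pos)
{
    /* Fast path: sequential access usually stays inside one extent. */
    if (c->last < c->count) {
        const struct tag_extent *e = &c->extents[c->last];
        if (pos >= e->start && pos < e->end)
            return e->coding_tag;
    }
    /* Slow path: linear search, then remember the hit. */
    for (size_t i = 0; i < c->count; i++) {
        const struct tag_extent *e = &c->extents[i];
        if (pos >= e->start && pos < e->end) {
            c->last = i;
            return e->coding_tag;
        }
    }
    return -1;
}
```

Even with the cache, the tag is stored apart from the character, so
every caller has to carry the cursor along and keep it in step with
the text.  That extra bookkeeping is the potential for confusion and
bugs mentioned above.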

The current implementation of Mule looks at bufchars one by one, as
far as I can remember.  Your UCS-4 scheme lends itself to direct
integration into the current higher-level interface more readily than
the UCS-2 + extents scheme I've advocated.

...
> I think there's a pretty good argument here 
    OG> I don't want to see implementation choices done solely on the
    OG> basis of things that may exist someday.

You're right.
