Re: proposed Eistring interface

Sunday, 23 April 2000

        thanks for the URLs -- any other interesting ones?

Bill Tutt wrote:

...
 > From: Ben Wing [mailto:ben＠666.com]
 >
 > I wrote this last night:
 >
 >
 > NOTE: One possible default internal representation that was compatible
 > with UTF16 but allowed all possible chars in UCS4 would be to take an
 > unused range of 2048 chars (not from the private area because Microsoft
 > actually uses up most or all of it with EUDC chars).  Let's say we picked
 > 4000 - 47FF.  Then, we'd have:
 >
 > 0000 - FFFF    Simple chars
 >
 > D[8-B]xx D[C-F]xx  Surrogate char, represents 1M chars
 >
 > 4[0-7]xx D[C-F]xx D[C-F]xx   Surrogate char, represents 2G chars
 >
 > This is exactly the same number of chars as UCS-4 handles, and it follows
 > the same property as UTF8 and Mule-internal:
 >
 > 1. There are two disjoint groupings of units, one representing leading
 >    units and one representing non-leading units.
 > 2. Given a leading unit, you immediately know how many units follow to
 >    make up a valid char, irrespective of any other context.
 >
 >
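
A minimal sketch of the scheme quoted above, in Python. The unit ranges
(4000 - 47FF as the three-unit lead, D800 - DBFF as the two-unit lead,
DC00 - DFFF as non-leading units) come from the note itself. The function
names and the bit layout of the three-unit form (11 + 10 + 10 bits packed
directly) are assumptions made for illustration; the note does not specify
them.

```python
# Hypothetical reserved range for the three-unit lead, as picked in the note.
LEAD3_FIRST, LEAD3_LAST = 0x4000, 0x47FF


def is_trailing(unit):
    """Non-leading units are exactly DC00-DFFF."""
    return 0xDC00 <= unit <= 0xDFFF


def sequence_length(lead):
    """Number of 16-bit units in the char that starts with `lead`.

    Depends only on the leading unit, never on surrounding context.
    """
    if is_trailing(lead):
        raise ValueError("not a leading unit: %04X" % lead)
    if LEAD3_FIRST <= lead <= LEAD3_LAST:
        return 3
    if 0xD800 <= lead <= 0xDBFF:
        return 2
    return 1


def encode(cp):
    """Encode a 31-bit value as a list of 16-bit units (assumed layout)."""
    if not 0 <= cp < 2 ** 31:
        raise ValueError("outside the 31-bit range")
    if 0x10000 <= cp <= 0x10FFFF:
        # Ordinary UTF-16 surrogate pair.
        v = cp - 0x10000
        return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]
    reserved = LEAD3_FIRST <= cp <= LEAD3_LAST or 0xD800 <= cp <= 0xDFFF
    if cp <= 0xFFFF and not reserved:
        return [cp]
    # Everything else, including values that collide with the reserved
    # unit ranges, goes through the three-unit form: 11 + 10 + 10 bits.
    return [
        LEAD3_FIRST | (cp >> 20),
        0xDC00 | ((cp >> 10) & 0x3FF),
        0xDC00 | (cp & 0x3FF),
    ]


def decode(units):
    """Decode exactly one char from a list of units."""
    n = sequence_length(units[0])
    if len(units) != n or not all(is_trailing(u) for u in units[1:]):
        raise ValueError("malformed sequence")
    if n == 1:
        return units[0]
    if n == 2:
        return 0x10000 + (((units[0] & 0x3FF) << 10) | (units[1] & 0x3FF))
    return ((units[0] & 0x7FF) << 20) | ((units[1] & 0x3FF) << 10) | (units[2] & 0x3FF)
```

The counts in the note follow from the ranges: 1024 * 1024 sequences for
the two-unit form and 2048 * 1024 * 1024 = 2^31 for the three-unit form.
One consequence of this sketch is that a char whose value falls inside the
reserved lead range can no longer be a single unit and has to take the
three-unit form.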

 There isn't a 2048-char empty block in the BMP at the moment.
 See http://anubis.dkuug.dk/jtc1/sc2/wg2/docs/n2213.pdf
 (dated 2000-03-28)

 The biggest open block I noticed is U+0000A500-U+0000ABFF.
 The next biggest open block looks like U+00010900-U+00010FFF.
 After that it's U+00011200 - U+00011FFF. Both of which are in Plane 1.
 Plane 1 Roadmap: http://anubis.dkuug.dk/jtc1/sc2/wg2/docs/n2214.pdf

 By open I mean that there isn't even a submitted proposal about what should
 actually be encoded there.
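
As a quick arithmetic check on the ranges listed above, the sketch below
computes each block's size from its endpoints and tests it against the
2048 units the three-unit lead would need. The helper names are made up
for illustration. The ranges work out to 1792, 1792 and 3584 code points
respectively; only the first lies in the BMP, and a code point above FFFF
cannot be a single 16-bit unit in any case.

```python
NEEDED = 2048  # size of the lead range the proposal asks for


def block_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


def usable_as_lead_range(first, last, needed=NEEDED):
    """True if the block is inside the BMP and at least `needed` wide."""
    return last <= 0xFFFF and block_size(first, last) >= needed


# Open blocks as listed in the message above.
open_blocks = [
    (0x0000A500, 0x0000ABFF),
    (0x00010900, 0x00010FFF),
    (0x00011200, 0x00011FFF),
]

for first, last in open_blocks:
    print(
        "U+%08X-U+%08X: %d code points, usable: %s"
        % (first, last, block_size(first, last), usable_as_lead_range(first, last))
    )
```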

 Bill 
--
Ben

In order to save my hands, I am cutting back on my mail.  I also write
as succinctly as possible -- please don't be offended.  If you send me
mail, you _will_ get a response, but please be patient, especially for
XEmacs-related mail.  If you need an immediate response and it is not
apparent in your message, please say so.  Thanks for your understanding.

See also http://www.666.com/ben/typing.html.
