Japanese EUDC Ranges
The following table shows the mapping between the Japanese Microsoft Standard
Character Set (Shift JIS) and Unicode.

  Shift JIS    Unicode
  ---------    -------
  F040         U+E000
  F041         U+E001
  F042         U+E002
  :            :
  F9FA         U+E755
  F9FB         U+E756
  F9FC         U+E757

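
The endpoints in the table are consistent with a simple linear enumeration:
lead bytes F0 through F9, each with 188 valid trail bytes (40-7E and 80-FC),
give 10 x 188 = 1880 code points, which is exactly U+E000 through U+E757.
A minimal Python sketch under that assumption follows. The function name is
hypothetical, and the trail-byte layout is inferred from the table endpoints,
not taken from a published Microsoft algorithm.

```python
SJIS_TRAILS_PER_LEAD = 188  # 0x40-0x7E (63 bytes) + 0x80-0xFC (125 bytes)


def sjis_eudc_to_unicode(code):
    """Map a Shift JIS EUDC code (0xF040-0xF9FC) to a private-use code point.

    Assumes a linear enumeration of the valid lead/trail byte pairs.
    """
    lead, trail = code >> 8, code & 0xFF
    if not 0xF0 <= lead <= 0xF9:
        raise ValueError("lead byte outside the EUDC range F0-F9")
    if 0x40 <= trail <= 0x7E:
        index = trail - 0x40
    elif 0x80 <= trail <= 0xFC:
        index = trail - 0x80 + 63  # skip the unused 0x7F slot
    else:
        raise ValueError("invalid Shift JIS trail byte")
    return 0xE000 + (lead - 0xF0) * SJIS_TRAILS_PER_LEAD + index


if __name__ == "__main__":
    for sjis in (0xF040, 0xF042, 0xF9FA, 0xF9FC):
        print("%04X -> U+%04X" % (sjis, sjis_eudc_to_unicode(sjis)))
```

The rows listed in the table (F040, F041, F042, F9FA, F9FB, F9FC) all come
out at the listed code points under this enumeration.
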
Chinese EUDC Ranges
The following tables show the mapping between the Chinese character sets
(BIG-5 and GB 2312-80) and Unicode.

  BIG-5          Unicode
  -----------    ---------------
  FA40 - FEFE    U+E000 - U+E310
  8E40 - A0FE    U+E311 - U+EEB7
  8140 - 8DFE    U+EEB8 - U+F6B0
  C6A1 - C8FE    U+F6B1 - U+F8FF

  GB 2312-80     Unicode
  -----------    ---------------
  F8A1 - FEFE    U+E000 - U+E29F
  AAA1 - AFFE    U+E2A0 - U+E4DF

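
For the first three BIG-5 ranges, the listed endpoints fit a linear
enumeration with 157 trail bytes per lead byte (40-7E and A1-FE). The sketch
below covers only those three ranges. The function name is hypothetical and
the layout is an inference from the table endpoints. The C6A1 - C8FE range
and the GB 2312-80 ranges are left out, because their listed Unicode spans
are larger than a plain count of valid byte pairs would give, so the table
alone does not pin down how they are laid out.

```python
BIG5_TRAILS_PER_LEAD = 157  # 0x40-0x7E (63 bytes) + 0xA1-0xFE (94 bytes)

# (first lead byte, last lead byte, first Unicode code point), from the table.
BIG5_EUDC_RANGES = [
    (0xFA, 0xFE, 0xE000),
    (0x8E, 0xA0, 0xE311),
    (0x81, 0x8D, 0xEEB8),
]


def big5_eudc_to_unicode(code):
    """Map a BIG-5 EUDC code in one of the three ranges above to Unicode.

    Assumes a linear enumeration of the valid lead/trail byte pairs.
    """
    lead, trail = code >> 8, code & 0xFF
    if 0x40 <= trail <= 0x7E:
        index = trail - 0x40
    elif 0xA1 <= trail <= 0xFE:
        index = trail - 0xA1 + 63
    else:
        raise ValueError("invalid BIG-5 trail byte")
    for first, last, base in BIG5_EUDC_RANGES:
        if first <= lead <= last:
            return base + (lead - first) * BIG5_TRAILS_PER_LEAD + index
    raise ValueError("lead byte outside the ranges handled here")


if __name__ == "__main__":
    for big5 in (0xFA40, 0xFEFE, 0x8E40, 0xA0FE, 0x8140, 0x8DFE):
        print("%04X -> U+%04X" % (big5, big5_eudc_to_unicode(big5)))
```
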
Korean EUDC Ranges
The following table shows the mapping between Korean EUDC characters (Unified
Hangeul Code) and Unicode.

  Unified Hangeul    Unicode
  ---------------    ---------------
  C9A1 - C9FE        U+E000 - U+E05D
  FEA1 - FEFE        U+E05E - U+E0BB

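
Each Korean range is a single row of 94 trail bytes (A1-FE), and each listed
Unicode span is exactly 94 code points long, so the mapping can be sketched
as a per-row offset. The function name is hypothetical, and the one-to-one
ordering within a row is an assumption that matches the table endpoints.

```python
# Lead byte of each EUDC row -> first Unicode code point, from the table.
UHC_EUDC_ROWS = {0xC9: 0xE000, 0xFE: 0xE05E}


def uhc_eudc_to_unicode(code):
    """Map a Unified Hangeul Code EUDC code to a private-use code point."""
    lead, trail = code >> 8, code & 0xFF
    if lead not in UHC_EUDC_ROWS:
        raise ValueError("lead byte is not an EUDC row (C9 or FE)")
    if not 0xA1 <= trail <= 0xFE:
        raise ValueError("trail byte outside A1-FE")
    return UHC_EUDC_ROWS[lead] + (trail - 0xA1)


if __name__ == "__main__":
    for uhc in (0xC9A1, 0xC9FE, 0xFEA1, 0xFEFE):
        print("%04X -> U+%04X" % (uhc, uhc_eudc_to_unicode(uhc)))
```
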
Bill Tutt wrote:
> From: Ben Wing [mailto:ben@666.com]
> NOTE: One possible default internal representation that was compatible
> with UTF16 but allowed all possible chars in UCS4 would be to take an
> unused range of 2048 chars (not from the private area because Microsoft
> actually uses up most or all of it with EUDC chars).
Aren't the extra code points you want access to user-defined characters
anyway?
I could try some prodding to see if the folks here at MS would be willing to
release the process they use to map DBCS EUDC into Unicode private area
space based on the codeset if you want.
Bill
--
Ben
In order to save my hands, I am cutting back on my mail. I also write
as succinctly as possible -- please don't be offended. If you send me
mail, you _will_ get a response, but please be patient, especially for
XEmacs-related mail. If you need an immediate response and it is not
apparent in your message, please say so. Thanks for your understanding.
See also
http://www.666.com/ben/typing.html.