[QUERY] Support Unicode escapes in the Lisp reader, à la Java

Monday, 17 April 2006


...
>>>> "Aidan" == Aidan Kehoe <kehoea(a)parhasard.net> writes:

2006-04-16  Aidan Kehoe  <kehoea(a)parhasard.net>

	* lispref/objects.texi (Character Type):
	Describe support for ?\u and ?\U as character escapes allowing you
	to specify the Unicode code point of a character. 

What is the purpose of having two syntaxes?  I would prefer a single
syntax, ?\U<HEXDIGIT>+, with an error being signaled if the
"character" can't be represented in that XEmacs.

Also, I think permitting elision of leading zeros is false
convenience.  You rarely see TWO-digit octal constants, even in
syntaxes where they are permitted.  I think the same will happen here;
i.e., you'll typically see ?\U000A for linefeed, if people are going to
use that syntax at all.  We should use the same syntax in strings as
we do in characters, and in that case requiring four hexdigits will be
easier to read and to write code for.  (I guess in that case it makes
sense to have two syntaxes, since we do need to provide for Planes
1-16, but writing 8 digits 99% of the time would be unbearable.)
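
To illustrate why fixed widths are easier to write code for, here is a
minimal sketch in Python (not the XEmacs reader, which is written in C).
It assumes \u takes exactly four hex digits and \U exactly eight, as
suggested above; the name decode_escapes and the choice to leave
malformed escapes untouched are hypothetical, and a real reader would
presumably signal an error instead.

```python
import re

# Assumed escape forms, modelled on the proposal in this message:
#   \uXXXX      exactly four hex digits
#   \UXXXXXXXX  exactly eight hex digits
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

# Highest code point defined by Unicode (end of Plane 16).
MAX_CODE_POINT = 0x10FFFF


def decode_escapes(text):
    """Replace fixed-width Unicode escapes in text with their characters."""

    def replace(match):
        digits = match.group(1) or match.group(2)
        code_point = int(digits, 16)
        if code_point > MAX_CODE_POINT:
            # Analogous to signaling an error for an unrepresentable character.
            raise ValueError("code point out of range: U+%X" % code_point)
        return chr(code_point)

    # Escapes with too few digits do not match and are left as-is here.
    return _ESCAPE.sub(replace, text)
```

With fixed widths, a string such as "\u000AB" decodes unambiguously to a
linefeed followed by "B"; with variable-length escapes the reader could
not tell where the digits end.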

I question the need for this at the present time, as code using this
escape would necessarily be incompatible with 21.4.  I would prefer
introducing these syntaxes for character constants when we convert the
Lisp library source encoding to Unicode.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
