I wrote the below because I am encountering the non-ASCII decimal digits in my
other work, and the CL function provides a useful way to make them available
to my code. (The non-ASCII decimal digits are actually in a separate patch.) I
didn’t go straight ahead and commit it because, well, not everyone likes the
CL spec as much as I do, maybe I’m going overboard.
That said, I notice GNU now has a cl-parse-integer, written to the CL
definition. And our code is full of (string-to-number (match-string 1 string)
16), consing where that shouldn’t be necessary.
# HG changeset patch
# User Aidan Kehoe <kehoea(a)parhasard.net>
# Date 1424864832 0
# Wed Feb 25 11:47:12 2015 +0000
# Node ID 750fab17b299ae7f163b2c3b3f6a954d9f0f1320
# Parent 83e5c3cd6be65c951cba14988d62752977104a30
Make #'parse-integer Lisp-visible, extend it, allowing non-ASCII digits.
src/ChangeLog addition:
2015-02-25 Aidan Kehoe <kehoea(a)parhasard.net>
* lread.c (read_atom): Use the new calling convention for
parse_integer().
* lisp.h: Change the declaration of parse_integer ().
* number.h (bignum_set_emacs_int, make_bignum_emacs_uint):
New #defines, used in data.c.
* lread.c (read_integer): Ditto.
* lread.c (read1): Ditto.
* data.c (find_highest_value): New.
* data.c (fill_ichar_array): New.
* data.c (build_fixnum_to_char_map): New.
* data.c (Fset_digit_fixnum_map): New.
* data.c (Fdigit_char_p): Moved from cl-extra.el.
* data.c (Fdigit_char): Moved from cl-extra.el.
* data.c (parse_integer): Moved from lread.c.
* data.c (Fparse_integer): Made available to Lisp.
* data.c (syms_of_data): Make the new subrs available.
* data.c (vars_of_data): Make the new vars available.
Expose parse_integer to Lisp, make it follow the Common Lisp API
(with some extensions, to allow us to support non ASCII digit
characters).
lisp/ChangeLog addition:
2015-02-25 Aidan Kehoe <kehoea(a)parhasard.net>
* cl-extra.el (digit-char-p): Moved to data.c.
* cl-extra.el (digit-char): Moved to data.c.
tests/ChangeLog addition:
2015-02-25 Aidan Kehoe <kehoea(a)parhasard.net>
* automated/lisp-tests.el:
parse_integer(), used in #'read, now signals invalid-argument
rather than invalid-read-syntax, check for that.
* automated/lisp-tests.el:
Check #'parse-integer now it's available to Lisp, check
#'digit-char, #'digit-char-p and the congruence in behaviour,
check the XEmacs-specific RADIX-TABLE argument behaviour.
diff -r 83e5c3cd6be6 -r 750fab17b299 lisp/ChangeLog
--- a/lisp/ChangeLog Sat Jan 10 19:43:28 2015 +0900
+++ b/lisp/ChangeLog Wed Feb 25 11:47:12 2015 +0000
@@ -1,3 +1,8 @@
+2015-02-25 Aidan Kehoe <kehoea(a)parhasard.net>
+
+ * cl-extra.el (digit-char-p): Moved to data.c.
+ * cl-extra.el (digit-char): Moved to data.c.
+
2014-12-31 Michael Sperber <mike(a)xemacs.org>
* simple.el (line-move): Add `noerror' optional argument, as in
diff -r 83e5c3cd6be6 -r 750fab17b299 lisp/cl-extra.el
--- a/lisp/cl-extra.el Sat Jan 10 19:43:28 2015 +0900
+++ b/lisp/cl-extra.el Wed Feb 25 11:47:12 2015 +0000
@@ -746,32 +746,6 @@
(char>= . "Return t if the character arguments are monotonically \
nonincreasing.")))
-(defun* digit-char-p (character &optional (radix 10))
- "Return non-nil if CHARACTER represents a digit in base RADIX.
-
-RADIX defaults to ten. The actual non-nil value returned is the integer
-value of the character in base RADIX."
- (check-type character character)
- (check-type radix integer)
- (if (<= radix 10)
- (and (<= ?0 character (+ ?0 radix -1)) (- character ?0))
- (or (and (<= ?0 character ?9) (- character ?0))
- (and (<= ?a character (+ ?a (setq radix (- radix 11))))
- (+ character (- 10 ?a)))
- (and (<= ?A character (+ ?A radix))
- (+ character (- 10 ?A))))))
-
-(defun* digit-char (weight &optional (radix 10))
- "Return a character representing the integer WEIGHT in base RADIX.
-
-RADIX defaults to ten. If no such character exists, return nil."
- (check-type weight integer)
- (check-type radix integer)
- (and (natnump weight) (< weight radix)
- (if (< weight 10)
- (int-char (+ ?0 weight))
- (int-char (+ ?A (- weight 10))))))
-
(defun alpha-char-p (character)
"Return t if CHARACTER is alphabetic, in some alphabet.
diff -r 83e5c3cd6be6 -r 750fab17b299 src/ChangeLog
--- a/src/ChangeLog Sat Jan 10 19:43:28 2015 +0900
+++ b/src/ChangeLog Wed Feb 25 11:47:12 2015 +0000
@@ -1,3 +1,26 @@
+2015-02-25 Aidan Kehoe <kehoea(a)parhasard.net>
+
+ * lread.c (read_atom): Use the new calling convention for
+ parse_integer().
+ * lisp.h: Change the declaration of parse_integer ().
+ * number.h (bignum_set_emacs_int, make_bignum_emacs_uint):
+ New #defines, used in data.c.
+ * lread.c (read_integer): Ditto.
+ * lread.c (read1): Ditto.
+ * data.c (find_highest_value): New.
+ * data.c (fill_ichar_array): New.
+ * data.c (build_fixnum_to_char_map): New.
+ * data.c (Fset_digit_fixnum_map): New.
+ * data.c (Fdigit_char_p): Moved from cl-extra.el.
+ * data.c (Fdigit_char): Moved from cl-extra.el.
+ * data.c (parse_integer): Moved from lread.c.
+ * data.c (Fparse_integer): Made available to Lisp.
+ * data.c (syms_of_data): Make the new subrs available.
+ * data.c (vars_of_data): Make the new vars available.
+ Expose parse_integer to Lisp, make it follow the Common Lisp API
+ (with some extensions, to allow us to support non ASCII digit
+ characters).
+
2015-01-08 Stephen J. Turnbull <stephen(a)xemacs.org>
Fix progress bar crashes.
diff -r 83e5c3cd6be6 -r 750fab17b299 src/data.c
--- a/src/data.c Sat Jan 10 19:43:28 2015 +0900
+++ b/src/data.c Wed Feb 25 11:47:12 2015 +0000
@@ -31,6 +31,7 @@
#include "gc.h"
#include "syssignal.h"
#include "sysfloat.h"
+#include "syntax.h"
Lisp_Object Qnil, Qt, Qlambda, Qunbound;
Lisp_Object Qerror_conditions, Qerror_message;
@@ -65,6 +66,9 @@
Lisp_Object Qerror_lacks_explanatory_string;
Lisp_Object Qfloatp;
+Lisp_Object Q_junk_allowed, Q_radix, Q_radix_table;
+
+Lisp_Object Vdigit_fixnum_map, Vfixnum_to_char_map;
Fixnum Vmost_negative_fixnum, Vmost_positive_fixnum;
@@ -1432,8 +1436,583 @@
}
#endif /* HAVE_BIGNUM */
}
+
+static int
+find_highest_value (struct chartab_range * range, Lisp_Object UNUSED (table),
+ Lisp_Object val, void *extra_arg)
+{
+ Lisp_Object *highest_pointer = (Lisp_Object *) extra_arg;
+ Lisp_Object max_seen = *highest_pointer;
+
+ CHECK_FIXNUM (val);
+ if (range->type != CHARTAB_RANGE_CHAR)
+ {
+ invalid_argument ("Not an appropriate char table range", Qunbound);
+ }
+
+ if (XFIXNUM (max_seen) < XFIXNUM (val))
+ {
+ *highest_pointer = val;
+ }
+
+ return 0;
+}
+
+static int
+fill_ichar_array (struct chartab_range *range, Lisp_Object UNUSED (table),
+ Lisp_Object val, void *extra_arg)
+{
+ Ichar *cctable = (Ichar *) extra_arg;
+ EMACS_INT valint = XFIXNUM (val);
+
+ /* Save the value if it hasn't been seen yet. */
+ if (-1 == cctable[valint])
+ {
+ cctable[valint] = range->ch;
+ }
+ else
+ {
+ /* Otherwise, save it if the existing value is not uppercase, and this
+ one is. Use the standard case table rather than any buffer-specific
+ one because a) this can be called early before current_buffer is
+ available and b) it's better to have these independent of particular
+ buffer case tables. */
+ if (current_buffer != NULL && UPCASE (0, range->ch) == range->ch
+ && UPCASE (0, cctable[valint]) != cctable[valint])
+ {
+ cctable[valint] = range->ch;
+ }
+ /* Maybe our own case infrastructure is not available yet. Use the C
+ library's. */
+ else if (isupper (range->ch) && !isupper (cctable[valint]))
+ {
+ cctable[valint] = range->ch;
+ }
+ /* Otherwise, save it if this character has a numerically lower value
+ (preferring ASCII over fullwidth Chinese and so on). */
+ else if (range->ch < cctable[valint])
+ {
+ cctable[valint] = range->ch;
+ }
+ }
+
+ return 0;
+}
+
+static Lisp_Object
+build_fixnum_to_char_map (Lisp_Object radix_table)
+{
+ Lisp_Object highest_value, result;
+ struct chartab_range ctr = { CHARTAB_RANGE_ALL, 0, Qnil, 0 };
+ Ichar *cctable;
+ EMACS_INT ii, cclen;
+ Ibyte *data;
+
+ /* What's the greatest fixnum value seen? In passing, check all the char
+ table values are fixnums. */
+ CHECK_FIXNUM (XCHAR_TABLE (radix_table)->default_);
+ highest_value = XFIXNUM (XCHAR_TABLE (radix_table)->default_);
+ map_char_table (radix_table, &ctr, find_highest_value, &highest_value);
+ cclen = XFIXNUM (highest_value) + 1;
+
+ cctable = malloc (sizeof (Ichar) * cclen);
+ if (cctable == NULL)
+ {
+ out_of_memory ("Could not allocate data for `digit-char'",
Qunbound);
+ }
+
+ for (ii = 0; ii < cclen; ++ii)
+ {
+ cctable[ii] = (Ichar) -1;
+ }
+
+ map_char_table (radix_table, &ctr, fill_ichar_array, cctable);
+
+ for (ii = 0; ii < cclen; ++ii)
+ {
+ if (cctable[ii] < 0)
+ {
+ free (cctable);
+ invalid_argument ("No digit specified for weight", make_fixnum (ii));
+ }
+ }
+
+ result = Fmake_string (make_fixnum (cclen * MAX_ICHAR_LEN), make_char (0));
+
+ data = XSTRING_DATA (result);
+ for (ii = 0; ii < cclen; ii++)
+ {
+ set_itext_ichar (data + (MAX_ICHAR_LEN * ii), cctable[ii]);
+ }
+
+ init_string_ascii_begin (result);
+ bump_string_modiff (result);
+ sledgehammer_check_ascii_begin (result);
+
+ free (cctable);
+
+ return result;
+}
+
+DEFUN ("set-digit-fixnum-map", Fset_digit_fixnum_map, 1, 1, 0, /*
+Set the value of `digit-fixnum-map', which see.
+
+Also check that RADIX-TABLE is well-formed from the perspective of
+`parse-integer' and `digit-char-p', and create an internal inverse mapping
+for `digit-char', so that all three functions behave consistently.
+
+RADIX-TABLE itself is not saved, a read-only copy of it is made and returned.
+*/
+ (radix_table))
+{
+ Lisp_Object ftctable = Qnil;
+
+ CHECK_CHAR_TABLE (radix_table);
+
+ /* Create a table for `digit-char', checking the consistency of
+ radix_table while doing so. */
+ ftctable = build_fixnum_to_char_map (radix_table);
+
+ Vdigit_fixnum_map = Fcopy_char_table (radix_table);
+ LISP_READONLY (Vdigit_fixnum_map) = 1;
+ Vfixnum_to_char_map = ftctable;
+
+ return Vdigit_fixnum_map;
+}
+
+DEFUN ("digit-char-p", Fdigit_char_p, 1, 3, 0, /*
+Return non-nil if CHARACTER represents a digit in base RADIX.
+
+RADIX defaults to ten. The actual non-nil value returned is the integer
+value of the character in base RADIX.
+
+RADIX-TABLE, if non-nil, is a character table describing characters' numeric
+values. See `parse-integer' and `digit-fixnum-map'.
+*/
+ (character, radix, radix_table))
+{
+ Lisp_Object got = Qnil;
+ EMACS_INT radixing, val;
+ Ichar cc;
+
+ CHECK_CHAR (character);
+ cc = XCHAR (character);
+
+ if (!NILP (radix))
+ {
+ check_integer_range (radix, Qzero,
+ NILP (radix_table) ?
+ /* If we are using the default radix table, the
+ maximum possible value for the radix is
+ available to us now. */
+ make_fixnum
+ (XSTRING_LENGTH (Vfixnum_to_char_map)
+ / MAX_ICHAR_LEN)
+ /* Otherwise, calculating that is expensive. Check
+ at least that the radix is not a bignum, the
+ maximum count of characters available will not
+ exceed the size of a fixnum. */
+: make_fixnum (MOST_POSITIVE_FIXNUM));
+ radixing = XFIXNUM (radix);
+ }
+ else
+ {
+ radixing = 10;
+ }
+
+ if (NILP (radix_table))
+ {
+ radix_table = Vdigit_fixnum_map;
+ }
+
+ got = get_char_table (cc, radix_table);
+ CHECK_FIXNUM (got);
+ val = XFIXNUM (got);
+
+ if (val < 0 || val >= radixing)
+ {
+ return Qnil;
+ }
+
+ return make_fixnum (val);
+}
+
+DEFUN ("digit-char", Fdigit_char, 1, 3, 0, /*
+Return a character representing the integer WEIGHT in base RADIX.
+
+RADIX defaults to ten. If no such character exists, return nil. `digit-char'
+prefers an upper case character if available. RADIX must be a non-negative
+integer of value less than the maximum value in RADIX-TABLE.
+
+RADIX-TABLE, if non-nil, is a character table describing characters' numeric
+values. It defaults to the value of `digit-fixnum-map'; see the documentation
+for that variable and for `parse-integer'. This is not specified by Common
+Lisp, and using a value other than the default in `digit-char' is expensive,
+since the inverse map needs to be calculated.
+*/
+ (weight, radix, radix_table))
+{
+ EMACS_INT radixing = 10, weighting;
+ Lisp_Object fixnum_to_char_table = Qnil;
+ Ichar cc;
+
+ CHECK_NATNUM (weight);
+
+ if (!NILP (radix_table) && !EQ (radix_table, Vdigit_fixnum_map))
+ {
+ CHECK_CHAR_TABLE (radix_table);
+ /* The result of this isn't GCPROd, but the rest of this function
+ won't GC and continue. */
+ fixnum_to_char_table = build_fixnum_to_char_map (radix_table);
+ }
+ else
+ {
+ fixnum_to_char_table = Vfixnum_to_char_map;
+ }
+
+ if (!NILP (radix))
+ {
+ check_integer_range (radix, Qzero,
+ make_fixnum (XSTRING_LENGTH (fixnum_to_char_table)
+ / MAX_ICHAR_LEN));
+ radixing = XFIXNUM (radix);
+ }
+
+ /* If weight is in its canonical form (and there's no reason to think it
+ isn't), Vfixnum_to_char_map can't be long enough to handle
+ this. */
+ if (BIGNUMP (weight))
+ {
+ return Qnil;
+ }
+
+ weighting = XFIXNUM (weight);
+
+ if (weighting < radixing)
+ {
+ cc = itext_ichar (XSTRING_DATA (fixnum_to_char_table)
+ + MAX_ICHAR_LEN * weighting);
+ return make_char (cc);
+ }
+
+ return Qnil;
+}
+
+Lisp_Object
+parse_integer (const Ibyte *buf, Ibyte **buf_end_out, Bytecount len,
+ EMACS_INT base, Boolint junk_allowed, Lisp_Object radix_table)
+{
+ const Ibyte *lim = buf + len, *p = buf;
+ EMACS_UINT num = 0, onum = (EMACS_UINT) -1;
+ EMACS_UINT fixnum_limit = MOST_POSITIVE_FIXNUM;
+ EMACS_INT cint = 0;
+ Boolint negativland = 0;
+ Ichar c = -1;
+ Lisp_Object result = Qnil, got = Qnil;
+
+ if (NILP (radix_table))
+ {
+ radix_table = Vdigit_fixnum_map;
+ }
+
+ /* This function ignores the current buffer's syntax table.
+ Respecting it will probably introduce more bugs than it fixes. */
+ update_mirror_syntax_if_dirty (XCHAR_TABLE (Vstandard_syntax_table)->
+ mirror_table);
+
+ /* Ignore leading whitespace, if that leading whitespace has no
+ numeric value. */
+ while (p < lim)
+ {
+ c = itext_ichar (p);
+ if (!(((got = get_char_table (c, radix_table), FIXNUMP (got))
+ && ((cint = XFIXNUM (got), cint < 0) || cint >= base))
+ && (SYNTAX (XCHAR_TABLE (Vstandard_syntax_table)->mirror_table,
+ c) == Swhitespace)))
+ {
+ break;
+ }
+
+ INC_IBYTEPTR (p);
+ }
+
+ /* Drop sign information if appropriate. */
+ if (c == '-')
+ {
+ negativland = 1;
+ fixnum_limit = - MOST_NEGATIVE_FIXNUM;
+ INC_IBYTEPTR (p);
+ }
+ else if (c == '+')
+ {
+ got = get_char_table (c, radix_table);
+ cint = FIXNUMP (got) ? XFIXNUM (got) : -1;
+ /* If ?+ has no integer weight, drop it. */
+ if (cint < 0 || cint >= base)
+ {
+ INC_IBYTEPTR (p);
+ }
+ }
+
+ while (p < lim)
+ {
+ c = itext_ichar (p);
+
+ got = get_char_table (c, radix_table);
+ if (!FIXNUMP (got))
+ {
+ goto loser;
+ }
+
+ cint = XFIXNUM (got);
+
+ if (cint < 0 || cint >= base)
+ {
+ goto loser;
+ }
+
+ onum = num;
+ num *= base;
+ if (num > fixnum_limit)
+ {
+ goto overflow;
+ }
+
+ num += cint;
+ if (num > fixnum_limit)
+ {
+ goto overflow;
+ }
+
+ INC_IBYTEPTR (p);
+ }
+
+ if (onum == (EMACS_UINT) -1)
+ {
+ /* No digits seen, we may need to error. */
+ goto loser;
+ }
+
+ if (negativland)
+ {
+ result = make_fixnum (- (EMACS_INT) num);
+ }
+ else
+ {
+ result = make_fixnum (num);
+ }
+
+ *buf_end_out = (Ibyte *) p;
+ return result;
+
+ overflow:
+#ifndef HAVE_BIGNUM
+ return Fsignal (Qinvalid_argument,
+ list3 (build_msg_string ("Integer constant overflow"),
+ make_string (buf, len), make_fixnum (base)));
+
+#else /* HAVE_BIGNUM */
+ result = make_bignum_emacs_uint (onum);
+
+ bignum_set_emacs_int (scratch_bignum, base);
+ bignum_set_emacs_int (scratch_bignum2, cint);
+ bignum_mul (XBIGNUM_DATA (result), XBIGNUM_DATA (result), scratch_bignum);
+ bignum_add (XBIGNUM_DATA (result), XBIGNUM_DATA (result), scratch_bignum2);
+ INC_IBYTEPTR (p);
+
+ assert (!bignum_fits_emacs_int_p (XBIGNUM_DATA (result))
+ || (fixnum_limit
+ < (EMACS_UINT) bignum_to_emacs_int (XBIGNUM_DATA (result))));
+
+ while (p < lim)
+ {
+ c = itext_ichar (p);
+
+ got = get_char_table (c, radix_table);
+ if (!FIXNUMP (got))
+ {
+ goto loser;
+ }
+
+ cint = XFIXNUM (got);
+ if (cint < 0 || cint >= base)
+ {
+ goto loser;
+ }
+
+ bignum_set_emacs_int (scratch_bignum2, cint);
+ bignum_mul (XBIGNUM_DATA (result), XBIGNUM_DATA (result),
+ scratch_bignum);
+ bignum_add (XBIGNUM_DATA (result), XBIGNUM_DATA (result),
+ scratch_bignum2);
+
+ INC_IBYTEPTR (p);
+ }
+
+ if (negativland)
+ {
+ bignum_set_long (scratch_bignum, -1L);
+ bignum_mul (XBIGNUM_DATA (result), XBIGNUM_DATA (result),
+ scratch_bignum);
+ }
+
+ *buf_end_out = (Ibyte *) p;
+ return result;
+#endif /* HAVE_BIGNUM */
+ loser:
+
+ if (p < lim && !junk_allowed)
+ {
+ /* JUNK-ALLOWED is zero. If we have stopped parsing because we
+ encountered whitespace, then we need to check that the rest if the
+ string is whitespace and whitespace alone if we are not to error.
+
+ Perhaps surprisingly, if JUNK-ALLOWED is zero, the parse is regarded
+ as including the trailing whitespace, so the second value returned is
+ always the length of the string. */
+ while (p < lim)
+ {
+ c = itext_ichar (p);
+ if (!(SYNTAX (XCHAR_TABLE (Vstandard_syntax_table)->mirror_table, c)
+ == Swhitespace))
+ {
+ break;
+ }
+
+ INC_IBYTEPTR (p);
+ }
+ }
+
+ *buf_end_out = (Ibyte *) p;
+
+ if (junk_allowed || (p == lim && onum != (EMACS_UINT) -1))
+ {
+
+#ifdef HAVE_BIGNUM
+ if (!NILP (result))
+ {
+ /* Bignum terminated by whitespace or by non-digit. */
+ return Fcanonicalize_number (result);
+ }
+#endif
+
+ if (onum == (EMACS_UINT) -1)
+ {
+ /* No integer digits seen, but junk allowed, so no indication to
+ error. Return nil. */
+ return Qnil;
+ }
+
+ if (negativland)
+ {
+ assert ((- (EMACS_INT) num) >= MOST_NEGATIVE_FIXNUM);
+ result = make_fixnum (- (EMACS_INT) num);
+ }
+ else
+ {
+ assert ((EMACS_INT) num <= MOST_POSITIVE_FIXNUM);
+ result = make_fixnum (num);
+ }
+
+ return result;
+ }
+
+ return Fsignal (Qinvalid_argument,
+ list3 (build_msg_string ("Invalid integer syntax"),
+ make_string (buf, len), make_fixnum (base)));
+}
+
+DEFUN ("parse-integer", Fparse_integer, 1, MANY, 0, /*
+Parse and return the integer represented by STRING using RADIX.
+
+START and END are bounding index designators, as used in `remove*'. START
+defaults to 0 and END defaults to nil, meaning the end of STRING.
+
+If JUNK-ALLOWED is nil, error if STRING does not consist in its entirety of
+the representation of an integer, with or without surrounding whitespace
+characters.
+
+If RADIX-TABLE is non-nil, it is a char table mapping from characters to
+fixnums used with RADIX. Otherwise, `digit-fixnum-map' provides the
+correspondence to use.
+
+RADIX must always be a non-negative fixnum. RADIX-TABLE constrains its
+possible values further, and the maximum RADIX available is always the largest
+positive value available RADIX-TABLE.
+
+arguments: (STRING &key (START 0) end (RADIX 10) junk-allowed radix-table)
+*/
+ (int nargs, Lisp_Object *args))
+{
+ Lisp_Object string = args[0], result;
+ Charcount starting = 0, ending = MOST_POSITIVE_FIXNUM + 1, ii = 0;
+ Bytecount byte_len;
+ Ibyte *startp, *cursor, *end_read, *limit, *saved_start;
+ EMACS_INT radixing;
+
+ PARSE_KEYWORDS (Fparse_integer, nargs, args, 5,
+ (start, end, radix, junk_allowed, radix_table),
+ (start = Qzero, radix = make_fixnum (10)));
+
+ CHECK_STRING (string);
+ CHECK_NATNUM (start);
+ starting = BIGNUMP (start) ? 1 + MOST_POSITIVE_FIXNUM : XFIXNUM (start);
+ if (!NILP (end))
+ {
+ CHECK_NATNUM (end);
+ ending = BIGNUMP (end) ? 1 + MOST_POSITIVE_FIXNUM : XFIXNUM (end);
+ }
+
+ if (!NILP (radix_table))
+ {
+ CHECK_CHAR_TABLE (radix_table);
+ }
+ else
+ {
+ radix_table = Vdigit_fixnum_map;
+ }
+
+ check_integer_range (radix, Qzero,
+ EQ (radix_table, Vdigit_fixnum_map) ?
+ make_fixnum (XSTRING_LENGTH (Vfixnum_to_char_map)
+ / MAX_ICHAR_LEN)
+ /* Non-default radix table; calculating the upper limit
+ is is expensive. Check at least that the radix is
+ not a bignum, the maximum count of characters
+ available in our XEmacs will not exceed the size of
+ a fixnum. */
+: make_fixnum (MOST_POSITIVE_FIXNUM));
+ radixing = XFIXNUM (radix);
+
+ startp = cursor = saved_start = XSTRING_DATA (string);
+ byte_len = XSTRING_LENGTH (string);
+ limit = startp + byte_len;
+
+ while (cursor < limit && ii < ending)
+ {
+ INC_IBYTEPTR (cursor);
+ if (ii < starting)
+ {
+ startp = cursor;
+ }
+ ii++;
+ }
+
+ if (ii < starting || (ii < ending && !NILP (end)))
+ {
+ check_sequence_range (string, start, end, Flength (string));
+ }
+
+ result = parse_integer (startp, &end_read, cursor - startp, radixing,
+ !NILP (junk_allowed), radix_table);
+
+ /* This code hasn't been written to handle relocating string data. */
+ assert (saved_start == XSTRING_DATA (string));
+
+ return values2 (result, make_fixnum (string_index_byte_to_char
+ (string, end_read - saved_start)));
+}
-
DEFUN ("+", Fplus, 0, MANY, 0, /*
Return sum of any number of arguments.
The arguments should all be numbers, characters or markers.
@@ -3539,6 +4118,10 @@
DEFSYMBOL_MULTIWORD_PREDICATE (Qweak_listp);
DEFSYMBOL (Qfloatp);
+ DEFKEYWORD (Q_radix);
+ DEFKEYWORD (Q_junk_allowed);
+ DEFKEYWORD (Q_radix_table);
+
DEFSUBR (Fwrong_type_argument);
#ifdef HAVE_RATIO
@@ -3595,6 +4178,10 @@
DEFSUBR (Fnumber_to_string);
DEFSUBR (Fstring_to_number);
+ DEFSUBR (Fset_digit_fixnum_map);
+ DEFSUBR (Fdigit_char_p);
+ DEFSUBR (Fdigit_char);
+ DEFSUBR (Fparse_integer);
DEFSUBR (Feqlsign);
DEFSUBR (Flss);
DEFSUBR (Fgtr);
@@ -3659,6 +4246,53 @@
*/);
Vmost_positive_fixnum = MOST_POSITIVE_FIXNUM;
+ DEFVAR_CONST_LISP ("digit-fixnum-map", &Vdigit_fixnum_map /*
+Table used to determine a character's numeric value when parsing.
+
+This is a character table with fixnum values. A value of -1 indicates this
+character does not have an assigned numeric value. See `parse-integer',
+`digit-char-p', and `digit-char'.
+*/);
+ Vdigit_fixnum_map = Fmake_char_table (Qgeneric);
+ set_char_table_default (Vdigit_fixnum_map, make_fixnum (-1));
+ {
+ int ii = 0;
+
+ for (ii = 0; ii < 10; ++ii)
+ {
+ XCHAR_TABLE (Vdigit_fixnum_map)->ascii['0' + ii] = make_fixnum(ii);
+ }
+
+ for (ii = 10; ii < 36; ++ii)
+ {
+ XCHAR_TABLE (Vdigit_fixnum_map)->ascii['a' + (ii - 10)]
+ = make_fixnum(ii);
+ XCHAR_TABLE (Vdigit_fixnum_map)->ascii['A' + (ii - 10)]
+ = make_fixnum(ii);
+ }
+ }
+ {
+ Ascbyte *fixnum_tab = alloca_ascbytes (36 * MAX_ICHAR_LEN), *ptr;
+ int ii;
+ Ichar cc;
+ memset ((void *)fixnum_tab, 0, 36 * MAX_ICHAR_LEN);
+
+ /* The whole point of fixnum_to_character_table is access as an array,
+ avoid O(N) issues by giving every character MAX_ICHAR_LEN of
+ bytes. */
+ for (ii = 0, ptr = fixnum_tab; ii < 36; ++ii, ptr += MAX_ICHAR_LEN)
+ {
+ cc = ii < 10 ? '0' + ii : 'A' + (ii - 10);
+ set_itext_ichar ((Ibyte *) ptr, cc);
+ }
+
+ /* Sigh, we can't call build_fixnum_to_char_map() on Vdigit_fixnum_map,
+ this is too early in the boot sequence to map across a char table. Do
+ it by hand. */
+ Vfixnum_to_char_map = build_ascstring (fixnum_tab);
+ staticpro (&Vfixnum_to_char_map);
+ }
+
#ifdef DEBUG_XEMACS
DEFVAR_BOOL ("debug-issue-ebola-notices", &debug_issue_ebola_notices /*
If non-zero, note when your code may be suffering from char-int confoundance.
diff -r 83e5c3cd6be6 -r 750fab17b299 src/lisp.h
--- a/src/lisp.h Sat Jan 10 19:43:28 2015 +0900
+++ b/src/lisp.h Wed Feb 25 11:47:12 2015 +0000
@@ -4594,6 +4594,10 @@
Lisp_Object word_to_lisp (unsigned int);
unsigned int lisp_to_word (Lisp_Object);
+Lisp_Object parse_integer (const Ibyte *buf, Ibyte **buf_end_out,
+ Bytecount len, EMACS_INT base,
+ Boolint junk_allowed, Lisp_Object base_table);
+
extern Lisp_Object Qarrayp, Qbitp, Qchar_or_string_p, Qcharacterp,
Qerror_conditions, Qerror_message, Qinteger_char_or_marker_p,
Qinteger_or_char_p, Qinteger_or_marker_p, Qlambda, Qlistp, Qnatnump,
diff -r 83e5c3cd6be6 -r 750fab17b299 src/lread.c
--- a/src/lread.c Sat Jan 10 19:43:28 2015 +0900
+++ b/src/lread.c Wed Feb 25 11:47:12 2015 +0000
@@ -1922,8 +1922,6 @@
return Lstream_byte_count (XLSTREAM (Vread_buffer_stream)) - 1;
}
-static Lisp_Object parse_integer (const Ibyte *buf, Bytecount len, int base);
-
static Lisp_Object
read_atom (Lisp_Object readcharfun,
Ichar firstchar,
@@ -1958,23 +1956,15 @@
p1++;
if (p1 == p)
{
+ Ibyte *buf_end;
/* It is an integer. */
if (p1[-1] == '.')
- p1[-1] = '\0';
-#if 0
- {
- int number = 0;
- if (sizeof (int) == sizeof (EMACS_INT))
- number = atoi (read_buffer);
- else if (sizeof (long) == sizeof (EMACS_INT))
- number = atol (read_buffer);
- else
- ABORT ();
- return make_fixnum (number);
- }
-#else
- return parse_integer ((Ibyte *) read_ptr, len, 10);
-#endif
+ {
+ len -= 1;
+ }
+
+ return parse_integer ((Ibyte *) read_ptr, &buf_end, len, 10,
+ 0, Qnil);
}
}
#ifdef HAVE_RATIO
@@ -2011,97 +2001,16 @@
}
}
-
-static Lisp_Object
-parse_integer (const Ibyte *buf, Bytecount len, int base)
-{
- const Ibyte *lim = buf + len;
- const Ibyte *p = buf;
- EMACS_UINT num = 0;
- int negativland = 0;
-
- if (*p == '-')
- {
- negativland = 1;
- p++;
- }
- else if (*p == '+')
- {
- p++;
- /* GMP deals with a leading plus sign, badly, make sure it doesn't see
- it. */
- buf++;
- }
-
- if (p == lim)
- goto loser;
-
- for (; (p < lim) && (*p != '\0'); p++)
- {
- int c = *p;
- EMACS_UINT onum;
-
- if (isdigit (c))
- c = c - '0';
- else if (isupper (c))
- c = c - 'A' + 10;
- else if (islower (c))
- c = c - 'a' + 10;
- else
- goto loser;
-
- if (c < 0 || c >= base)
- goto loser;
-
- onum = num;
- num = num * base + c;
- if (num < onum)
- goto overflow;
- }
-
- {
- EMACS_INT int_result = negativland ? - (EMACS_INT) num : (EMACS_INT) num;
- Lisp_Object result = make_fixnum (int_result);
- if (num && ((XFIXNUM (result) < 0) != negativland))
- goto overflow;
- if (XFIXNUM (result) != int_result)
- goto overflow;
- return result;
- }
- overflow:
-#ifdef HAVE_BIGNUM
- {
- bignum_set_string (scratch_bignum, (const char *) buf, base);
- return make_bignum_bg (scratch_bignum);
- }
-#else
- return Fsignal (Qinvalid_read_syntax,
- list3 (build_msg_string
- ("Integer constant overflow in reader"),
- make_string (buf, len),
- make_fixnum (base)));
-#endif /* HAVE_BIGNUM */
- loser:
- return Fsignal (Qinvalid_read_syntax,
- list3 (build_msg_string
- ("Invalid integer constant in reader"),
- make_string (buf, len),
- make_fixnum (base)));
-}
-
-
static Lisp_Object
read_integer (Lisp_Object readcharfun, int base)
{
/* This function can GC */
int saw_a_backslash;
+ Ibyte *buf_end;
Bytecount len = read_atom_0 (readcharfun, -1, &saw_a_backslash);
return (parse_integer
(resizing_buffer_stream_ptr (XLSTREAM (Vread_buffer_stream)),
- ((saw_a_backslash)
- ? 0 /* make parse_integer signal error */
- : len),
- base));
+ &buf_end, len, base, 0, Qnil));
}
static Lisp_Object
@@ -2700,6 +2609,7 @@
/* Reader forms that can reuse previously read objects. */
{
Lisp_Object parsed, found;
+ Ibyte *buf_end;
Lstream_rewind (XLSTREAM (Vread_buffer_stream));
@@ -2718,10 +2628,10 @@
parsed
= parse_integer (resizing_buffer_stream_ptr
- (XLSTREAM (Vread_buffer_stream)),
+ (XLSTREAM (Vread_buffer_stream)), &buf_end,
Lstream_byte_count (XLSTREAM
(Vread_buffer_stream))
- - 1, 10);
+ - 1, 10, 0, Qnil);
found = assoc_no_quit (parsed, Vread_objects);
if (c == '=')
diff -r 83e5c3cd6be6 -r 750fab17b299 src/number.h
--- a/src/number.h Sat Jan 10 19:43:28 2015 +0900
+++ b/src/number.h Wed Feb 25 11:47:12 2015 +0000
@@ -101,12 +101,18 @@
#if SIZEOF_EMACS_INT == SIZEOF_LONG
# define bignum_fits_emacs_int_p(b) bignum_fits_long_p(b)
# define bignum_to_emacs_int(b) bignum_to_long(b)
+# define bignum_set_emacs_int bignum_set_long
+# define make_bignum_emacs_uint(b) make_bignum_un(b)
#elif SIZEOF_EMACS_INT == SIZEOF_INT
# define bignum_fits_emacs_int_p(b) bignum_fits_int_p(b)
# define bignum_to_emacs_int(b) bignum_to_int(b)
+# define bignum_set_emacs_int bignum_set_long
+# define make_bignum_emacs_uint(b) make_bignum_un(b)
#else
# define bignum_fits_emacs_int_p(b) bignum_fits_llong_p(b)
# define bignum_to_emacs_int(b) bignum_to_llong(b)
+# define bignum_set_emacs_int bignum_set_llong
+# define make_bignum_emacs_uint(b) make_bignum_ull(b)
#endif
extern Lisp_Object make_bignum (long);
diff -r 83e5c3cd6be6 -r 750fab17b299 tests/ChangeLog
--- a/tests/ChangeLog Sat Jan 10 19:43:28 2015 +0900
+++ b/tests/ChangeLog Wed Feb 25 11:47:12 2015 +0000
@@ -1,3 +1,13 @@
+2015-02-25 Aidan Kehoe <kehoea(a)parhasard.net>
+
+ * automated/lisp-tests.el:
+ parse_integer(), used in #'read, now signals invalid-argument
+ rather than invalid-read-syntax, check for that.
+ * automated/lisp-tests.el:
+ Check #'parse-integer now it's available to Lisp, check
+ #'digit-char, #'digit-char-p and the congruence in behaviour,
+ check the XEmacs-specific RADIX-TABLE argument behaviour.
+
2014-10-11 Stephen J. Turnbull <stephen(a)xemacs.org>
* automated/keymap-tests.el:
diff -r 83e5c3cd6be6 -r 750fab17b299 tests/automated/lisp-tests.el
--- a/tests/automated/lisp-tests.el Sat Jan 10 19:43:28 2015 +0900
+++ b/tests/automated/lisp-tests.el Wed Feb 25 11:47:12 2015 +0000
@@ -1473,8 +1473,8 @@
(progn
(Check-Error wrong-type-argument (format "%u" most-negative-fixnum))
(Check-Error wrong-type-argument (format "%u" -1)))
- (Check-Error invalid-read-syntax (read (format "%u" most-negative-fixnum)))
- (Check-Error invalid-read-syntax (read (format "%u" -1))))
+ (Check-Error invalid-argument (read (format "%u" most-negative-fixnum)))
+ (Check-Error invalid-argument (read (format "%u" -1))))
;; Check all-completions ignore element start with space.
(Assert (not (all-completions "" '((" hidden" .
"object")))))
@@ -3451,4 +3451,126 @@
(test-write-string write-string :sequences-too nil)
(test-write-string write-line :worry-about-newline t :sequences-too nil))
+;;-----------------------------------------------------
+;; Test #'parse-integer and friends.
+;;-----------------------------------------------------
+
+(Check-Error wrong-type-argument (parse-integer 123456789))
+(Check-Error wrong-type-argument (parse-integer "123456789" :start -1))
+(if (featurep 'bignum)
+ (progn
+ (Check-Error args-out-of-range
+ (parse-integer "123456789" :start (1+
most-positive-fixnum)))
+ (Check-Error args-out-of-range
+ (parse-integer "123456789" :end (1+
most-positive-fixnum))))
+ (Check-Error wrong-type-argument
+ (parse-integer "123456789" :start (1+ most-positive-fixnum)))
+ (Check-Error wrong-type-argument
+ (parse-integer "123456789" :end (1+ most-positive-fixnum))))
+
+(Check-Error args-out-of-range (parse-integer "123456789" :radix -1))
+(Check-Error args-out-of-range
+ (parse-integer "123456789" :radix (1+ most-positive-fixnum)))
+(Check-Error wrong-number-of-arguments
+ (parse-integer "123456789" :junk-allowed))
+(Check-Error invalid-keyword-argument
+ (parse-integer "123456789" :no-such-keyword t))
+
+;; Next two paragraphs of tests from GNU, thank you Leo Liu.
+(Assert (eql (digit-char-p ?3) 3))
+(Assert (eql (digit-char-p ?a 11) 10))
+(Assert (eql (digit-char-p ?w 36) 32))
+(Assert (not (digit-char-p ?a)))
+(Check-Error args-out-of-range (digit-char-p ?a 37))
+(Assert (not (digit-char-p ?a 1)))
+
+(Assert (equal (multiple-value-list (parse-integer " -123 ")) '(-123 7)))
+(Assert (equal (multiple-value-list
+ (parse-integer "-efz" :radix 16 :junk-allowed t))
+ '(-239 3)))
+(Assert (equal (multiple-value-list (parse-integer "zzef" :radix 16 :start 2))
+ '(239 4)))
+(Assert (equal (multiple-value-list
+ (parse-integer "0123456789" :radix 8 :junk-allowed t))
+ '(342391 8)))
+(Assert (equal (multiple-value-list (parse-integer "" :junk-allowed t))
+ '(nil 0)))
+(Assert (equal (multiple-value-list (parse-integer "abc" :junk-allowed t))
+ '(nil 0)))
+(Check-Error invalid-argument (parse-integer "0123456789" :radix 8))
+(Check-Error invalid-argument (parse-integer "abc"))
+(Check-Error invalid-argument (parse-integer "efz" :radix 16))
+
+;; We don't allow a trailing decimal point, as the Lisp reader does.
+(Check-Error invalid-argument (parse-integer "12348."))
+
+;; In contravention of Common Lisp, we allow both 0 and 1 as values for RADIX,
+;; useless as that is.
+(Assert (equal (multiple-value-list (parse-integer "00000" :radix 1)) '(0
5))
+ "checking 1 is allowed as a value for RADIX")
+(Assert (equal (multiple-value-list
+ (parse-integer "" :radix 0 :junk-allowed t))
+ '(nil 0))
+ "checking 0 is allowed as a value for RADIX")
+
+(let ((binary-table
+ (copy-char-table #s(char-table :type generic :default -1 :data ()))))
+ (loop for fixnum from 00 to #xff
+ do (put-char-table (int-char fixnum) fixnum binary-table))
+ (Assert (eql most-positive-fixnum
+ (parse-integer
+ (concatenate 'string "\x3f"
+ (make-string
+ (/ (- (integer-length most-positive-fixnum)
+ (integer-length #x3f)) 8)
+ ?\xff))
+:radix-table binary-table :radix #x100))
+ "checking parsing text using base 256 (big endian binary) works")
+ (Assert (equal
+ (multiple-value-list
+ (parse-integer " \1\7\1\7 " :radix-table binary-table))
+ '(1717 6))
+ "checking whitespace treated as such when it is not < radix")
+ (Assert (equal
+ (multiple-value-list
+ (parse-integer " \1\7\1\7 " :radix-table binary-table
+:junk-allowed t))
+ '(1717 5))
+ "checking whitespace treated as junk when it is not < radix")
+ (Check-Error invalid-argument
+ (parse-integer "1234" :radix-table binary-table))
+ (Assert (equal
+ (multiple-value-list
+ (parse-integer "--" :radix-table binary-table :radix #x100))
+ '(-45 2))
+ "checking ?- always treated as minus sign initially")
+ (Assert (equal
+ (multiple-value-list
+ (parse-integer "+20" :radix-table binary-table :radix #x100))
+ '(2830896 3))
+ "checking ?+ not dropped initially if it has integer weight")
+ (Assert (eql #xff (digit-char-p ?� #x100 binary-table))
+ "checking `digit-char-p' behaves correctly with base 256")
+ (Assert (eql ?\xff (digit-char #xff #x100 binary-table))
+ "checking `digit-char' behaves correctly with base 256")
+ (Assert (eql (parse-integer " " :radix-table binary-table :radix #x100)
+ #x20)
+ "checking whitespace not treated as such when it has fixnum weight")
+ (Assert (null (digit-char-p ?0 nil binary-table))
+ "checking `digit-char-p' reflects RADIX-TABLE, ?0")
+ (Assert (null (digit-char-p ?9 nil binary-table))
+ "checking `digit-char-p' reflects RADIX-TABLE, ?9")
+ (Assert (null (digit-char-p ?a 16 binary-table))
+ "checking `digit-char-p' reflects RADIX-TABLE, ?a")
+ (Assert (eql ?� (digit-char #xff #x100 binary-table))
+ "checking `digit-char' reflects RADIX-TABLE, #xff")
+ (Assert (eql ?a (digit-char #x61 #x100 binary-table))
+ "checking `digit-char' reflects RADIX-TABLE, #x61")
+ (Assert (null (digit-char #xff nil binary-table))
+ "checking `digit-char' reflects RADIX-TABLE, #xff, base 10")
+ (Assert (eql ?\x0a (digit-char 10 16 binary-table))
+ "checking `digit-char' reflects RADIX-TABLE, 10, base 16")
+ (Assert (eql ?\x09 (digit-char 9 nil binary-table))
+ "checking `digit-char' reflects RADIX-TABLE, 9, base 10"))
+
;;; end of lisp-tests.el
--
‘Tramadol is further fed to cattle […] when working them […] (as draft
animals) so that the animals do not get tired quickly. …’
— Angewandte Chemie, Sept 2014, describing the social context of
(synthetic) tramadol having been found in Cameroon tree roots.
_______________________________________________
XEmacs-Patches mailing list
XEmacs-Patches(a)xemacs.org
http://lists.xemacs.org/mailman/listinfo/xemacs-patches