changeset: 4644:e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f
user: Aidan Kehoe <kehoea(a)parhasard.net>
date: Sat Feb 07 17:13:37 2009 +0000
files: lisp/ChangeLog lisp/coding.el lisp/mule/arabic.el lisp/mule/cyrillic.el
lisp/mule/greek.el lisp/mule/hebrew.el lisp/mule/latin.el lisp/mule/mule-cmds.el
lisp/mule/mule-coding.el lisp/mule/vietnamese.el lisp/unicode.el tests/ChangeLog
tests/automated/query-coding-tests.el
description:
Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
lisp/ChangeLog addition:
2009-02-07 Aidan Kehoe <kehoea(a)parhasard.net>
* coding.el (query-coding-clear-highlights):
Rename the BUFFER argument to BUFFER-OR-STRING, describe it as
possibly being a string in its documentation.
(default-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document that this
function does not support it.
Bind case-fold-search to nil; we don't want this to influence what the
function thinks is encodable.
(query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does; reflect this new argument in the associated compiler macro.
(query-coding-string):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does. Support the HIGHLIGHT argument correctly.
* unicode.el (unicode-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does, implement this. Document a potential problem.
Use #'query-coding-clear-highlights instead of reimplementing it
ourselves.
Remove some debugging messages.
* mule/arabic.el (iso-8859-6):
* mule/cyrillic.el (iso-8859-5):
* mule/greek.el (iso-8859-7):
* mule/hebrew.el (iso-8859-8):
* mule/latin.el (iso-8859-2):
* mule/latin.el (iso-8859-3):
* mule/latin.el (iso-8859-4):
* mule/latin.el (iso-8859-14):
* mule/latin.el (iso-8859-15):
* mule/latin.el (iso-8859-16):
* mule/latin.el (iso-8859-9):
* mule/latin.el (windows-1252):
* mule/mule-coding.el (iso-8859-1):
Avoid the assumption that characters not given an explicit mapping
in these coding systems map to the ISO 8859-1 characters
corresponding to the octets on disk; this makes it much more
reasonable to implement the IGNORE-INVALID-SEQUENCESP argument to
query-coding-region.
* mule/mule-cmds.el (set-language-info):
Correct the docstring.
* mule/mule-cmds.el (finish-set-language-environment):
Treat invalid Unicode sequences produced from
invalid-sequence-coding-system and corresponding to control
characters the same as control characters in redisplay.
* mule/mule-cmds.el:
Document that encode-coding-char is available in coding.el
* mule/mule-coding.el (make-8-bit-generate-helper):
Change to return both the encode-program generated and the
relevant non-ASCII charset; update the docstring to reflect this.
* mule/mule-coding.el
(make-8-bit-generate-encode-program-and-skip-chars-strings):
Rename this function; have it return skip-chars-strings as well as
the encode program. Have these skip-chars-strings use ranges for
charsets, where possible.
* mule/mule-coding.el (make-8-bit-create-decode-encode-tables):
Revise this to allow people to specify explicitly characters that
should be undefined (= corresponding to keys in
unicode-error-default-translation-table), and to treat unspecified
octets above #x7f as undefined by default.
* mule/mule-coding.el (8-bit-fixed-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, implement support
for it using the 8-bit-fixed-invalid-sequences-skip-chars coding
system property; remove some debugging messages.
* mule/mule-coding.el (make-8-bit-coding-system):
This function is dumped; autoloading it makes no sense.
Document what happens when characters above #x7f are not
specified, implement this.
* mule/vietnamese.el:
Correct spelling.
tests/ChangeLog addition:
2009-02-07 Aidan Kehoe <kehoea(a)parhasard.net>
* automated/query-coding-tests.el:
Add FAILING-CASE arguments to the Assert calls, making #'q-c-debug
mostly unnecessary. Remove #'q-c-debug.
Add new tests that use the IGNORE-INVALID-SEQUENCESP argument to
#'query-coding-region; rework the existing ones to respect it.
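
For readers who want to see how the new argument is meant to be called, here
is a minimal sketch. It assumes an XEmacs build that includes this changeset,
and it follows the argument order and return values documented in the
`query-coding-region' docstring in the diff below. The helper name
`my-list-unencodable-ranges' is hypothetical and is not part of the commit.

```elisp
;; Illustrative sketch only; not part of this changeset.
;; Assumes the post-changeset signature:
;;   (query-coding-region START END CODING-SYSTEM
;;                        &optional BUFFER IGNORE-INVALID-SEQUENCESP
;;                        ERRORP HIGHLIGHT)
(eval-when-compile (require 'cl))  ; for `multiple-value-bind' and `push'

(defun my-list-unencodable-ranges (coding-system
                                   &optional ignore-invalid-sequencesp)
  "Return a list of (START END REASON) entries for the current buffer.
Each entry describes text that CODING-SYSTEM cannot encode.  REASON is
the value stored in the range table returned by `query-coding-region',
documented as either `invalid-sequence' or `unencodable'.
Hypothetical helper, for illustration."
  (multiple-value-bind (encodable ranges)
      (query-coding-region (point-min) (point-max) coding-system
                           (current-buffer) ignore-invalid-sequencesp)
    (let (problems)
      ;; RANGES is nil when the whole region is encodable.
      (unless encodable
        (map-range-table
         #'(lambda (start end reason)
             (push (list start end reason) problems)
             ;; Return nil explicitly so the mapping always continues.
             nil)
         ranges))
      (nreverse problems))))
```

Calling (my-list-unencodable-ranges 'iso-8859-7) lists every problem range,
while (my-list-unencodable-ranges 'iso-8859-7 t) asks the coding system's
query function to treat invalid sequences as encodable, so only genuinely
unencodable text should be reported.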
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/ChangeLog
--- a/lisp/ChangeLog Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/ChangeLog Sat Feb 07 17:13:37 2009 +0000
@@ -1,3 +1,75 @@ 2009-02-04 Aidan Kehoe <kehoea@parhasa
+2009-02-07 Aidan Kehoe <kehoea(a)parhasard.net>
+
+ * coding.el (query-coding-clear-highlights):
+ Rename the BUFFER argument to BUFFER-OR-STRING, describe it as
+ possibly being a string in its documentation.
+ (default-query-coding-region):
+ Add a new IGNORE-INVALID-SEQUENCESP argument, document that this
+ function does not support it.
+ Bind case-fold-search to nil, we don't want this to influence what the
+ function thinks is encodable or not.
+ (query-coding-region):
+ Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
+ does; reflect this new argument in the associated compiler macro.
+ (query-coding-string):
+ Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
+ does. Support the HIGHLIGHT argument correctly.
+ * unicode.el (unicode-query-coding-region):
+ Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
+ does, implement this. Document a potential problem.
+ Use #'query-coding-clear-highlights instead of reimplementing it
+ ourselves.
+ Remove some debugging messages.
+ * mule/arabic.el (iso-8859-6):
+ * mule/cyrillic.el (iso-8859-5):
+ * mule/greek.el (iso-8859-7):
+ * mule/hebrew.el (iso-8859-8):
+ * mule/latin.el (iso-8859-2):
+ * mule/latin.el (iso-8859-3):
+ * mule/latin.el (iso-8859-4):
+ * mule/latin.el (iso-8859-14):
+ * mule/latin.el (iso-8859-15):
+ * mule/latin.el (iso-8859-16):
+ * mule/latin.el (iso-8859-9):
+ * mule/latin.el (windows-1252):
+ * mule/mule-coding.el (iso-8859-1):
+ Avoid the assumption that characters not given an explicit mapping
+ in these coding systems map to the ISO 8859-1 characters
+ corresponding to the octets on disk; this makes it much more
+ reasonable to implement the IGNORE-INVALID-SEQUENCESP argument to
+ query-coding-region.
+ * mule/mule-cmds.el (set-language-info):
+ Correct the docstring.
+ * mule/mule-cmds.el (finish-set-language-environment):
+ Treat invalid Unicode sequences produced from
+ invalid-sequence-coding-system and corresponding to control
+ characters the same as control characters in redisplay.
+ * mule/mule-cmds.el:
+ Document that encode-coding-char is available in coding.el
+ * mule/mule-coding.el (make-8-bit-generate-helper):
+ Change to return the both the encode-program generated and the
+ relevant non-ASCII charset; update the docstring to reflect this.
+ * mule/mule-coding.el
+ (make-8-bit-generate-encode-program-and-skip-chars-strings):
+ Rename this function; have it return skip-chars-strings as well as
+ the encode program. Have these skip-chars-strings use ranges for
+ charsets, where possible.
+ * mule/mule-coding.el (make-8-bit-create-decode-encode-tables):
+ Revise this to allow people to specify explicitly characters that
+ should be undefined (= corresponding to keys in
+ unicode-error-default-translation-table), and treating unspecified
+ octets above #x7f as undefined by default.
+ * mule/mule-coding.el (8-bit-fixed-query-coding-region):
+ Add a new IGNORE-INVALID-SEQUENCESP argument, implement support
+ for it using the 8-bit-fixed-invalid-sequences-skip-chars coding
+ system property; remove some debugging messages.
+ * mule/mule-coding.el (make-8-bit-coding-system):
+ This function is dumped, autoloading it makes no sense.
+ Document what happens when characters above #x7f are not
+ specified, implement this.
+ * mule/vietnamese.el:
+ Correct spelling.
+
2009-02-04 Aidan Kehoe <kehoea(a)parhasard.net>
* help.el:
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/coding.el
--- a/lisp/coding.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/coding.el Sat Feb 07 17:13:37 2009 +0000
@@ -288,11 +288,11 @@ alias, though we haven't profiled this y
#s(hash-table test equal data ())
"A map from list of charsets to `skip-chars-forward' arguments for
them.")
-(defsubst query-coding-clear-highlights (begin end &optional buffer)
+(defsubst query-coding-clear-highlights (begin end &optional buffer-or-string)
"Remove extent faces added by `query-coding-region' between BEGIN and END.
-Optional argument BUFFER is the buffer to use, and defaults to the current
-buffer.
+Optional argument BUFFER-OR-STRING is the buffer or string to use, and
+defaults to the current buffer.
The HIGHLIGHTP argument to `query-coding-region' indicates that it should
display unencodable characters using `query-coding-warning-face'. After
@@ -300,16 +300,19 @@ this function has been called, this will
(map-extents #'(lambda (extent ignored-arg)
(when (eq 'query-coding-warning-face
(extent-face extent))
- (delete-extent extent))) buffer begin end))
+ (delete-extent extent))) buffer-or-string begin end))
(defun* default-query-coding-region (begin end coding-system
- &optional buffer errorp highlightp)
+ &optional buffer ignore-invalid-sequencesp
+ errorp highlightp)
"The default `query-coding-region' implementation.
Uses the `safe-charsets' and `safe-chars' coding system properties.
The former is a list of XEmacs character sets that can be safely
encoded by CODING-SYSTEM; the latter a char table describing, in
-addition, characters that can be safely encoded by CODING-SYSTEM."
+addition, characters that can be safely encoded by CODING-SYSTEM.
+
+Does not support IGNORE-INVALID-SEQUENCESP."
(check-argument-type #'coding-system-p
(setq coding-system (find-coding-system coding-system)))
(check-argument-type #'integer-or-marker-p begin)
@@ -326,6 +329,7 @@ addition, characters that can be safely
(gethash safe-charsets
default-query-coding-region-safe-charset-skip-chars-map))
(ranges (make-range-table))
+ (case-fold-search nil)
fail-range-start fail-range-end char-after
looking-at-arg failed extent)
;; Coding systems with a value of t for safe-charsets support everything.
@@ -401,70 +405,122 @@ addition, characters that can be safely
(values t nil))))))
(defun query-coding-region (start end coding-system &optional buffer
- errorp highlight)
+ ignore-invalid-sequencesp errorp highlight)
"Work out whether CODING-SYSTEM can losslessly encode a region.
START and END are the beginning and end of the region to check.
CODING-SYSTEM is the coding system to try.
Optional argument BUFFER is the buffer to check, and defaults to the current
-buffer. Optional argument ERRORP says to signal a `text-conversion-error'
-if some character in the region cannot be encoded, and defaults to nil.
+buffer.
+
+IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs
+characters which have an unambiguous encoded representation, despite being
+undefined in what they represent, as encodable. These chiefly arise with
+variable-length encodings like UTF-8 and UTF-16, where an invalid sequence
+is passed through to XEmacs as a sequence of characters with a defined
+correspondence to the octets on disk, but no non-error semantics; see the
+`invalid-sequence-coding-system' argument to `set-language-info'.
+
+They can also arise with fixed-length encodings like ISO 8859-7, where
+certain octets on disk have undefined values, and treating them as
+corresponding to the ISO 8859-1 characters with the same numerical values
+may lead to data that is not understood by other applications.
+
+Optional argument ERRORP says to signal a `text-conversion-error' if some
+character in the region cannot be encoded, and defaults to nil.
Optional argument HIGHLIGHT says to display unencodable characters in the
region using `query-coding-warning-face'. It defaults to nil.
-This function returns a list; the intention is that callers use
+This function returns a list; the intention is that callers use
`multiple-value-bind' or the related CL multiple value functions to deal
with it. The first element is `t' if the region can be encoded using
CODING-SYSTEM, or `nil' if not. The second element is `nil' if the region
can be encoded using CODING-SYSTEM; otherwise, it is a range table
-describing the positions of the unencodable characters. See
-`make-range-table'."
+describing the positions of the unencodable characters. Ranges that
+describe characters that would be ignored were IGNORE-INVALID-SEQUENCESP
+non-nil map to the symbol `invalid-sequence'; other ranges map to the symbol
+`unencodable'. If IGNORE-INVALID-SEQUENCESP is non-nil, all ranges will map
+to the symbol `unencodable'. See `make-range-table' for more details of
+range tables."
(funcall (or (coding-system-get coding-system 'query-coding-function)
#'default-query-coding-region)
- start end coding-system buffer errorp highlight))
+ start end coding-system buffer ignore-invalid-sequencesp errorp
+ highlight))
(define-compiler-macro query-coding-region (start end coding-system
- &optional buffer errorp highlight)
+ &optional buffer
+ ignore-invalid-sequencesp
+ errorp highlight)
`(funcall (or (coding-system-get ,coding-system 'query-coding-function)
#'default-query-coding-region)
- ,start ,end ,coding-system ,@(append (if buffer (list buffer))
- (if errorp (list errorp))
- (if highlight (list highlight)))))
-
-(defun query-coding-string (string coding-system &optional errorp highlight)
+ ,start ,end ,coding-system ,@(append (when (or buffer
+ ignore-invalid-sequencesp
+ errorp highlight)
+ (list buffer))
+ (when (or ignore-invalid-sequencesp
+ errorp highlight)
+ (list ignore-invalid-sequencesp))
+ (when (or errorp highlight)
+ (list errorp))
+ (when highlight (list highlight)))))
+
+(defun query-coding-string (string coding-system &optional
+ ignore-invalid-sequencesp errorp highlight)
"Work out whether CODING-SYSTEM can losslessly encode STRING.
CODING-SYSTEM is the coding system to check.
+IGNORE-INVALID-SEQUENCESP, an optional argument, says to treat XEmacs
+characters which have an unambiguous encoded representation, despite being
+undefined in what they represent, as encodable. These chiefly arise with
+variable-length encodings like UTF-8 and UTF-16, where an invalid sequence
+is passed through to XEmacs as a sequence of characters with a defined
+correspondence to the octets on disk, but no non-error semantics; see the
+`invalid-sequence-coding-system' argument to `set-language-info'.
+
+They can also arise with fixed-length encodings like ISO 8859-7, where
+certain octets on disk have undefined values, and treating them as
+corresponding to the ISO 8859-1 characters with the same numerical values
+may lead to data that is not understood by other applications.
+
Optional argument ERRORP says to signal a `text-conversion-error' if some
character in the region cannot be encoded, and defaults to nil.
Optional argument HIGHLIGHT says to display unencodable characters in the
region using `query-coding-warning-face'. It defaults to nil.
-This function returns a list; the intention is that callers use use
+This function returns a list; the intention is that callers use
`multiple-value-bind' or the related CL multiple value functions to deal
-with it. The first element is `t' if the string can be encoded using
-CODING-SYSTEM, or `nil' if not. The second element is `nil' if the string
+with it. The first element is `t' if the region can be encoded using
+CODING-SYSTEM, or `nil' if not. The second element is `nil' if the region
can be encoded using CODING-SYSTEM; otherwise, it is a range table
-describing the positions of the unencodable characters. See
-`make-range-table'."
+describing the positions of the unencodable characters. Ranges that
+describe characters that would be ignored were IGNORE-INVALID-SEQUENCESP
+non-nil map to the symbol `invalid-sequence'; other ranges map to the symbol
+`unencodable'. If IGNORE-INVALID-SEQUENCESP is non-nil, all ranges will map
+to the symbol `unencodable'. See `make-range-table' for more details of
+range tables."
(with-temp-buffer
+ (when highlight
+ (query-coding-clear-highlights 0 (length string) string))
(insert string)
- (multiple-value-bind (result ranges)
+ (multiple-value-bind (result ranges extent)
(query-coding-region (point-min) (point-max) coding-system
(current-buffer) errorp
- ;; #### Highlight won't work here,
- ;; query-coding-region may need to be modified.
- highlight)
+ nil ignore-invalid-sequencesp)
(unless result
- ;; Sigh, string indices are zero-based, buffer offsets are
- ;; one-based.
(map-range-table
#'(lambda (begin end value)
+ ;; Sigh, string indices are zero-based, buffer offsets are
+ ;; one-based.
(remove-range-table begin end ranges)
- (put-range-table (1- begin) (1- end) value ranges))
+ (put-range-table (decf begin) (decf end) value ranges)
+ (when highlight
+ (setq extent (make-extent begin end string))
+ (set-extent-priority extent (+ mouse-highlight-priority 2))
+ (set-extent-property extent 'duplicable t)
+ (set-extent-face extent 'query-coding-warning-face)))
ranges))
(values result ranges))))
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/arabic.el
--- a/lisp/mule/arabic.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/arabic.el Sat Feb 07 17:13:37 2009 +0000
@@ -33,7 +33,39 @@
(make-8-bit-coding-system
'iso-8859-6
- '((#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
(#xA4 ?\u00A4) ;; CURRENCY SIGN
(#xAC ?\u060C) ;; ARABIC COMMA
(#xAD ?\u00AD) ;; SOFT HYPHEN
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/cyrillic.el
--- a/lisp/mule/cyrillic.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/cyrillic.el Sat Feb 07 17:13:37 2009 +0000
@@ -108,7 +108,40 @@
;; And create the coding system.
(make-8-bit-coding-system
'iso-8859-5
- '((#xA1 ?\u0401) ;; CYRILLIC CAPITAL LETTER IO
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u0401) ;; CYRILLIC CAPITAL LETTER IO
(#xA2 ?\u0402) ;; CYRILLIC CAPITAL LETTER DJE
(#xA3 ?\u0403) ;; CYRILLIC CAPITAL LETTER GJE
(#xA4 ?\u0404) ;; CYRILLIC CAPITAL LETTER UKRAINIAN IE
@@ -120,6 +153,7 @@
(#xAA ?\u040A) ;; CYRILLIC CAPITAL LETTER NJE
(#xAB ?\u040B) ;; CYRILLIC CAPITAL LETTER TSHE
(#xAC ?\u040C) ;; CYRILLIC CAPITAL LETTER KJE
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
(#xAE ?\u040E) ;; CYRILLIC CAPITAL LETTER SHORT U
(#xAF ?\u040F) ;; CYRILLIC CAPITAL LETTER DZHE
(#xB0 ?\u0410) ;; CYRILLIC CAPITAL LETTER A
@@ -205,7 +239,7 @@
"ISO-8859-5 (Cyrillic)"
'(mnemonic "ISO8/Cyr"
documentation "The ISO standard for encoding Cyrillic. Not used in practice.
-See `koi8-r' and `windows-1250'. "
+See `koi8-r' and `windows-1251'. "
aliases (cyrillic-iso-8bit)))
;; Provide this locale; but don't allow it to be picked up from the Unix
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/greek.el
--- a/lisp/mule/greek.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/greek.el Sat Feb 07 17:13:37 2009 +0000
@@ -120,19 +120,67 @@
(make-8-bit-coding-system
'iso-8859-7
- '((#xA1 ?\u2018) ;; LEFT SINGLE QUOTATION MARK
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u2018) ;; LEFT SINGLE QUOTATION MARK
(#xA2 ?\u2019) ;; RIGHT SINGLE QUOTATION MARK
+ (#xA3 ?\u00A3) ;; POUND SIGN
(#xA4 ?\u20AC) ;; EURO SIGN
(#xA5 ?\u20AF) ;; DRACHMA SIGN
+ (#xA6 ?\u00A6) ;; BROKEN BAR
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
(#xAA ?\u037A) ;; GREEK YPOGEGRAMMENI
+ (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xAC ?\u00AC) ;; NOT SIGN
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
(#xAF ?\u2015) ;; HORIZONTAL BAR
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
+ (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
+ (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
+ (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
(#xB4 ?\u0384) ;; GREEK TONOS
(#xB5 ?\u0385) ;; GREEK DIALYTIKA TONOS
(#xB6 ?\u0386) ;; GREEK CAPITAL LETTER ALPHA WITH TONOS
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
(#xB8 ?\u0388) ;; GREEK CAPITAL LETTER EPSILON WITH TONOS
(#xB9 ?\u0389) ;; GREEK CAPITAL LETTER ETA WITH TONOS
(#xBA ?\u038A) ;; GREEK CAPITAL LETTER IOTA WITH TONOS
+ (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
(#xBC ?\u038C) ;; GREEK CAPITAL LETTER OMICRON WITH TONOS
+ (#xBD ?\u00BD) ;; VULGAR FRACTION ONE HALF
(#xBE ?\u038E) ;; GREEK CAPITAL LETTER UPSILON WITH TONOS
(#xBF ?\u038F) ;; GREEK CAPITAL LETTER OMEGA WITH TONOS
(#xC0 ?\u0390) ;; GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
@@ -196,7 +244,7 @@
(#xFB ?\u03CB) ;; GREEK SMALL LETTER UPSILON WITH DIALYTIKA
(#xFC ?\u03CC) ;; GREEK SMALL LETTER OMICRON WITH TONOS
(#xFD ?\u03CD) ;; GREEK SMALL LETTER UPSILON WITH TONOS
- (#xFE ?\u03CE)) ;; GREEK SMALL LETTER OMEGA WITH TONOS
+ (#xFE ?\u03CE));; GREEK SMALL LETTER OMEGA WITH TONOS
"ISO-8859-7 (Greek)"
'(mnemonic "Grk"
aliases (greek-iso-8bit)))
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/hebrew.el
--- a/lisp/mule/hebrew.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/hebrew.el Sat Feb 07 17:13:37 2009 +0000
@@ -50,8 +50,68 @@
(make-8-bit-coding-system
'iso-8859-8
- '((#xAA ?\u00D7) ;; MULTIPLICATION SIGN
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA2 ?\u00A2) ;; CENT SIGN
+ (#xA3 ?\u00A3) ;; POUND SIGN
+ (#xA4 ?\u00A4) ;; CURRENCY SIGN
+ (#xA5 ?\u00A5) ;; YEN SIGN
+ (#xA6 ?\u00A6) ;; BROKEN BAR
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
+ (#xAA ?\u00D7) ;; MULTIPLICATION SIGN
+ (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xAC ?\u00AC) ;; NOT SIGN
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
+ (#xAE ?\u00AE) ;; REGISTERED SIGN
+ (#xAF ?\u00AF) ;; MACRON
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
+ (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
+ (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
+ (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
+ (#xB4 ?\u00B4) ;; ACUTE ACCENT
+ (#xB5 ?\u00B5) ;; MICRO SIGN
+ (#xB6 ?\u00B6) ;; PILCROW SIGN
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
+ (#xB8 ?\u00B8) ;; CEDILLA
+ (#xB9 ?\u00B9) ;; SUPERSCRIPT ONE
(#xBA ?\u00F7) ;; DIVISION SIGN
+ (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xBC ?\u00BC) ;; VULGAR FRACTION ONE QUARTER
+ (#xBD ?\u00BD) ;; VULGAR FRACTION ONE HALF
+ (#xBE ?\u00BE) ;; VULGAR FRACTION THREE QUARTERS
(#xDF ?\u2017) ;; DOUBLE LOW LINE
(#xE0 ?\u05D0) ;; HEBREW LETTER ALEF
(#xE1 ?\u05D1) ;; HEBREW LETTER BET
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/latin.el
--- a/lisp/mule/latin.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/latin.el Sat Feb 07 17:13:37 2009 +0000
@@ -126,23 +126,63 @@
(make-8-bit-coding-system
'iso-8859-2
- '((#xA1 ?\u0104) ;; LATIN CAPITAL LETTER A WITH OGONEK
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u0104) ;; LATIN CAPITAL LETTER A WITH OGONEK
(#xA2 ?\u02D8) ;; BREVE
(#xA3 ?\u0141) ;; LATIN CAPITAL LETTER L WITH STROKE
+ (#xA4 ?\u00A4) ;; CURRENCY SIGN
(#xA5 ?\u013D) ;; LATIN CAPITAL LETTER L WITH CARON
(#xA6 ?\u015A) ;; LATIN CAPITAL LETTER S WITH ACUTE
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
(#xA9 ?\u0160) ;; LATIN CAPITAL LETTER S WITH CARON
(#xAA ?\u015E) ;; LATIN CAPITAL LETTER S WITH CEDILLA
(#xAB ?\u0164) ;; LATIN CAPITAL LETTER T WITH CARON
(#xAC ?\u0179) ;; LATIN CAPITAL LETTER Z WITH ACUTE
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
(#xAE ?\u017D) ;; LATIN CAPITAL LETTER Z WITH CARON
(#xAF ?\u017B) ;; LATIN CAPITAL LETTER Z WITH DOT ABOVE
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
(#xB1 ?\u0105) ;; LATIN SMALL LETTER A WITH OGONEK
(#xB2 ?\u02DB) ;; OGONEK
(#xB3 ?\u0142) ;; LATIN SMALL LETTER L WITH STROKE
+ (#xB4 ?\u00B4) ;; ACUTE ACCENT
(#xB5 ?\u013E) ;; LATIN SMALL LETTER L WITH CARON
(#xB6 ?\u015B) ;; LATIN SMALL LETTER S WITH ACUTE
(#xB7 ?\u02C7) ;; CARON
+ (#xB8 ?\u00B8) ;; CEDILLA
(#xB9 ?\u0161) ;; LATIN SMALL LETTER S WITH CARON
(#xBA ?\u015F) ;; LATIN SMALL LETTER S WITH CEDILLA
(#xBB ?\u0165) ;; LATIN SMALL LETTER T WITH CARON
@@ -151,39 +191,70 @@
(#xBE ?\u017E) ;; LATIN SMALL LETTER Z WITH CARON
(#xBF ?\u017C) ;; LATIN SMALL LETTER Z WITH DOT ABOVE
(#xC0 ?\u0154) ;; LATIN CAPITAL LETTER R WITH ACUTE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
(#xC3 ?\u0102) ;; LATIN CAPITAL LETTER A WITH BREVE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
(#xC5 ?\u0139) ;; LATIN CAPITAL LETTER L WITH ACUTE
(#xC6 ?\u0106) ;; LATIN CAPITAL LETTER C WITH ACUTE
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
(#xC8 ?\u010C) ;; LATIN CAPITAL LETTER C WITH CARON
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
(#xCA ?\u0118) ;; LATIN CAPITAL LETTER E WITH OGONEK
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
(#xCC ?\u011A) ;; LATIN CAPITAL LETTER E WITH CARON
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
(#xCF ?\u010E) ;; LATIN CAPITAL LETTER D WITH CARON
(#xD0 ?\u0110) ;; LATIN CAPITAL LETTER D WITH STROKE
(#xD1 ?\u0143) ;; LATIN CAPITAL LETTER N WITH ACUTE
(#xD2 ?\u0147) ;; LATIN CAPITAL LETTER N WITH CARON
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
(#xD5 ?\u0150) ;; LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
+ (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
(#xD8 ?\u0158) ;; LATIN CAPITAL LETTER R WITH CARON
(#xD9 ?\u016E) ;; LATIN CAPITAL LETTER U WITH RING ABOVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
(#xDB ?\u0170) ;; LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
+ (#xDD ?\u00DD) ;; LATIN CAPITAL LETTER Y WITH ACUTE
(#xDE ?\u0162) ;; LATIN CAPITAL LETTER T WITH CEDILLA
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
(#xE0 ?\u0155) ;; LATIN SMALL LETTER R WITH ACUTE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
(#xE3 ?\u0103) ;; LATIN SMALL LETTER A WITH BREVE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
(#xE5 ?\u013A) ;; LATIN SMALL LETTER L WITH ACUTE
(#xE6 ?\u0107) ;; LATIN SMALL LETTER C WITH ACUTE
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
(#xE8 ?\u010D) ;; LATIN SMALL LETTER C WITH CARON
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
(#xEA ?\u0119) ;; LATIN SMALL LETTER E WITH OGONEK
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
(#xEC ?\u011B) ;; LATIN SMALL LETTER E WITH CARON
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
(#xEF ?\u010F) ;; LATIN SMALL LETTER D WITH CARON
(#xF0 ?\u0111) ;; LATIN SMALL LETTER D WITH STROKE
(#xF1 ?\u0144) ;; LATIN SMALL LETTER N WITH ACUTE
(#xF2 ?\u0148) ;; LATIN SMALL LETTER N WITH CARON
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
(#xF5 ?\u0151) ;; LATIN SMALL LETTER O WITH DOUBLE ACUTE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
+ (#xF7 ?\u00F7) ;; DIVISION SIGN
(#xF8 ?\u0159) ;; LATIN SMALL LETTER R WITH CARON
(#xF9 ?\u016F) ;; LATIN SMALL LETTER U WITH RING ABOVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
(#xFB ?\u0171) ;; LATIN SMALL LETTER U WITH DOUBLE ACUTE
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
+ (#xFD ?\u00FD) ;; LATIN SMALL LETTER Y WITH ACUTE
(#xFE ?\u0163) ;; LATIN SMALL LETTER T WITH CEDILLA
- (#xFF ?\u02D9));; DOT ABOVE
- "ISO-8859-2 (Latin-2) for Central Europe.
+ (#xFF ?\u02D9)) ;; DOT ABOVE
+ "ISO-8859-2 (Latin-2) for Central Europe.
See also `windows-1250', and `iso-8859-1', which is compatible with Latin 2
when used to write German (or English, of course). "
'(mnemonic "Latin 2"
@@ -391,31 +462,124 @@ See also `iso-8859-2' and `window-1252'
(make-8-bit-coding-system
'iso-8859-3
- '((#xA1 ?\u0126) ;; LATIN CAPITAL LETTER H WITH STROKE
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u0126) ;; LATIN CAPITAL LETTER H WITH STROKE
(#xA2 ?\u02D8) ;; BREVE
+ (#xA3 ?\u00A3) ;; POUND SIGN
+ (#xA4 ?\u00A4) ;; CURRENCY SIGN
(#xA6 ?\u0124) ;; LATIN CAPITAL LETTER H WITH CIRCUMFLEX
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
(#xA9 ?\u0130) ;; LATIN CAPITAL LETTER I WITH DOT ABOVE
(#xAA ?\u015E) ;; LATIN CAPITAL LETTER S WITH CEDILLA
(#xAB ?\u011E) ;; LATIN CAPITAL LETTER G WITH BREVE
(#xAC ?\u0134) ;; LATIN CAPITAL LETTER J WITH CIRCUMFLEX
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
(#xAF ?\u017B) ;; LATIN CAPITAL LETTER Z WITH DOT ABOVE
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
(#xB1 ?\u0127) ;; LATIN SMALL LETTER H WITH STROKE
+ (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
+ (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
+ (#xB4 ?\u00B4) ;; ACUTE ACCENT
+ (#xB5 ?\u00B5) ;; MICRO SIGN
(#xB6 ?\u0125) ;; LATIN SMALL LETTER H WITH CIRCUMFLEX
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
+ (#xB8 ?\u00B8) ;; CEDILLA
(#xB9 ?\u0131) ;; LATIN SMALL LETTER DOTLESS I
(#xBA ?\u015F) ;; LATIN SMALL LETTER S WITH CEDILLA
(#xBB ?\u011F) ;; LATIN SMALL LETTER G WITH BREVE
(#xBC ?\u0135) ;; LATIN SMALL LETTER J WITH CIRCUMFLEX
+ (#xBD ?\u00BD) ;; VULGAR FRACTION ONE HALF
(#xBF ?\u017C) ;; LATIN SMALL LETTER Z WITH DOT ABOVE
+ (#xC0 ?\u00C0) ;; LATIN CAPITAL LETTER A WITH GRAVE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
(#xC5 ?\u010A) ;; LATIN CAPITAL LETTER C WITH DOT ABOVE
(#xC6 ?\u0108) ;; LATIN CAPITAL LETTER C WITH CIRCUMFLEX
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
+ (#xC8 ?\u00C8) ;; LATIN CAPITAL LETTER E WITH GRAVE
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
+ (#xCA ?\u00CA) ;; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
+ (#xCC ?\u00CC) ;; LATIN CAPITAL LETTER I WITH GRAVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+ (#xCF ?\u00CF) ;; LATIN CAPITAL LETTER I WITH DIAERESIS
+ (#xD1 ?\u00D1) ;; LATIN CAPITAL LETTER N WITH TILDE
+ (#xD2 ?\u00D2) ;; LATIN CAPITAL LETTER O WITH GRAVE
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
(#xD5 ?\u0120) ;; LATIN CAPITAL LETTER G WITH DOT ABOVE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
+ (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
(#xD8 ?\u011C) ;; LATIN CAPITAL LETTER G WITH CIRCUMFLEX
+ (#xD9 ?\u00D9) ;; LATIN CAPITAL LETTER U WITH GRAVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
(#xDD ?\u016C) ;; LATIN CAPITAL LETTER U WITH BREVE
(#xDE ?\u015C) ;; LATIN CAPITAL LETTER S WITH CIRCUMFLEX
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
+ (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
(#xE5 ?\u010B) ;; LATIN SMALL LETTER C WITH DOT ABOVE
(#xE6 ?\u0109) ;; LATIN SMALL LETTER C WITH CIRCUMFLEX
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
+ (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
+ (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
+ (#xEC ?\u00EC) ;; LATIN SMALL LETTER I WITH GRAVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
+ (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
+ (#xF1 ?\u00F1) ;; LATIN SMALL LETTER N WITH TILDE
+ (#xF2 ?\u00F2) ;; LATIN SMALL LETTER O WITH GRAVE
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
(#xF5 ?\u0121) ;; LATIN SMALL LETTER G WITH DOT ABOVE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
+ (#xF7 ?\u00F7) ;; DIVISION SIGN
(#xF8 ?\u011D) ;; LATIN SMALL LETTER G WITH CIRCUMFLEX
+ (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
(#xFD ?\u016D) ;; LATIN SMALL LETTER U WITH BREVE
(#xFE ?\u015D) ;; LATIN SMALL LETTER S WITH CIRCUMFLEX
(#xFF ?\u02D9)) ;; DOT ABOVE
@@ -498,22 +662,63 @@ See also `iso-8859-2' and `window-1252'
(make-8-bit-coding-system
'iso-8859-4
- '((#xA1 ?\u0104) ;; LATIN CAPITAL LETTER A WITH OGONEK
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u0104) ;; LATIN CAPITAL LETTER A WITH OGONEK
(#xA2 ?\u0138) ;; LATIN SMALL LETTER KRA
(#xA3 ?\u0156) ;; LATIN CAPITAL LETTER R WITH CEDILLA
+ (#xA4 ?\u00A4) ;; CURRENCY SIGN
(#xA5 ?\u0128) ;; LATIN CAPITAL LETTER I WITH TILDE
(#xA6 ?\u013B) ;; LATIN CAPITAL LETTER L WITH CEDILLA
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
(#xA9 ?\u0160) ;; LATIN CAPITAL LETTER S WITH CARON
(#xAA ?\u0112) ;; LATIN CAPITAL LETTER E WITH MACRON
(#xAB ?\u0122) ;; LATIN CAPITAL LETTER G WITH CEDILLA
(#xAC ?\u0166) ;; LATIN CAPITAL LETTER T WITH STROKE
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
(#xAE ?\u017D) ;; LATIN CAPITAL LETTER Z WITH CARON
+ (#xAF ?\u00AF) ;; MACRON
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
(#xB1 ?\u0105) ;; LATIN SMALL LETTER A WITH OGONEK
(#xB2 ?\u02DB) ;; OGONEK
(#xB3 ?\u0157) ;; LATIN SMALL LETTER R WITH CEDILLA
+ (#xB4 ?\u00B4) ;; ACUTE ACCENT
(#xB5 ?\u0129) ;; LATIN SMALL LETTER I WITH TILDE
(#xB6 ?\u013C) ;; LATIN SMALL LETTER L WITH CEDILLA
(#xB7 ?\u02C7) ;; CARON
+ (#xB8 ?\u00B8) ;; CEDILLA
(#xB9 ?\u0161) ;; LATIN SMALL LETTER S WITH CARON
(#xBA ?\u0113) ;; LATIN SMALL LETTER E WITH MACRON
(#xBB ?\u0123) ;; LATIN SMALL LETTER G WITH CEDILLA
@@ -522,29 +727,66 @@ See also `iso-8859-2' and `window-1252'
(#xBE ?\u017E) ;; LATIN SMALL LETTER Z WITH CARON
(#xBF ?\u014B) ;; LATIN SMALL LETTER ENG
(#xC0 ?\u0100) ;; LATIN CAPITAL LETTER A WITH MACRON
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+ (#xC3 ?\u00C3) ;; LATIN CAPITAL LETTER A WITH TILDE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
+ (#xC5 ?\u00C5) ;; LATIN CAPITAL LETTER A WITH RING ABOVE
+ (#xC6 ?\u00C6) ;; LATIN CAPITAL LETTER AE
(#xC7 ?\u012E) ;; LATIN CAPITAL LETTER I WITH OGONEK
(#xC8 ?\u010C) ;; LATIN CAPITAL LETTER C WITH CARON
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
(#xCA ?\u0118) ;; LATIN CAPITAL LETTER E WITH OGONEK
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
(#xCC ?\u0116) ;; LATIN CAPITAL LETTER E WITH DOT ABOVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
(#xCF ?\u012A) ;; LATIN CAPITAL LETTER I WITH MACRON
(#xD0 ?\u0110) ;; LATIN CAPITAL LETTER D WITH STROKE
(#xD1 ?\u0145) ;; LATIN CAPITAL LETTER N WITH CEDILLA
(#xD2 ?\u014C) ;; LATIN CAPITAL LETTER O WITH MACRON
(#xD3 ?\u0136) ;; LATIN CAPITAL LETTER K WITH CEDILLA
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+ (#xD5 ?\u00D5) ;; LATIN CAPITAL LETTER O WITH TILDE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
+ (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
+ (#xD8 ?\u00D8) ;; LATIN CAPITAL LETTER O WITH STROKE
(#xD9 ?\u0172) ;; LATIN CAPITAL LETTER U WITH OGONEK
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
(#xDD ?\u0168) ;; LATIN CAPITAL LETTER U WITH TILDE
(#xDE ?\u016A) ;; LATIN CAPITAL LETTER U WITH MACRON
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
(#xE0 ?\u0101) ;; LATIN SMALL LETTER A WITH MACRON
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
+ (#xE3 ?\u00E3) ;; LATIN SMALL LETTER A WITH TILDE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
+ (#xE5 ?\u00E5) ;; LATIN SMALL LETTER A WITH RING ABOVE
+ (#xE6 ?\u00E6) ;; LATIN SMALL LETTER AE
(#xE7 ?\u012F) ;; LATIN SMALL LETTER I WITH OGONEK
(#xE8 ?\u010D) ;; LATIN SMALL LETTER C WITH CARON
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
(#xEA ?\u0119) ;; LATIN SMALL LETTER E WITH OGONEK
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
(#xEC ?\u0117) ;; LATIN SMALL LETTER E WITH DOT ABOVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
(#xEF ?\u012B) ;; LATIN SMALL LETTER I WITH MACRON
(#xF0 ?\u0111) ;; LATIN SMALL LETTER D WITH STROKE
(#xF1 ?\u0146) ;; LATIN SMALL LETTER N WITH CEDILLA
(#xF2 ?\u014D) ;; LATIN SMALL LETTER O WITH MACRON
(#xF3 ?\u0137) ;; LATIN SMALL LETTER K WITH CEDILLA
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
+ (#xF5 ?\u00F5) ;; LATIN SMALL LETTER O WITH TILDE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
+ (#xF7 ?\u00F7) ;; DIVISION SIGN
+ (#xF8 ?\u00F8) ;; LATIN SMALL LETTER O WITH STROKE
(#xF9 ?\u0173) ;; LATIN SMALL LETTER U WITH OGONEK
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
(#xFD ?\u0169) ;; LATIN SMALL LETTER U WITH TILDE
(#xFE ?\u016B) ;; LATIN SMALL LETTER U WITH MACRON
(#xFF ?\u02D9));; DOT ABOVE
@@ -633,15 +875,53 @@ See also `iso-8859-2' and `window-1252'
(make-8-bit-coding-system
'iso-8859-14
- '((#xA1 ?\u1E02) ;; LATIN CAPITAL LETTER B WITH DOT ABOVE
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u1E02) ;; LATIN CAPITAL LETTER B WITH DOT ABOVE
(#xA2 ?\u1E03) ;; LATIN SMALL LETTER B WITH DOT ABOVE
+ (#xA3 ?\u00A3) ;; POUND SIGN
(#xA4 ?\u010A) ;; LATIN CAPITAL LETTER C WITH DOT ABOVE
(#xA5 ?\u010B) ;; LATIN SMALL LETTER C WITH DOT ABOVE
(#xA6 ?\u1E0A) ;; LATIN CAPITAL LETTER D WITH DOT ABOVE
+ (#xA7 ?\u00A7) ;; SECTION SIGN
(#xA8 ?\u1E80) ;; LATIN CAPITAL LETTER W WITH GRAVE
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
(#xAA ?\u1E82) ;; LATIN CAPITAL LETTER W WITH ACUTE
(#xAB ?\u1E0B) ;; LATIN SMALL LETTER D WITH DOT ABOVE
(#xAC ?\u1EF2) ;; LATIN CAPITAL LETTER Y WITH GRAVE
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
+ (#xAE ?\u00AE) ;; REGISTERED SIGN
(#xAF ?\u0178) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS
(#xB0 ?\u1E1E) ;; LATIN CAPITAL LETTER F WITH DOT ABOVE
(#xB1 ?\u1E1F) ;; LATIN SMALL LETTER F WITH DOT ABOVE
@@ -649,6 +929,7 @@ See also `iso-8859-2' and `window-1252'
(#xB3 ?\u0121) ;; LATIN SMALL LETTER G WITH DOT ABOVE
(#xB4 ?\u1E40) ;; LATIN CAPITAL LETTER M WITH DOT ABOVE
(#xB5 ?\u1E41) ;; LATIN SMALL LETTER M WITH DOT ABOVE
+ (#xB6 ?\u00B6) ;; PILCROW SIGN
(#xB7 ?\u1E56) ;; LATIN CAPITAL LETTER P WITH DOT ABOVE
(#xB8 ?\u1E81) ;; LATIN SMALL LETTER W WITH GRAVE
(#xB9 ?\u1E57) ;; LATIN SMALL LETTER P WITH DOT ABOVE
@@ -658,12 +939,70 @@ See also `iso-8859-2' and `window-1252'
(#xBD ?\u1E84) ;; LATIN CAPITAL LETTER W WITH DIAERESIS
(#xBE ?\u1E85) ;; LATIN SMALL LETTER W WITH DIAERESIS
(#xBF ?\u1E61) ;; LATIN SMALL LETTER S WITH DOT ABOVE
+ (#xC0 ?\u00C0) ;; LATIN CAPITAL LETTER A WITH GRAVE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+ (#xC3 ?\u00C3) ;; LATIN CAPITAL LETTER A WITH TILDE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
+ (#xC5 ?\u00C5) ;; LATIN CAPITAL LETTER A WITH RING ABOVE
+ (#xC6 ?\u00C6) ;; LATIN CAPITAL LETTER AE
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
+ (#xC8 ?\u00C8) ;; LATIN CAPITAL LETTER E WITH GRAVE
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
+ (#xCA ?\u00CA) ;; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
+ (#xCC ?\u00CC) ;; LATIN CAPITAL LETTER I WITH GRAVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+ (#xCF ?\u00CF) ;; LATIN CAPITAL LETTER I WITH DIAERESIS
(#xD0 ?\u0174) ;; LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+ (#xD1 ?\u00D1) ;; LATIN CAPITAL LETTER N WITH TILDE
+ (#xD2 ?\u00D2) ;; LATIN CAPITAL LETTER O WITH GRAVE
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+ (#xD5 ?\u00D5) ;; LATIN CAPITAL LETTER O WITH TILDE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
(#xD7 ?\u1E6A) ;; LATIN CAPITAL LETTER T WITH DOT ABOVE
+ (#xD8 ?\u00D8) ;; LATIN CAPITAL LETTER O WITH STROKE
+ (#xD9 ?\u00D9) ;; LATIN CAPITAL LETTER U WITH GRAVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
+ (#xDD ?\u00DD) ;; LATIN CAPITAL LETTER Y WITH ACUTE
(#xDE ?\u0176) ;; LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
+ (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
+ (#xE3 ?\u00E3) ;; LATIN SMALL LETTER A WITH TILDE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
+ (#xE5 ?\u00E5) ;; LATIN SMALL LETTER A WITH RING ABOVE
+ (#xE6 ?\u00E6) ;; LATIN SMALL LETTER AE
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
+ (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
+ (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
+ (#xEC ?\u00EC) ;; LATIN SMALL LETTER I WITH GRAVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
+ (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
(#xF0 ?\u0175) ;; LATIN SMALL LETTER W WITH CIRCUMFLEX
+ (#xF1 ?\u00F1) ;; LATIN SMALL LETTER N WITH TILDE
+ (#xF2 ?\u00F2) ;; LATIN SMALL LETTER O WITH GRAVE
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
+ (#xF5 ?\u00F5) ;; LATIN SMALL LETTER O WITH TILDE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
(#xF7 ?\u1E6B) ;; LATIN SMALL LETTER T WITH DOT ABOVE
- (#xFE ?\u0177)) ;; LATIN SMALL LETTER Y WITH CIRCUMFLEX
+ (#xF8 ?\u00F8) ;; LATIN SMALL LETTER O WITH STROKE
+ (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
+ (#xFD ?\u00FD) ;; LATIN SMALL LETTER Y WITH ACUTE
+ (#xFE ?\u0177) ;; LATIN SMALL LETTER Y WITH CIRCUMFLEX
+ (#xFF ?\u00FF)) ;; LATIN SMALL LETTER Y WITH DIAERESIS
"ISO-8859-14 (Latin-8)"
'(mnemonic "Latin 8"
aliases (iso-latin-8 latin-8)))
@@ -742,14 +1081,134 @@ See also `iso-8859-2' and `window-1252'
(make-8-bit-coding-system
'iso-8859-15
- '((#xA4 ?\u20AC) ;; EURO SIGN
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u00A1) ;; INVERTED EXCLAMATION MARK
+ (#xA2 ?\u00A2) ;; CENT SIGN
+ (#xA3 ?\u00A3) ;; POUND SIGN
+ (#xA4 ?\u20AC) ;; EURO SIGN
+ (#xA5 ?\u00A5) ;; YEN SIGN
(#xA6 ?\u0160) ;; LATIN CAPITAL LETTER S WITH CARON
+ (#xA7 ?\u00A7) ;; SECTION SIGN
(#xA8 ?\u0161) ;; LATIN SMALL LETTER S WITH CARON
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
+ (#xAA ?\u00AA) ;; FEMININE ORDINAL INDICATOR
+ (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xAC ?\u00AC) ;; NOT SIGN
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
+ (#xAE ?\u00AE) ;; REGISTERED SIGN
+ (#xAF ?\u00AF) ;; MACRON
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
+ (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
+ (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
+ (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
(#xB4 ?\u017D) ;; LATIN CAPITAL LETTER Z WITH CARON
+ (#xB5 ?\u00B5) ;; MICRO SIGN
+ (#xB6 ?\u00B6) ;; PILCROW SIGN
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
(#xB8 ?\u017E) ;; LATIN SMALL LETTER Z WITH CARON
+ (#xB9 ?\u00B9) ;; SUPERSCRIPT ONE
+ (#xBA ?\u00BA) ;; MASCULINE ORDINAL INDICATOR
+ (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
(#xBC ?\u0152) ;; LATIN CAPITAL LIGATURE OE
(#xBD ?\u0153) ;; LATIN SMALL LIGATURE OE
- (#xBE ?\u0178)) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS
+ (#xBE ?\u0178) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS
+ (#xBF ?\u00BF) ;; INVERTED QUESTION MARK
+ (#xC0 ?\u00C0) ;; LATIN CAPITAL LETTER A WITH GRAVE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+ (#xC3 ?\u00C3) ;; LATIN CAPITAL LETTER A WITH TILDE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
+ (#xC5 ?\u00C5) ;; LATIN CAPITAL LETTER A WITH RING ABOVE
+ (#xC6 ?\u00C6) ;; LATIN CAPITAL LETTER AE
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
+ (#xC8 ?\u00C8) ;; LATIN CAPITAL LETTER E WITH GRAVE
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
+ (#xCA ?\u00CA) ;; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
+ (#xCC ?\u00CC) ;; LATIN CAPITAL LETTER I WITH GRAVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+ (#xCF ?\u00CF) ;; LATIN CAPITAL LETTER I WITH DIAERESIS
+ (#xD0 ?\u00D0) ;; LATIN CAPITAL LETTER ETH
+ (#xD1 ?\u00D1) ;; LATIN CAPITAL LETTER N WITH TILDE
+ (#xD2 ?\u00D2) ;; LATIN CAPITAL LETTER O WITH GRAVE
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+ (#xD5 ?\u00D5) ;; LATIN CAPITAL LETTER O WITH TILDE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
+ (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
+ (#xD8 ?\u00D8) ;; LATIN CAPITAL LETTER O WITH STROKE
+ (#xD9 ?\u00D9) ;; LATIN CAPITAL LETTER U WITH GRAVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
+ (#xDD ?\u00DD) ;; LATIN CAPITAL LETTER Y WITH ACUTE
+ (#xDE ?\u00DE) ;; LATIN CAPITAL LETTER THORN
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
+ (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
+ (#xE3 ?\u00E3) ;; LATIN SMALL LETTER A WITH TILDE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
+ (#xE5 ?\u00E5) ;; LATIN SMALL LETTER A WITH RING ABOVE
+ (#xE6 ?\u00E6) ;; LATIN SMALL LETTER AE
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
+ (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
+ (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
+ (#xEC ?\u00EC) ;; LATIN SMALL LETTER I WITH GRAVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
+ (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
+ (#xF0 ?\u00F0) ;; LATIN SMALL LETTER ETH
+ (#xF1 ?\u00F1) ;; LATIN SMALL LETTER N WITH TILDE
+ (#xF2 ?\u00F2) ;; LATIN SMALL LETTER O WITH GRAVE
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
+ (#xF5 ?\u00F5) ;; LATIN SMALL LETTER O WITH TILDE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
+ (#xF7 ?\u00F7) ;; DIVISION SIGN
+ (#xF8 ?\u00F8) ;; LATIN SMALL LETTER O WITH STROKE
+ (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
+ (#xFD ?\u00FD) ;; LATIN SMALL LETTER Y WITH ACUTE
+ (#xFE ?\u00FE) ;; LATIN SMALL LETTER THORN
+ (#xFF ?\u00FF)) ;; LATIN SMALL LETTER Y WITH DIAERESIS
"ISO 4873 conforming 8-bit code (ASCII + Latin 9; aka Latin-1 with Euro)"
'(mnemonic "Latin 9"
aliases (iso-latin-9 latin-9 latin-0)))
@@ -852,46 +1311,134 @@ See also `iso-8859-2' and `window-1252'
;; Add a coding system for ISO 8859-16.
(make-8-bit-coding-system
'iso-8859-16
- '((#xA1 ?\u0104) ;; LATIN CAPITAL LETTER A WITH OGONEK
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u0104) ;; LATIN CAPITAL LETTER A WITH OGONEK
(#xA2 ?\u0105) ;; LATIN SMALL LETTER A WITH OGONEK
(#xA3 ?\u0141) ;; LATIN CAPITAL LETTER L WITH STROKE
(#xA4 ?\u20AC) ;; EURO SIGN
(#xA5 ?\u201E) ;; DOUBLE LOW-9 QUOTATION MARK
(#xA6 ?\u0160) ;; LATIN CAPITAL LETTER S WITH CARON
+ (#xA7 ?\u00A7) ;; SECTION SIGN
(#xA8 ?\u0161) ;; LATIN SMALL LETTER S WITH CARON
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
(#xAA ?\u0218) ;; LATIN CAPITAL LETTER S WITH COMMA BELOW
+ (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
(#xAC ?\u0179) ;; LATIN CAPITAL LETTER Z WITH ACUTE
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
(#xAE ?\u017A) ;; LATIN SMALL LETTER Z WITH ACUTE
(#xAF ?\u017B) ;; LATIN CAPITAL LETTER Z WITH DOT ABOVE
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
+ (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
(#xB2 ?\u010C) ;; LATIN CAPITAL LETTER C WITH CARON
(#xB3 ?\u0142) ;; LATIN SMALL LETTER L WITH STROKE
(#xB4 ?\u017D) ;; LATIN CAPITAL LETTER Z WITH CARON
(#xB5 ?\u201D) ;; RIGHT DOUBLE QUOTATION MARK
+ (#xB6 ?\u00B6) ;; PILCROW SIGN
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
(#xB8 ?\u017E) ;; LATIN SMALL LETTER Z WITH CARON
(#xB9 ?\u010D) ;; LATIN SMALL LETTER C WITH CARON
(#xBA ?\u0219) ;; LATIN SMALL LETTER S WITH COMMA BELOW
+ (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
(#xBC ?\u0152) ;; LATIN CAPITAL LIGATURE OE
(#xBD ?\u0153) ;; LATIN SMALL LIGATURE OE
(#xBE ?\u0178) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS
(#xBF ?\u017C) ;; LATIN SMALL LETTER Z WITH DOT ABOVE
+ (#xC0 ?\u00C0) ;; LATIN CAPITAL LETTER A WITH GRAVE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
(#xC3 ?\u0102) ;; LATIN CAPITAL LETTER A WITH BREVE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
(#xC5 ?\u0106) ;; LATIN CAPITAL LETTER C WITH ACUTE
+ (#xC6 ?\u00C6) ;; LATIN CAPITAL LETTER AE
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
+ (#xC8 ?\u00C8) ;; LATIN CAPITAL LETTER E WITH GRAVE
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
+ (#xCA ?\u00CA) ;; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
+ (#xCC ?\u00CC) ;; LATIN CAPITAL LETTER I WITH GRAVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+ (#xCF ?\u00CF) ;; LATIN CAPITAL LETTER I WITH DIAERESIS
(#xD0 ?\u0110) ;; LATIN CAPITAL LETTER D WITH STROKE
(#xD1 ?\u0143) ;; LATIN CAPITAL LETTER N WITH ACUTE
+ (#xD2 ?\u00D2) ;; LATIN CAPITAL LETTER O WITH GRAVE
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
(#xD5 ?\u0150) ;; LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
(#xD7 ?\u015A) ;; LATIN CAPITAL LETTER S WITH ACUTE
(#xD8 ?\u0170) ;; LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+ (#xD9 ?\u00D9) ;; LATIN CAPITAL LETTER U WITH GRAVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
(#xDD ?\u0118) ;; LATIN CAPITAL LETTER E WITH OGONEK
(#xDE ?\u021A) ;; LATIN CAPITAL LETTER T WITH COMMA BELOW
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
+ (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
(#xE3 ?\u0103) ;; LATIN SMALL LETTER A WITH BREVE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
(#xE5 ?\u0107) ;; LATIN SMALL LETTER C WITH ACUTE
+ (#xE6 ?\u00E6) ;; LATIN SMALL LETTER AE
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
+ (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
+ (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
+ (#xEC ?\u00EC) ;; LATIN SMALL LETTER I WITH GRAVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
+ (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
(#xF0 ?\u0111) ;; LATIN SMALL LETTER D WITH STROKE
(#xF1 ?\u0144) ;; LATIN SMALL LETTER N WITH ACUTE
+ (#xF2 ?\u00F2) ;; LATIN SMALL LETTER O WITH GRAVE
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
(#xF5 ?\u0151) ;; LATIN SMALL LETTER O WITH DOUBLE ACUTE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
(#xF7 ?\u015B) ;; LATIN SMALL LETTER S WITH ACUTE
(#xF8 ?\u0171) ;; LATIN SMALL LETTER U WITH DOUBLE ACUTE
+ (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
(#xFD ?\u0119) ;; LATIN SMALL LETTER E WITH OGONEK
- (#xFE ?\u021B)) ;; LATIN SMALL LETTER T WITH COMMA BELOW
+ (#xFE ?\u021B) ;; LATIN SMALL LETTER T WITH COMMA BELOW
+ (#xFF ?\u00FF)) ;; LATIN SMALL LETTER Y WITH DIAERESIS
"ISO-8859-16 (Latin-10)"
'(mnemonic "Latin 10"
aliases (iso-latin-10)))
@@ -972,12 +1519,134 @@ See also `iso-8859-2' and `window-1252'
(make-8-bit-coding-system
'iso-8859-9
- '((#xD0 ?\u011E) ;; LATIN CAPITAL LETTER G WITH BREVE
+ '((#x80 ?\u0080) ;; <control>
+ (#x81 ?\u0081) ;; <control>
+ (#x82 ?\u0082) ;; <control>
+ (#x83 ?\u0083) ;; <control>
+ (#x84 ?\u0084) ;; <control>
+ (#x85 ?\u0085) ;; <control>
+ (#x86 ?\u0086) ;; <control>
+ (#x87 ?\u0087) ;; <control>
+ (#x88 ?\u0088) ;; <control>
+ (#x89 ?\u0089) ;; <control>
+ (#x8A ?\u008A) ;; <control>
+ (#x8B ?\u008B) ;; <control>
+ (#x8C ?\u008C) ;; <control>
+ (#x8D ?\u008D) ;; <control>
+ (#x8E ?\u008E) ;; <control>
+ (#x8F ?\u008F) ;; <control>
+ (#x90 ?\u0090) ;; <control>
+ (#x91 ?\u0091) ;; <control>
+ (#x92 ?\u0092) ;; <control>
+ (#x93 ?\u0093) ;; <control>
+ (#x94 ?\u0094) ;; <control>
+ (#x95 ?\u0095) ;; <control>
+ (#x96 ?\u0096) ;; <control>
+ (#x97 ?\u0097) ;; <control>
+ (#x98 ?\u0098) ;; <control>
+ (#x99 ?\u0099) ;; <control>
+ (#x9A ?\u009A) ;; <control>
+ (#x9B ?\u009B) ;; <control>
+ (#x9C ?\u009C) ;; <control>
+ (#x9D ?\u009D) ;; <control>
+ (#x9E ?\u009E) ;; <control>
+ (#x9F ?\u009F) ;; <control>
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u00A1) ;; INVERTED EXCLAMATION MARK
+ (#xA2 ?\u00A2) ;; CENT SIGN
+ (#xA3 ?\u00A3) ;; POUND SIGN
+ (#xA4 ?\u00A4) ;; CURRENCY SIGN
+ (#xA5 ?\u00A5) ;; YEN SIGN
+ (#xA6 ?\u00A6) ;; BROKEN BAR
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
+ (#xAA ?\u00AA) ;; FEMININE ORDINAL INDICATOR
+ (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xAC ?\u00AC) ;; NOT SIGN
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
+ (#xAE ?\u00AE) ;; REGISTERED SIGN
+ (#xAF ?\u00AF) ;; MACRON
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
+ (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
+ (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
+ (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
+ (#xB4 ?\u00B4) ;; ACUTE ACCENT
+ (#xB5 ?\u00B5) ;; MICRO SIGN
+ (#xB6 ?\u00B6) ;; PILCROW SIGN
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
+ (#xB8 ?\u00B8) ;; CEDILLA
+ (#xB9 ?\u00B9) ;; SUPERSCRIPT ONE
+ (#xBA ?\u00BA) ;; MASCULINE ORDINAL INDICATOR
+ (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xBC ?\u00BC) ;; VULGAR FRACTION ONE QUARTER
+ (#xBD ?\u00BD) ;; VULGAR FRACTION ONE HALF
+ (#xBE ?\u00BE) ;; VULGAR FRACTION THREE QUARTERS
+ (#xBF ?\u00BF) ;; INVERTED QUESTION MARK
+ (#xC0 ?\u00C0) ;; LATIN CAPITAL LETTER A WITH GRAVE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+ (#xC3 ?\u00C3) ;; LATIN CAPITAL LETTER A WITH TILDE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
+ (#xC5 ?\u00C5) ;; LATIN CAPITAL LETTER A WITH RING ABOVE
+ (#xC6 ?\u00C6) ;; LATIN CAPITAL LETTER AE
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
+ (#xC8 ?\u00C8) ;; LATIN CAPITAL LETTER E WITH GRAVE
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
+ (#xCA ?\u00CA) ;; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
+ (#xCC ?\u00CC) ;; LATIN CAPITAL LETTER I WITH GRAVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+ (#xCF ?\u00CF) ;; LATIN CAPITAL LETTER I WITH DIAERESIS
+ (#xD0 ?\u011E) ;; LATIN CAPITAL LETTER G WITH BREVE
+ (#xD1 ?\u00D1) ;; LATIN CAPITAL LETTER N WITH TILDE
+ (#xD2 ?\u00D2) ;; LATIN CAPITAL LETTER O WITH GRAVE
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+ (#xD5 ?\u00D5) ;; LATIN CAPITAL LETTER O WITH TILDE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
+ (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
+ (#xD8 ?\u00D8) ;; LATIN CAPITAL LETTER O WITH STROKE
+ (#xD9 ?\u00D9) ;; LATIN CAPITAL LETTER U WITH GRAVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
(#xDD ?\u0130) ;; LATIN CAPITAL LETTER I WITH DOT ABOVE
(#xDE ?\u015E) ;; LATIN CAPITAL LETTER S WITH CEDILLA
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
+ (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
+ (#xE3 ?\u00E3) ;; LATIN SMALL LETTER A WITH TILDE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
+ (#xE5 ?\u00E5) ;; LATIN SMALL LETTER A WITH RING ABOVE
+ (#xE6 ?\u00E6) ;; LATIN SMALL LETTER AE
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
+ (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
+ (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
+ (#xEC ?\u00EC) ;; LATIN SMALL LETTER I WITH GRAVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
+ (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
(#xF0 ?\u011F) ;; LATIN SMALL LETTER G WITH BREVE
+ (#xF1 ?\u00F1) ;; LATIN SMALL LETTER N WITH TILDE
+ (#xF2 ?\u00F2) ;; LATIN SMALL LETTER O WITH GRAVE
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
+ (#xF5 ?\u00F5) ;; LATIN SMALL LETTER O WITH TILDE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
+ (#xF7 ?\u00F7) ;; DIVISION SIGN
+ (#xF8 ?\u00F8) ;; LATIN SMALL LETTER O WITH STROKE
+ (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
(#xFD ?\u0131) ;; LATIN SMALL LETTER DOTLESS I
- (#xFE ?\u015F)) ;; LATIN SMALL LETTER S WITH CEDILLA
+ (#xFE ?\u015F) ;; LATIN SMALL LETTER S WITH CEDILLA
+ (#xFF ?\u00FF)) ;; LATIN SMALL LETTER Y WITH DIAERESIS
"ISO-8859-9 (Latin-5)"
'(mnemonic "Latin 5"
aliases (iso-latin-5 latin-5)))
@@ -1270,7 +1939,103 @@ This language environment supports %s. "
(#x9B ?\u203A) ;; SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
(#x9C ?\u0153) ;; LATIN SMALL LIGATURE OE
(#x9E ?\u017E) ;; LATIN SMALL LETTER Z WITH CARON
- (#x9F ?\u0178));; LATIN CAPITAL LETTER Y WITH DIAERESIS
+ (#x9F ?\u0178) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS
+ (#xA0 ?\u00A0) ;; NO-BREAK SPACE
+ (#xA1 ?\u00A1) ;; INVERTED EXCLAMATION MARK
+ (#xA2 ?\u00A2) ;; CENT SIGN
+ (#xA3 ?\u00A3) ;; POUND SIGN
+ (#xA4 ?\u00A4) ;; CURRENCY SIGN
+ (#xA5 ?\u00A5) ;; YEN SIGN
+ (#xA6 ?\u00A6) ;; BROKEN BAR
+ (#xA7 ?\u00A7) ;; SECTION SIGN
+ (#xA8 ?\u00A8) ;; DIAERESIS
+ (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
+ (#xAA ?\u00AA) ;; FEMININE ORDINAL INDICATOR
+ (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xAC ?\u00AC) ;; NOT SIGN
+ (#xAD ?\u00AD) ;; SOFT HYPHEN
+ (#xAE ?\u00AE) ;; REGISTERED SIGN
+ (#xAF ?\u00AF) ;; MACRON
+ (#xB0 ?\u00B0) ;; DEGREE SIGN
+ (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
+ (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
+ (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
+ (#xB4 ?\u00B4) ;; ACUTE ACCENT
+ (#xB5 ?\u00B5) ;; MICRO SIGN
+ (#xB6 ?\u00B6) ;; PILCROW SIGN
+ (#xB7 ?\u00B7) ;; MIDDLE DOT
+ (#xB8 ?\u00B8) ;; CEDILLA
+ (#xB9 ?\u00B9) ;; SUPERSCRIPT ONE
+ (#xBA ?\u00BA) ;; MASCULINE ORDINAL INDICATOR
+ (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xBC ?\u00BC) ;; VULGAR FRACTION ONE QUARTER
+ (#xBD ?\u00BD) ;; VULGAR FRACTION ONE HALF
+ (#xBE ?\u00BE) ;; VULGAR FRACTION THREE QUARTERS
+ (#xBF ?\u00BF) ;; INVERTED QUESTION MARK
+ (#xC0 ?\u00C0) ;; LATIN CAPITAL LETTER A WITH GRAVE
+ (#xC1 ?\u00C1) ;; LATIN CAPITAL LETTER A WITH ACUTE
+ (#xC2 ?\u00C2) ;; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+ (#xC3 ?\u00C3) ;; LATIN CAPITAL LETTER A WITH TILDE
+ (#xC4 ?\u00C4) ;; LATIN CAPITAL LETTER A WITH DIAERESIS
+ (#xC5 ?\u00C5) ;; LATIN CAPITAL LETTER A WITH RING ABOVE
+ (#xC6 ?\u00C6) ;; LATIN CAPITAL LETTER AE
+ (#xC7 ?\u00C7) ;; LATIN CAPITAL LETTER C WITH CEDILLA
+ (#xC8 ?\u00C8) ;; LATIN CAPITAL LETTER E WITH GRAVE
+ (#xC9 ?\u00C9) ;; LATIN CAPITAL LETTER E WITH ACUTE
+ (#xCA ?\u00CA) ;; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+ (#xCB ?\u00CB) ;; LATIN CAPITAL LETTER E WITH DIAERESIS
+ (#xCC ?\u00CC) ;; LATIN CAPITAL LETTER I WITH GRAVE
+ (#xCD ?\u00CD) ;; LATIN CAPITAL LETTER I WITH ACUTE
+ (#xCE ?\u00CE) ;; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+ (#xCF ?\u00CF) ;; LATIN CAPITAL LETTER I WITH DIAERESIS
+ (#xD0 ?\u00D0) ;; LATIN CAPITAL LETTER ETH
+ (#xD1 ?\u00D1) ;; LATIN CAPITAL LETTER N WITH TILDE
+ (#xD2 ?\u00D2) ;; LATIN CAPITAL LETTER O WITH GRAVE
+ (#xD3 ?\u00D3) ;; LATIN CAPITAL LETTER O WITH ACUTE
+ (#xD4 ?\u00D4) ;; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+ (#xD5 ?\u00D5) ;; LATIN CAPITAL LETTER O WITH TILDE
+ (#xD6 ?\u00D6) ;; LATIN CAPITAL LETTER O WITH DIAERESIS
+ (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
+ (#xD8 ?\u00D8) ;; LATIN CAPITAL LETTER O WITH STROKE
+ (#xD9 ?\u00D9) ;; LATIN CAPITAL LETTER U WITH GRAVE
+ (#xDA ?\u00DA) ;; LATIN CAPITAL LETTER U WITH ACUTE
+ (#xDB ?\u00DB) ;; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ (#xDC ?\u00DC) ;; LATIN CAPITAL LETTER U WITH DIAERESIS
+ (#xDD ?\u00DD) ;; LATIN CAPITAL LETTER Y WITH ACUTE
+ (#xDE ?\u00DE) ;; LATIN CAPITAL LETTER THORN
+ (#xDF ?\u00DF) ;; LATIN SMALL LETTER SHARP S
+ (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
+ (#xE1 ?\u00E1) ;; LATIN SMALL LETTER A WITH ACUTE
+ (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
+ (#xE3 ?\u00E3) ;; LATIN SMALL LETTER A WITH TILDE
+ (#xE4 ?\u00E4) ;; LATIN SMALL LETTER A WITH DIAERESIS
+ (#xE5 ?\u00E5) ;; LATIN SMALL LETTER A WITH RING ABOVE
+ (#xE6 ?\u00E6) ;; LATIN SMALL LETTER AE
+ (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
+ (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
+ (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
+ (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
+ (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
+ (#xEC ?\u00EC) ;; LATIN SMALL LETTER I WITH GRAVE
+ (#xED ?\u00ED) ;; LATIN SMALL LETTER I WITH ACUTE
+ (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
+ (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
+ (#xF0 ?\u00F0) ;; LATIN SMALL LETTER ETH
+ (#xF1 ?\u00F1) ;; LATIN SMALL LETTER N WITH TILDE
+ (#xF2 ?\u00F2) ;; LATIN SMALL LETTER O WITH GRAVE
+ (#xF3 ?\u00F3) ;; LATIN SMALL LETTER O WITH ACUTE
+ (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
+ (#xF5 ?\u00F5) ;; LATIN SMALL LETTER O WITH TILDE
+ (#xF6 ?\u00F6) ;; LATIN SMALL LETTER O WITH DIAERESIS
+ (#xF7 ?\u00F7) ;; DIVISION SIGN
+ (#xF8 ?\u00F8) ;; LATIN SMALL LETTER O WITH STROKE
+ (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
+ (#xFA ?\u00FA) ;; LATIN SMALL LETTER U WITH ACUTE
+ (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
+ (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
+ (#xFD ?\u00FD) ;; LATIN SMALL LETTER Y WITH ACUTE
+ (#xFE ?\u00FE) ;; LATIN SMALL LETTER THORN
+ (#xFF ?\u00FF));; LATIN SMALL LETTER Y WITH DIAERESIS
"Microsoft's extension of iso-8859-1 for Western Europe and the Americas.
"
'(mnemonic "cp1252"
aliases (cp1252)))
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/mule-cmds.el
--- a/lisp/mule/mule-cmds.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/mule-cmds.el Sat Feb 07 17:13:37 2009 +0000
@@ -231,9 +231,9 @@ Meaningful values for PROP include
VALUE is a fixed-width 8-bit coding system used to
display Unicode error sequences (using a face to make
it clear that the data is invalid). In Western Europe
- this is normally windows-1252; in the Russia and the
- former Soviet Union koi8-ru or windows-1251 makes more
- sense."
+ and the Americas this is normally windows-1252; in
+ Russia and the former Soviet Union koi8-ru or
+ windows-1251 makes more sense."
(if (symbolp lang-env)
(setq lang-env (symbol-name lang-env)))
(let (lang-slot prop-slot)
@@ -771,7 +771,7 @@ the language environment for the major l
(let ((invalid-sequence-coding-system
(get-language-info language-name 'invalid-sequence-coding-system))
(disp-table (specifier-instance current-display-table))
- glyph string)
+ glyph string unicode-error-lookup)
(when (consp invalid-sequence-coding-system)
(setq invalid-sequence-coding-system
(car invalid-sequence-coding-system)))
@@ -779,9 +779,15 @@ the language environment for the major l
#'(lambda (key entry)
(setq string (decode-coding-string (string entry)
invalid-sequence-coding-system))
- ;; Treat control characters specially:
- (when (string-match "^[\x00-\x1f\x80-\x9f]$" string)
- (setq string (format "^%c" (+ ?@ (aref string 0)))))
+ (when (= 1 (length string))
+ ;; Treat control characters specially:
+ (cond
+ ((string-match "^[\x00-\x1f\x80-\x9f]$" string)
+ (setq string (format "^%c" (+ ?@ (aref string 0)))))
+ ((setq unicode-error-lookup
+ (get-char-table (aref string 0)
+ unicode-error-default-translation-table))
+ (setq string (format "^%c" (+ ?@ unicode-error-lookup))))))
(setq glyph (make-glyph (vector 'string :data string)))
(set-glyph-face glyph 'unicode-invalid-sequence-warning-face)
(put-char-table key glyph disp-table)
@@ -939,7 +945,7 @@ It can be retrieved with `(get-char-code
(defun encoded-string-description (str coding-system)
"Return a pretty description of STR that is encoded by CODING-SYSTEM."
-; (setq str (string-as-unibyte str))
+ ;; XEmacs; no transformation to unibyte.
(mapconcat
(if (and coding-system (eq (coding-system-type coding-system) 'iso2022))
;; Try to get a pretty description for ISO 2022 escape sequences.
@@ -948,36 +954,8 @@ It can be retrieved with `(get-char-code
(function (lambda (x) (format "#x%02X" x))))
str " "))
-;; (defun encode-coding-char (char coding-system)
-;; "Encode CHAR by CODING-SYSTEM and return the resulting string.
-;; If CODING-SYSTEM can't safely encode CHAR, return nil."
-;; (if (cmpcharp char)
-;; (setq char (car (decompose-composite-char char 'list))))
-;; (let ((str1 (char-to-string char))
-;; (str2 (make-string 2 char))
-;; (safe-charsets (and coding-system
-;; (coding-system-get coding-system 'safe-charsets)))
-;; enc1 enc2 i1 i2)
-;; (when (or (eq safe-charsets t)
-;; (memq (char-charset char) safe-charsets))
-;; ;; We must find the encoded string of CHAR. But, just encoding
-;; ;; CHAR will put extra control sequences (usually to designate
-;; ;; ASCII charset) at the tail if type of CODING is ISO 2022.
-;; ;; To exclude such tailing bytes, we at first encode one-char
-;; ;; string and two-char string, then check how many bytes at the
-;; ;; tail of both encoded strings are the same.
-;;
-;; (setq enc1 (string-as-unibyte (encode-coding-string str1 coding-system))
-;; i1 (length enc1)
-;; enc2 (string-as-unibyte (encode-coding-string str2 coding-system))
-;; i2 (length enc2))
-;; (while (and (> i1 0) (= (aref enc1 (1- i1)) (aref enc2 (1- i2))))
-;; (setq i1 (1- i1) i2 (1- i2)))
-;;
-;; ;; Now (substring enc1 i1) and (substring enc2 i2) are the same,
-;; ;; and they are the extra control sequences at the tail to
-;; ;; exclude.
-;; (substring enc2 0 i2))))
+;; XEmacs;
+;; (defun encode-coding-char (char coding-system) in coding.el.
;; #### The following section is utter junk from mule-misc.el.
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/mule-coding.el
--- a/lisp/mule/mule-coding.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/mule-coding.el Sat Feb 07 17:13:37 2009 +0000
@@ -231,11 +231,13 @@ a file is read, not changed, and then wr
(defun make-8-bit-generate-helper (decode-table encode-table
encode-failure-octet)
- "Helper function for `make-8-bit-generate-encode-program', which see.
+ "Helper function, `make-8-bit-generate-encode-program-and-skip-chars-strings',
+which see.
Deals with the case where ASCII and another character set can both be
encoded unambiguously and completely into the coding-system; if this is so,
-returns a list corresponding to such a ccl-program. If not, it returns nil. "
+returns a list comprised of such a ccl-program and the character set in
+question. If not, it returns a list with both entries nil."
(let ((tentative-encode-program-parts
(eval-when-compile
(let* ((vec-len 128)
@@ -337,11 +339,11 @@ message--it will not work. ")
(append other-charset-vector nil)
(copy-tree (second
tentative-encode-program-parts))))))
- encode-program))
-
-(defun make-8-bit-generate-encode-program (decode-table encode-table
- encode-failure-octet)
- "Generate a CCL program to decode a 8-bit fixed-width charset.
+ (values encode-program worth-trying)))
+
+(defun make-8-bit-generate-encode-program-and-skip-chars-strings
+ (decode-table encode-table encode-failure-octet)
+ "Generate a CCL program to encode a 8-bit fixed-width charset.
DECODE-TABLE must have 256 non-cons entries, and will be regarded as
describing a map from the octet corresponding to an offset in the
@@ -399,7 +401,13 @@ in compiled CCL code.\nIf that is not th
in compiled CCL code.\nIf that is not the case, and it appears not to
be--that's why you're getting this message--it will not work. ")
prog)))
- (ascii-encodes-as-itself nil))
+ (ascii-encodes-as-itself nil)
+ (control-1-encodes-as-itself t)
+ (invalid-sequence-code-point-start
+ (eval-when-compile
+ (char-to-unicode
+ (aref (decode-coding-string "\xd8\x00\x00\x00" 'utf-16-be) 3))))
+ further-char-set skip-chars invalid-sequences-skip-chars)
;; Is this coding system ASCII-compatible? If so, we can avoid the hash
;; table lookup for those characters.
@@ -418,17 +426,18 @@ be--that's why you're getting this messa
;; slow, a hash table lookup + mule-unicode conversion is done
;; for every character encoding.
(setq encode-program general-encode-program)
- (setq encode-program
- ;; Encode program with ascii-ascii mapping (based on a
- ;; character's mule character set), and one other mule
- ;; character set using table-based encoding, other
- ;; character sets using hash table lookups.
- ;; make-8-bit-non-ascii-completely-coveredp only returns
- ;; such a mapping if some non-ASCII charset with
- ;; characters in decode-table is entirely covered by
- ;; encode-table.
- (make-8-bit-generate-helper decode-table encode-table
- encode-failure-octet))
+ (multiple-value-setq
+ (encode-program further-char-set)
+ ;; Encode program with ascii-ascii mapping (based on a
+ ;; character's mule character set), and one other mule
+ ;; character set using table-based encoding, other
+ ;; character sets using hash table lookups.
+ ;; make-8-bit-non-ascii-completely-coveredp only returns
+ ;; such a mapping if some non-ASCII charset with
+ ;; characters in decode-table is entirely covered by
+ ;; encode-table.
+ (make-8-bit-generate-helper decode-table encode-table
+ encode-failure-octet))
(unless encode-program
;; If make-8-bit-non-ascii-completely-coveredp returned nil,
;; but ASCII still encodes as itself, do one-to-one mapping
@@ -441,7 +450,66 @@ be--that's why you're getting this messa
(logior (lsh encode-failure-octet 8)
#x14)))
(copy-tree encode-program)))
- encode-program))
+ (loop
+ for i from #x80 to #x9f
+ do (unless (= i (aref decode-table i))
+ (setq control-1-encodes-as-itself nil)
+ (return)))
+ (loop
+ for i from #x00 to #xFF
+ initially (setq skip-chars
+ (cond
+ ((and ascii-encodes-as-itself
+ control-1-encodes-as-itself further-char-set)
+ (concat "\x00-\x9f" (charset-skip-chars-string
+ further-char-set)))
+ ((and ascii-encodes-as-itself
+ control-1-encodes-as-itself)
+ "\x00-\x9f")
+ ((null ascii-encodes-as-itself)
+ (skip-chars-quote (apply #'string
+ (append decode-table nil))))
+ (further-char-set
+ (concat (charset-skip-chars-string 'ascii)
+ (charset-skip-chars-string further-char-set)))
+ (t
+ (charset-skip-chars-string 'ascii)))
+ invalid-sequences-skip-chars "")
+ with decoded-ucs = nil
+ with decoded = nil
+ with no-ascii-transparency-skip-chars-list =
+ (unless ascii-encodes-as-itself (append decode-table nil))
+ ;; Can't use #'match-string here, see:
+ ;; http://mid.gmane.org/18829.34118.709782.704574@parhasard.net
+ with skip-chars-test =
+ #'(lambda (skip-chars-string testing)
+ (with-temp-buffer
+ (insert testing)
+ (goto-char (point-min))
+ (skip-chars-forward skip-chars-string)
+ (= (point) (point-max))))
+ do
+ (setq decoded (aref decode-table i)
+ decoded-ucs (char-to-unicode decoded))
+ (cond
+ ((<= invalid-sequence-code-point-start decoded-ucs
+ (+ invalid-sequence-code-point-start #xFF))
+ (setq invalid-sequences-skip-chars
+ (concat (string decoded)
+ invalid-sequences-skip-chars))
+ (assert (not (funcall skip-chars-test skip-chars decoded))
+ "This char should only be skipped with \
+`invalid-sequences-skip-chars', not by `skip-chars'"))
+ ((not (funcall skip-chars-test skip-chars decoded))
+ (if ascii-encodes-as-itself
+ (setq skip-chars (concat skip-chars (string decoded)))
+ (push decoded no-ascii-transparency-skip-chars-list))))
+ finally (unless ascii-encodes-as-itself
+ (setq skip-chars
+ (skip-chars-quote
+ (apply #'string
+ no-ascii-transparency-skip-chars-list)))))
+ (values encode-program skip-chars invalid-sequences-skip-chars)))
(defun make-8-bit-create-decode-encode-tables (unicode-map)
"Return a list \(DECODE-TABLE ENCODE-TABLE) given UNICODE-MAP.
@@ -453,7 +521,11 @@ to 256 distinct characters. "
(let ((decode-table (make-vector 256 nil))
(encode-table (make-hash-table :size 256))
(private-use-start (encode-char make-8-bit-private-use-start 'ucs))
- desired-ucs)
+ (invalid-sequence-code-point-start
+ (eval-when-compile
+ (char-to-unicode
+ (aref (decode-coding-string "\xd8\x00\x00\x00" 'utf-16-be) 3))))
+ desired-ucs decode-table-entry)
(loop for (external internal)
in unicode-map
@@ -475,24 +547,51 @@ most of them, at run time. ")
(int-to-char external)
encode-table))
- ;; Now, go through the decode table looking at the characters that
- ;; remain nil. If the XEmacs character with that integer is already in
- ;; the encode table, map the on-disk octet to a Unicode private use
- ;; character. Otherwise map the on-disk octet to the XEmacs character
- ;; with that numeric value, to make it clearer what it is.
+ ;; Now, go through the decode table. For octet values above #x7f, if the
+ ;; decode table entry is nil, this means that they have an undefined
+ ;; mapping (= they map to XEmacs characters with keys in
+ ;; unicode-error-default-translation-table); for octet values below or
+ ;; equal to #x7f, it means that they map to ASCII.
+
+ ;; If any entry (whether below or above #x7f) in the decode-table
+ ;; already maps to some character with a key in
+ ;; unicode-error-default-translation-table, it is treated as an
+ ;; undefined octet by `query-coding-region'. That is, it is not
+ ;; necessary for an octet value to be above #x7f for this to happen.
+
(dotimes (i 256)
- (when (null (aref decode-table i))
- ;; Find a free code point.
- (setq desired-ucs i)
- (while (gethash desired-ucs encode-table)
- ;; In the normal case, the code point chosen will be U+E0XY, where
- ;; XY is the hexadecimal octet on disk. In pathological cases
- ;; it'll be something else.
- (setq desired-ucs (+ private-use-start desired-ucs)
- private-use-start (+ private-use-start 1)))
- (puthash desired-ucs (int-to-char i) encode-table)
+ (setq decode-table-entry (aref decode-table i))
+ (if decode-table-entry
+ (when (get-char-table
+ decode-table-entry
+ unicode-error-default-translation-table)
+ ;; The caller is explicitly specifying that this octet
+ ;; corresponds to an invalid sequence on disk:
+ (assert (= (get-char-table
+ decode-table-entry
+ unicode-error-default-translation-table) i)
+ "Bad argument to `make-8-bit-coding-system'.
+If you're going to designate an octet with value below #x80 as invalid
+for this coding system, make sure to map it to the invalid sequence
+character corresponding to its octet value on disk. "))
+
+ ;; decode-table-entry is nil; either the octet is to be treated as
+ ;; contributing to an error sequence (when (> #x7f i)), or it should
+ ;; be attempted to treat it as ASCII-equivalent.
+ (setq desired-ucs (or (and (< i #x80) i)
+ (+ invalid-sequence-code-point-start i)))
+ (while (gethash desired-ucs encode-table)
+ (assert (not (< i #x80))
+ "UCS code point should not already be in encode-table!"
+ ;; There is one invalid sequence char per octet value;
+ ;; with eight-bit-fixed coding systems, it makes no sense
+ ;; for us to be multiply allocating them.
+ (gethash desired-ucs encode-table))
+ (setq desired-ucs (+ private-use-start desired-ucs)
+ private-use-start (+ private-use-start 1)))
+ (puthash desired-ucs (int-to-char i) encode-table)
(setq desired-ucs (if (> desired-ucs #xFF)
- (decode-char 'ucs desired-ucs)
+ (unicode-to-char desired-ucs)
;; So we get Latin-1 when run at dump time,
;; instead of JIT-allocated characters.
(int-to-char desired-ucs)))
@@ -546,8 +645,9 @@ disk to XEmacs characters for some fixed
(return-from category 'no-conversion))
finally return 'iso-8-1))
-(defun 8-bit-fixed-query-coding-region (begin end coding-system
- &optional buffer errorp highlightp)
+(defun 8-bit-fixed-query-coding-region (begin end coding-system &optional
+ buffer ignore-invalid-sequencesp
+ errorp highlightp)
"The `query-coding-region' implementation for 8-bit-fixed coding systems.
Uses the `8-bit-fixed-query-from-unicode' and `8-bit-fixed-query-skip-chars'
@@ -570,65 +670,79 @@ See that the documentation of `query-cod
(or (coding-system-get coding-system '8-bit-fixed-query-skip-chars)
(coding-system-get (coding-system-base coding-system)
'8-bit-fixed-query-skip-chars)))
+ (invalid-sequences-skip-chars
+ (or (coding-system-get coding-system
+ '8-bit-fixed-invalid-sequences-skip-chars)
+ (coding-system-get (coding-system-base coding-system)
+ '8-bit-fixed-invalid-sequences-skip-chars)))
(ranges (make-range-table))
+ (case-fold-search nil)
char-after fail-range-start fail-range-end previous-fail extent
- failed)
+ failed invalid-sequences-looking-at failed-reason
+ previous-failed-reason)
(check-type from-unicode hash-table)
(check-type skip-chars-arg string)
+ (check-type invalid-sequences-skip-chars string)
+ (setq invalid-sequences-looking-at
+ (if (equal "" invalid-sequences-skip-chars)
+ ;; Regexp that will never match.
+ #r".\{0,0\}"
+ (concat "[" invalid-sequences-skip-chars "]")))
+ (when ignore-invalid-sequencesp
+ (setq skip-chars-arg
+ (concat skip-chars-arg invalid-sequences-skip-chars)))
(save-excursion
(when highlightp
- (map-extents #'(lambda (extent ignored-arg)
- (when (eq 'query-coding-warning-face
- (extent-face extent))
- (delete-extent extent))) buffer begin end))
+ (query-coding-clear-highlights begin end buffer))
(goto-char begin buffer)
(skip-chars-forward skip-chars-arg end buffer)
(while (< (point buffer) end)
- ; (message
- ; "fail-range-start is %S, previous-fail %S, point is %S, end is %S"
- ; fail-range-start previous-fail (point buffer) end)
(setq char-after (char-after (point buffer) buffer)
fail-range-start (point buffer))
- ; (message "arguments are %S %S"
- ; (< (point buffer) end)
- ; (not (gethash (encode-char char-after 'ucs) from-unicode)))
(while (and
(< (point buffer) end)
- (not (gethash (encode-char char-after 'ucs) from-unicode)))
+ (or (and
+ (not (gethash (encode-char char-after 'ucs) from-unicode))
+ (setq failed-reason 'unencodable))
+ (and (not ignore-invalid-sequencesp)
+ (looking-at invalid-sequences-looking-at buffer)
+ (setq failed-reason 'invalid-sequence)))
+ (or (null previous-failed-reason)
+ (eq previous-failed-reason failed-reason)))
(forward-char 1 buffer)
(setq char-after (char-after (point buffer) buffer)
- failed t))
+ failed t
+ previous-failed-reason failed-reason))
(if (= fail-range-start (point buffer))
;; The character can actually be encoded by the coding
;; system; check the characters past it.
(forward-char 1 buffer)
;; The character actually failed.
- ; (message "past the move through, point now %S" (point buffer))
(when errorp
(error 'text-conversion-error
(format "Cannot encode %s using coding system"
(buffer-substring fail-range-start (point buffer)
buffer))
(coding-system-name coding-system)))
+ (assert (not (null previous-failed-reason)) t
+ "previous-failed-reason should always be non-nil here")
(put-range-table fail-range-start
;; If char-after is non-nil, we're not at
;; the end of the buffer.
(setq fail-range-end (if char-after
(point buffer)
(point-max buffer)))
- t ranges)
+ previous-failed-reason ranges)
+ (setq previous-failed-reason nil)
(when highlightp
- ; (message "highlighting")
(setq extent (make-extent fail-range-start fail-range-end buffer))
(set-extent-priority extent (+ mouse-highlight-priority 2))
(set-extent-face extent 'query-coding-warning-face))
(skip-chars-forward skip-chars-arg end buffer)))
- ; (message "about to give the result, ranges %S" ranges)
(if failed
(values nil ranges)
(values t nil)))))
-;;;###autoload
(defun make-8-bit-coding-system (name unicode-map &optional description props)
"Make and return a fixed-width 8-bit CCL coding system named NAME.
NAME must be a symbol, and UNICODE-MAP a list.
@@ -644,12 +758,20 @@ character sets will not be distinct when
character sets will not be distinct when written to disk, which is
less often what is intended.
-Any octets not mapped will be decoded into the ISO 8859-1 characters with
-the corresponding numeric value; unless another octet maps to that
-character, in which case the Unicode private use area will be used. This
-avoids spurious changes to files on disk when they contain octets that would
-be otherwise remapped to the canonical values for the corresponding
-characters in the coding system.
+Any octets not mapped, and with values above #x7f, will be decoded into
+XEmacs characters that reflect that their values are undefined. These
+characters will be displayed in a language-environment-specific way. See
+`unicode-error-default-translation-table' and the
+`invalid-sequence-coding-system' argument to `set-language-info'.
+
+These characters will normally be treated as invalid when checking whether
+text can be encoded with `query-coding-region'--see the
+IGNORE-INVALID-SEQUENCESP argument to that function to avoid this. It is
+possible to specify that octets with values less than #x80 (or indeed
+greater than it) be treated in this way, by specifying explicitly that they
+correspond to the character mapping to that octet in
+`unicode-error-default-translation-table'. Far fewer coding systems
+override the ASCII mapping, though, so this is not the default.
DESCRIPTION and PROPS are as in `make-coding-system', which see. This
function also accepts two additional (optional) properties in PROPS;
@@ -668,7 +790,8 @@ the code for tilde `~'. "
(char-to-int ?~)))
(aliases (plist-get props 'aliases))
(hash-table-sym (gentemp (format "%s-encode-table" name)))
- encode-program decode-program result decode-table encode-table)
+ encode-program decode-program result decode-table encode-table
+ skip-chars invalid-sequences-skip-chars)
;; Some more sanity checking.
(check-argument-range encode-failure-octet 0 #xFF)
@@ -685,10 +808,13 @@ the code for tilde `~'. "
;; Register the decode-table.
(define-translation-hash-table hash-table-sym encode-table)
- ;; Generate the programs.
- (setq decode-program (make-8-bit-generate-decode-program decode-table)
- encode-program (make-8-bit-generate-encode-program
- decode-table encode-table encode-failure-octet))
+ ;; Generate the programs and skip-chars strings.
+ (setq decode-program (make-8-bit-generate-decode-program decode-table))
+ (multiple-value-setq
+ (encode-program skip-chars invalid-sequences-skip-chars)
+ (make-8-bit-generate-encode-program-and-skip-chars-strings
+ decode-table encode-table encode-failure-octet))
+
(unless (vectorp encode-program)
(setq encode-program
(apply #'vector
@@ -709,10 +835,10 @@ the code for tilde `~'. "
(coding-system-put name 'category
(make-8-bit-choose-category decode-table))
(coding-system-put name '8-bit-fixed-query-skip-chars
- (skip-chars-quote
- (apply #'string (append decode-table nil))))
+ skip-chars)
+ (coding-system-put name '8-bit-fixed-invalid-sequences-skip-chars
+ invalid-sequences-skip-chars)
(coding-system-put name '8-bit-fixed-query-from-unicode encode-table)
-
(coding-system-put name 'query-coding-function
#'8-bit-fixed-query-coding-region)
(coding-system-put (intern (format "%s-unix" name))
@@ -751,7 +877,8 @@ the code for tilde `~'. "
(or (plist-get props 'encode-failure-octet) (char-to-int ?~)))
(aliases (plist-get props 'aliases))
encode-program decode-program
- decode-table encode-table)
+ decode-table encode-table
+ skip-chars invalid-sequences-skip-chars)
;; Some sanity checking.
(check-argument-range encode-failure-octet 0 #xFF)
@@ -761,25 +888,21 @@ the code for tilde `~'. "
(setq props (plist-remprop props 'encode-failure-octet)
props (plist-remprop props 'aliases))
- ;; Work out encode-table and decode-table.
+ ;; Work out encode-table and decode-table
(multiple-value-setq
- (decode-table encode-table)
- (make-8-bit-create-decode-encode-tables unicode-map))
-
- ;; Generate the decode and encode programs.
- (setq decode-program (make-8-bit-generate-decode-program decode-table)
- encode-program (make-8-bit-generate-encode-program
- decode-table encode-table encode-failure-octet))
+ (decode-table encode-table)
+ (make-8-bit-create-decode-encode-tables unicode-map))
+
+ ;; Generate the decode and encode programs, and the skip-chars
+ ;; arguments.
+ (setq decode-program (make-8-bit-generate-decode-program decode-table))
+ (multiple-value-setq
+ (encode-program skip-chars invalid-sequences-skip-chars)
+ (make-8-bit-generate-encode-program-and-skip-chars-strings
+ decode-table encode-table encode-failure-octet))
;; And return the generated code.
`(let ((encode-table-sym (gentemp (format "%s-encode-table"
',name)))
- ;; The case-fold-search bind shouldn't be necessary. If I take
- ;; it, out, though, I get:
- ;;
- ;; (invalid-read-syntax "Multiply defined symbol label" 1)
- ;;
- ;; when the file is byte compiled.
- (case-fold-search t)
(encode-table ,encode-table))
(define-translation-hash-table encode-table-sym encode-table)
(make-coding-system
@@ -797,8 +920,9 @@ the code for tilde `~'. "
(coding-system-put ',name 'category
',(make-8-bit-choose-category decode-table))
(coding-system-put ',name '8-bit-fixed-query-skip-chars
- ',(skip-chars-quote
- (apply #'string (append decode-table nil))))
+ ,skip-chars)
+ (coding-system-put ',name '8-bit-fixed-invalid-sequences-skip-chars
+ ,invalid-sequences-skip-chars)
(coding-system-put ',name '8-bit-fixed-query-from-unicode encode-table)
(coding-system-put ',name 'query-coding-function
#'8-bit-fixed-query-coding-region)
@@ -819,7 +943,9 @@ the code for tilde `~'. "
;; Ideally this would be in latin.el, but code-init.el uses it.
(make-8-bit-coding-system
'iso-8859-1
- '() ;; No differences from Latin 1.
+ (loop
+ for i from #x80 to #xff
+ collect (list i (int-char i))) ;; Identical to Latin-1.
"ISO-8859-1 (Latin-1)"
'(mnemonic "Latin 1"
documentation "The most used encoding of Western Europe and the Americas."
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/mule/vietnamese.el
--- a/lisp/mule/vietnamese.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/mule/vietnamese.el Sat Feb 07 17:13:37 2009 +0000
@@ -26,7 +26,7 @@
;;; Commentary:
-;; For Vietnames, the character sets VISCII and VSCII are supported.
+;; For Vietnamese, the character sets VISCII and VSCII are supported.
;;; Code:
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f lisp/unicode.el
--- a/lisp/unicode.el Thu Feb 05 21:18:37 2009 -0500
+++ b/lisp/unicode.el Sat Feb 07 17:13:37 2009 +0000
@@ -617,38 +617,69 @@ mapping from the error sequences to the
"Used by `unicode-query-coding-region' to skip chars with known
mappings.")
(defun unicode-query-coding-region (begin end coding-system
- &optional buffer errorp highlightp)
- "The `query-coding-region' implementation for Unicode coding systems."
+ &optional buffer ignore-invalid-sequencesp
+ errorp highlightp)
+ "The `query-coding-region' implementation for Unicode coding systems.
+
+Supports IGNORE-INVALID-SEQUENCESP, that is, XEmacs characters that reflect
+invalid octets on disk will be treated as encodable if this argument is
+specified, and as not encodable if it is not specified."
+
+ ;; Potential problem here; the octets that correspond to octets from #x00
+ ;; to #x7f on disk will be treated by utf-8 and utf-7 as invalid
+ ;; sequences, and thus, in theory, encodable.
+
(check-argument-type #'coding-system-p
(setq coding-system (find-coding-system coding-system)))
(check-argument-type #'integer-or-marker-p begin)
(check-argument-type #'integer-or-marker-p end)
- (let* ((skip-chars-arg unicode-query-coding-skip-chars-arg)
+ (let* ((skip-chars-arg (concat unicode-query-coding-skip-chars-arg
+ (if ignore-invalid-sequencesp
+ unicode-invalid-sequence-regexp-range
+ "")))
(ranges (make-range-table))
(looking-at-arg (concat "[" skip-chars-arg "]"))
+ (case-fold-search nil)
fail-range-start fail-range-end char-after failed
- extent)
+ extent char-unicode invalid-sequence-p failed-reason
+ previous-failed-reason)
(save-excursion
(when highlightp
- (map-extents #'(lambda (extent ignored-arg)
- (when (eq 'query-coding-warning-face
- (extent-face extent))
- (delete-extent extent))) buffer begin end))
+ (query-coding-clear-highlights begin end buffer))
(goto-char begin buffer)
(skip-chars-forward skip-chars-arg end buffer)
(while (< (point buffer) end)
-; (message
-; "fail-range-start is %S, point is %S, end is %S"
-; fail-range-start (point buffer) end)
(setq char-after (char-after (point buffer) buffer)
fail-range-start (point buffer))
(while (and
(< (point buffer) end)
(not (looking-at looking-at-arg))
- (= -1 (char-to-unicode char-after)))
+ (or (and
+ (= -1 (setq char-unicode (char-to-unicode char-after)))
+ (setq failed-reason 'unencodable))
+ (and (not ignore-invalid-sequencesp)
+ ;; The default case, with ignore-invalid-sequencesp
+ ;; not specified:
+ ;; If the character is in the Unicode range that
+ ;; corresponds to an invalid octet, we want to
+ ;; treat it as unencodable.
+ (<= (eval-when-compile
+ (char-to-unicode
+ (aref (decode-coding-string "\xd8\x00\x00\x00"
+ 'utf-16-be) 3)))
+ char-unicode)
+ (<= char-unicode
+ (eval-when-compile
+ (char-to-unicode
+ (aref (decode-coding-string "\xd8\x00\x00\xFF"
+ 'utf-16-be) 3))))
+ (setq failed-reason 'invalid-sequence)))
+ (or (null previous-failed-reason)
+ (eq previous-failed-reason failed-reason)))
(forward-char 1 buffer)
(setq char-after (char-after (point buffer) buffer)
- failed t))
+ failed t
+ previous-failed-reason failed-reason))
(if (= fail-range-start (point buffer))
;; The character can actually be encoded by the coding
;; system; check the characters past it.
@@ -660,13 +691,17 @@ mapping from the error sequences to the
(buffer-substring fail-range-start (point buffer)
buffer))
(coding-system-name coding-system)))
+ (assert
+ (not (null previous-failed-reason)) t
+ "If we've got here, previous-failed-reason should be non-nil.")
(put-range-table fail-range-start
;; If char-after is non-nil, we're not at
;; the end of the buffer.
(setq fail-range-end (if char-after
(point buffer)
(point-max buffer)))
- t ranges)
+ previous-failed-reason ranges)
+ (setq previous-failed-reason nil)
(when highlightp
(setq extent (make-extent fail-range-start fail-range-end buffer))
(set-extent-priority extent (+ mouse-highlight-priority 2))
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f tests/ChangeLog
--- a/tests/ChangeLog Thu Feb 05 21:18:37 2009 -0500
+++ b/tests/ChangeLog Sat Feb 07 17:13:37 2009 +0000
@@ -1,3 +1,11 @@ 2009-01-31 Aidan Kehoe <kehoea@parhasa
+2009-02-07 Aidan Kehoe <kehoea(a)parhasard.net>
+
+ * automated/query-coding-tests.el:
+ Add FAILING-CASE arguments to the Assert calls, making #'q-c-debug
+ mostly unnecessary. Remove #'q-c-debug.
+ Add new tests that use the IGNORE-INVALID-SEQUENCESP argument to
+ #'query-coding-region; rework the existing ones to respect it.
+
2009-01-31 Aidan Kehoe <kehoea(a)parhasard.net>
* automated/mule-tests.el:
diff -r 202cb69c4d87c4d04b06676366e8d57a60149691 -r e0a8715fdb1fbeeee7cad2fe6d5d44fe1e44455f tests/automated/query-coding-tests.el
--- a/tests/automated/query-coding-tests.el Thu Feb 05 21:18:37 2009 -0500
+++ b/tests/automated/query-coding-tests.el Sat Feb 07 17:13:37 2009 +0000
@@ -30,28 +30,6 @@
;; some well-known coding systems.
(require 'bytecomp)
-
-(defun q-c-debug (&rest aerger)
- (let ((standard-output (get-buffer-create "query-coding-debug"))
- (fmt (condition-case nil
- (and (stringp (first aerger))
- (apply #'format aerger))
- (error nil))))
- (if fmt
- (progn
- (princ (apply #'format aerger))
- (terpri))
- (princ "--> ")
- (let ((i 1))
- (dolist (sgra aerger)
- (if (> i 1) (princ " "))
- (princ (format "%d. " i))
- (prin1 sgra)
- (incf i))
- (terpri)))))
-
-;; Comment this out if debugging:
-(defalias 'q-c-debug #'ignore)
(when (featurep 'mule)
(let ((ascii-chars-string (apply #'string
@@ -64,7 +42,7 @@
(with-temp-buffer
(insert ascii-chars-string)
;; First, check all the coding systems that are ASCII-transparent for
- ;; ASCII-transparency in the check.
+ ;; ASCII-transparency in query-coding-region.
(dolist (coding-system
(delete-duplicates
(mapcar #'(lambda (coding-system)
@@ -87,76 +65,142 @@
unix-coding-system)))
(coding-system-list nil))
:test #'eq))
- (q-c-debug "looking at coding system %S" (coding-system-name
- coding-system))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) coding-system)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert (eq t query-coding-succeeded)
+ (format "checking query-coding-region ASCII-transparency, %s"
+ coding-system))
+ (Assert (null query-coding-table)
+ (format "checking query-coding-region ASCII-transparency, %s"
+ coding-system)))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-string ascii-chars-string coding-system)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table))))
+ (Assert (eq t query-coding-succeeded)
+ (format "checking query-coding-string ASCII-transparency, %s"
+ coding-system))
+ (Assert (null query-coding-table)
+ (format "checking query-coding-string ASCII-transparency, %s"
+ coding-system))))
(delete-region (point-min) (point-max))
;; Check for success from the two Latin-1 coding systems
(insert latin-1-chars-string)
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'iso-8859-1-unix)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert (eq t query-coding-succeeded)
+ "checking query-coding-region iso-8859-1-transparency")
+ (Assert (null query-coding-table)
+ "checking query-coding-region iso-8859-1-transparency"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-string (buffer-string) 'iso-8859-1-unix)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert (eq t query-coding-succeeded)
+ "checking query-coding-string iso-8859-1-transparency")
+ (Assert (null query-coding-table)
+ "checking query-coding-string iso-8859-1-transparency"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-string (buffer-string) 'iso-latin-1-with-esc-unix)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert
+ (eq t query-coding-succeeded)
+ "checking query-coding-string iso-latin-1-with-esc-transparency")
+ (Assert
+ (null query-coding-table)
+ "checking query-coding-string iso-latin-1-with-esc-transparency"))
;; Make it fail, check that it fails correctly
(insert (decode-char 'ucs #x20AC)) ;; EURO SIGN
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'iso-8859-1-unix)
- (Assert (null query-coding-succeeded))
- (Assert (equal query-coding-table
- #s(range-table type start-closed-end-open data
- ((257 258) t)))))
+ (Assert
+ (null query-coding-succeeded)
+ "checking that query-coding-region fails, U+20AC, iso-8859-1")
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open data
+ ((257 258) unencodable)))
+ "checking query-coding-region fails correctly, U+20AC, iso-8859-1"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max)
'iso-latin-1-with-esc-unix)
;; Stupidly, this succeeds. The behaviour is compatible with
;; GNU, though, and we encourage people not to use
;; iso-latin-1-with-esc-unix anyway:
- (Assert query-coding-succeeded)
- (Assert (null query-coding-table)))
+ (Assert
+ query-coding-succeeded
+ "checking that query-coding-region succeeds, U+20AC, \
+iso-latin-1-with-esc-unix")
+ (Assert
+ (null query-coding-table)
+ "checking that query-coding-region succeeds, U+20AC, \
+iso-latin-1-with-esc-unix"))
;; Check that it errors correctly.
(setq text-conversion-error-signalled nil)
(condition-case nil
- (query-coding-region (point-min) (point-max) 'iso-8859-1-unix nil t)
+ (query-coding-region (point-min) (point-max) 'iso-8859-1-unix
+ (current-buffer) nil t)
(text-conversion-error
(setq text-conversion-error-signalled t)))
- (Assert text-conversion-error-signalled)
+ (Assert
+ text-conversion-error-signalled
+ "checking query-coding-region signals text-conversion-error correctly")
(setq text-conversion-error-signalled nil)
(condition-case nil
(query-coding-region (point-min) (point-max)
- 'iso-latin-1-with-esc-unix nil t)
+ 'iso-latin-1-with-esc-unix nil nil t)
(text-conversion-error
(setq text-conversion-error-signalled t)))
- (Assert (null text-conversion-error-signalled))
+ (Assert
+ (null text-conversion-error-signalled)
+ "checking query-coding-region doesn't signal text-conversion-error")
(delete-region (point-min) (point-max))
(insert latin-1-chars-string)
(decode-coding-region (point-min) (point-max) 'windows-1252-unix)
(goto-char (point-max)) ;; #'decode-coding-region just messed up point.
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'windows-1252-unix)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert
+ (null query-coding-succeeded)
+ "check query-coding-region fails, windows-1252, invalid-sequences")
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open
+ data ((130 131) invalid-sequence
+ (142 143) invalid-sequence
+ (144 146) invalid-sequence
+ (158 159) invalid-sequence)))
+ "check query-coding-region fails, windows-1252, invalid-sequences"))
+ (multiple-value-bind (query-coding-succeeded query-coding-table)
+ (query-coding-region (point-min) (point-max) 'windows-1252-unix
+ (current-buffer) t)
+ (Assert
+ (eq t query-coding-succeeded)
+ "checking that query-coding-region succeeds, U+20AC, windows-1252")
+ (Assert
+ (null query-coding-table)
+ "checking that query-coding-region succeeds, U+20AC, windows-1252"))
(insert ?\x80)
(multiple-value-bind (query-coding-succeeded query-coding-table)
+ (query-coding-region (point-min) (point-max) 'windows-1252-unix
+ (current-buffer) t)
+ (Assert
+ (null query-coding-succeeded)
+ "checking that query-coding-region fails, U+0080, windows-1252")
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open data
+ ((257 258) unencodable)))
+ "checking that query-coding-region fails, U+0080, windows-1252"))
+ (multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'windows-1252-unix)
- (Assert (null query-coding-succeeded))
- (Assert (equal query-coding-table
- #s(range-table type start-closed-end-open data
- ((257 258) t)))))
+ (Assert
+ (null query-coding-succeeded)
+ "check query-coding-region fails, U+0080, invalid-sequence, cp1252")
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open
+ data ((130 131) invalid-sequence
+ (142 143) invalid-sequence
+ (144 146) invalid-sequence
+ (158 159) invalid-sequence
+ (257 258) unencodable)))
+ "check query-coding-region fails, U+0080, invalid-sequence, cp1252"))
;; Try a similar approach with koi8-o, the koi8 variant with
;; support for Old Church Slavonic.
(delete-region (point-min) (point-max))
@@ -164,29 +208,53 @@
(decode-coding-region (point-min) (point-max) 'koi8-o-unix)
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'koi8-o-unix)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert
+ (eq t query-coding-succeeded)
+ "checking that query-coding-region succeeds, koi8-o-unix")
+ (Assert
+ (null query-coding-table)
+ "checking that query-coding-region succeeds, koi8-o-unix"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'escape-quoted)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert (eq t query-coding-succeeded)
+ "checking that query-coding-region succeeds, escape-quoted")
+ (Assert (null query-coding-table)
+ "checking that query-coding-region succeeds, escape-quoted"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'windows-1252-unix)
- (Assert (null query-coding-succeeded))
- (Assert (equal query-coding-table
- #s(range-table type start-closed-end-open
- data ((129 131) t (132 133) t (139 140) t
- (141 146) t (155 156) t (157 161) t
- (162 170) t (173 176) t (178 187) t
- (189 192) t (193 257) t)))))
+ (Assert
+ (null query-coding-succeeded)
+ "checking that query-coding-region fails, windows-1252 and Cyrillic")
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open
+ data ((129 131) unencodable
+ (132 133) unencodable
+ (139 140) unencodable
+ (141 146) unencodable
+ (155 156) unencodable
+ (157 161) unencodable
+ (162 170) unencodable
+ (173 176) unencodable
+ (178 187) unencodable
+ (189 192) unencodable
+ (193 257) unencodable)))
+ "checking that query-coding-region fails, windows-1252 and Cyrillic"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) 'koi8-r-unix)
- (Assert (null query-coding-succeeded))
- (Assert (equal query-coding-table
- #s(range-table type start-closed-end-open
- data ((129 154) t (155 161) t (162 164) t
- (165 177) t (178 180) t
- (181 192) t)))))
+ (Assert
+ (null query-coding-succeeded)
+ "checking that query-coding-region fails, koi8-r and OCS characters")
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open
+ data ((129 154) unencodable
+ (155 161) unencodable
+ (162 164) unencodable
+ (165 177) unencodable
+ (178 180) unencodable
+ (181 192) unencodable)))
+ "checking that query-coding-region fails, koi8-r and OCS characters"))
;; Check that the Unicode coding systems handle characters
;; without Unicode mappings.
(delete-region (point-min) (point-max))
@@ -210,19 +278,29 @@
utf-16-little-endian-bom))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) (point-max) coding-system)
- (Assert (null query-coding-succeeded))
+ (Assert (null query-coding-succeeded)
+ "checking unicode coding systems fail with unmapped chars")
(Assert (equal query-coding-table
#s(range-table type start-closed-end-open data
- ((173 174) t (209 210) t
- (254 255) t)))))
+ ((173 174) unencodable
+ (209 210) unencodable
+ (254 255) unencodable)))
+ "checking unicode coding systems fail with unmapped chars"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region (point-min) 173 coding-system)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert (eq t query-coding-succeeded)
+ "checking unicode coding systems succeed sans unmapped chars")
+ (Assert
+ (null query-coding-table)
+ "checking unicode coding systems succeed sans unmapped chars"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region 174 209 coding-system)
- (Assert (eq t query-coding-succeeded))
- (Assert (null query-coding-table)))
+ (Assert
+ (eq t query-coding-succeeded)
+ "checking unicode coding systems succeed sans unmapped chars, again")
+ (Assert
+ (null query-coding-table)
+ "checking unicode coding systems succeed sans unmapped chars again"))
(multiple-value-bind (query-coding-succeeded query-coding-table)
(query-coding-region 210 254 coding-system)
(Assert (eq t query-coding-succeeded))
@@ -230,77 +308,143 @@
;; Check that it errors correctly.
(setq text-conversion-error-signalled nil)
(condition-case nil
- (query-coding-region (point-min) (point-max) coding-system nil t)
+ (query-coding-region (point-min) (point-max) coding-system
+ (current-buffer) nil t)
(text-conversion-error
(setq text-conversion-error-signalled t)))
- (Assert text-conversion-error-signalled)
+ (Assert text-conversion-error-signalled
+ "checking that unicode coding systems error correctly")
(setq text-conversion-error-signalled nil)
(condition-case nil
- (query-coding-region (point-min) 173 coding-system nil t)
+ (query-coding-region (point-min) 173 coding-system
+ (current-buffer)
+ nil t)
(text-conversion-error
(setq text-conversion-error-signalled t)))
- (Assert (null text-conversion-error-signalled)))
-
+ (Assert
+ (null text-conversion-error-signalled)
+ "checking that unicode coding systems do not error when unnecessary"))
+
+ (delete-region (point-min) (point-max))
+ (insert (decode-coding-string "\xff\xff\xff\xff"
+ 'greek-iso-8bit-with-esc))
+ (insert (decode-coding-string "\xff\xff\xff\xff" 'utf-8))
+ (insert (decode-coding-string "\xff\xff\xff\xff"
+ 'greek-iso-8bit-with-esc))
+ (dolist (coding-system '(utf-8 utf-16 utf-16-little-endian
+ utf-32 utf-32-little-endian))
+ (multiple-value-bind (query-coding-succeeded query-coding-table)
+ (query-coding-region (point-min) (point-max) coding-system)
+ (Assert (null query-coding-succeeded)
+ (format
+ "checking %s fails with unmapped chars and invalid seqs"
+ coding-system))
+ (Assert (equal query-coding-table
+ #s(range-table type start-closed-end-open
+ data ((1 5) unencodable
+ (5 9) invalid-sequence
+ (9 13) unencodable)))
+ (format
+ "checking %s fails with unmapped chars and invalid seqs"
+ coding-system)))
+ (multiple-value-bind (query-coding-succeeded query-coding-table)
+ (query-coding-region (point-min) (point-max) coding-system
+ (current-buffer) t)
+ (Assert (null query-coding-succeeded)
+ (format
+ "checking %s fails with unmapped chars sans invalid seqs"
+ coding-system))
+ (Assert
+ (equal query-coding-table
+ #s(range-table type start-closed-end-open
+ data ((1 5) unencodable
+ (9 13) unencodable)))
+ (format
+ "checking %s fails correctly, unmapped chars sans invalid seqs"
+ coding-system))))
;; Now to test #'encode-coding-char. Most of the functionality was
;; tested in the query-coding-region tests above, so we don't go into
;; as much detail here.
- (Assert (null (encode-coding-char
- (decode-char 'ucs #x20ac) 'iso-8859-1)))
- (Assert (equal "\x80" (encode-coding-char
- (decode-char 'ucs #x20ac) 'windows-1252)))
+ (Assert
+ (null (encode-coding-char
+ (decode-char 'ucs #x20ac) 'iso-8859-1))
+ "check #'encode-coding-char doesn't think iso-8859-1 handles U+20AC")
+ (Assert
+ (equal "\x80" (encode-coding-char
+ (decode-char 'ucs #x20ac) 'windows-1252))
+ "check #'encode-coding-char doesn't think windows-1252 handles U+0080")
(delete-region (point-min) (point-max))
;; And #'unencodable-char-position.
(insert latin-1-chars-string)
(insert (decode-char 'ucs #x20ac))
- (Assert (= 257 (unencodable-char-position (point-min) (point-max)
- 'iso-8859-1)))
- (Assert (equal '(257) (unencodable-char-position (point-min) (point-max)
- 'iso-8859-1 1)))
+ (Assert
+ (= 257 (unencodable-char-position (point-min) (point-max)
+ 'iso-8859-1))
+ "check #'unencodable-char-position doesn't think latin-1 encodes U+20AC")
+ (Assert
+ (equal '(257) (unencodable-char-position (point-min) (point-max)
+ 'iso-8859-1 1))
+ "check #'unencodable-char-position doesn't think latin-1 encodes U+20AC")
;; Compatiblity, sigh:
- (Assert (equal '(257) (unencodable-char-position (point-min) (point-max)
- 'iso-8859-1 0)))
+ (Assert
+ (equal '(257) (unencodable-char-position (point-min) (point-max)
+ 'iso-8859-1 0))
+ "check #'unencodable-char-position has some borked GNU semantics")
(dotimes (i 6) (insert (decode-char 'ucs #x20ac)))
;; Check if it stops at one:
(Assert (equal '(257) (unencodable-char-position (point-min) (point-max)
- 'iso-8859-1 1)))
+ 'iso-8859-1 1))
+ "check #'unencodable-char-position stops at 1 when asked to")
;; Check if it stops at four:
(Assert (equal '(260 259 258 257)
(unencodable-char-position (point-min) (point-max)
- 'iso-8859-1 4)))
+ 'iso-8859-1 4))
+ "check #'unencodable-char-position stops at 4 when asked to")
;; Check whether it stops at seven:
(Assert (equal '(263 262 261 260 259 258 257)
(unencodable-char-position (point-min) (point-max)
- 'iso-8859-1 7)))
+ 'iso-8859-1 7))
+ "check #'unencodable-char-position stops at 7 when asked to")
;; Check that it still stops at seven:
(Assert (equal '(263 262 261 260 259 258 257)
(unencodable-char-position (point-min) (point-max)
- 'iso-8859-1 2000)))
+ 'iso-8859-1 2000))
+ "check #'unencodable-char-position stops at 7 if 2000 asked for")
;; Now, #'check-coding-systems-region.
;; UTF-8 should certainly be able to encode these characters:
(Assert (eq t (check-coding-systems-region (point-min) (point-max)
- '(utf-8))))
- (Assert (equal '((iso-8859-1 257 258 259 260 261 262 263)
- (windows-1252 129 131 132 133 134 135 136 137 138 139
- 140 141 143 146 147 148 149 150 151 152
- 153 154 155 156 157 159 160))
- (sort
- (check-coding-systems-region (point-min) (point-max)
- '(utf-8 iso-8859-1
- windows-1252))
- ;; (The sort is to make the algorithm irrelevant.)
- #'(lambda (left right)
- (string< (car left) (car right))))))
+ '(utf-8)))
+ "check #'check-coding-systems-region gives t if encoding works")
+ (Assert
+ (equal '((iso-8859-1 257 258 259 260 261 262 263)
+ (windows-1252 129 130 131 132 133 134 135 136
+ 137 138 139 140 141 142 143 144
+ 145 146 147 148 149 150 151 152
+ 153 154 155 156 157 158 159 160))
+ (sort
+ (check-coding-systems-region (point-min) (point-max)
+ '(utf-8 iso-8859-1
+ windows-1252))
+ ;; (The sort is to make the algorithm irrelevant.)
+ #'(lambda (left right)
+ (string< (car left) (car right)))))
+ "check #'check-coding-systems-region behaves well given a list")
;; Ensure that the indices are all decreased by one when passed a
;; string:
- (Assert (equal '((iso-8859-1 256 257 258 259 260 261 262)
- (windows-1252 128 130 131 132 133 134 135 136 137 138
- 139 140 142 145 146 147 148 149 150 151
- 152 153 154 155 156 158 159))
- (sort
- (check-coding-systems-region (buffer-string) nil
- '(utf-8 iso-8859-1
- windows-1252))
- #'(lambda (left right)
- (string< (car left) (car right)))))))))
-
+ (Assert
+ (equal '((iso-8859-1 256 257 258 259 260 261 262)
+ (windows-1252 128 129 130 131 132 133 134 135
+ 136 137 138 139 140 141 142 143
+ 144 145 146 147 148 149 150 151
+ 152 153 154 155 156 157 158 159))
+ (sort
+ (check-coding-systems-region (buffer-string) nil
+ '(utf-8 iso-8859-1
+ windows-1252))
+ #'(lambda (left right)
+ (string< (car left) (car right)))))
+ "check #'check-coding-systems-region behaves given a string and list"))))
+
+
+