December 2008 - XEmacs-Patches - XEmacs Mailing List Archives

commit: Add #'query-coding-region tests; do the work necessary to get them running.

Aidan Kehoe

changeset:   4568:1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6
user:        Aidan Kehoe <kehoea(a)parhasard.net>
date:        Sun Dec 28 14:46:24 2008 +0000
files:       lisp/ChangeLog lisp/code-init.el lisp/coding.el lisp/mule/chinese.el lisp/mule/devanagari.el lisp/mule/hebrew.el lisp/mule/iso-with-esc.el lisp/mule/japanese.el lisp/mule/korean.el lisp/mule/latin.el lisp/mule/mule-coding.el lisp/mule/thai-xtis.el lisp/unicode.el src/ChangeLog src/coding-system-slots.h src/file-coding.c src/file-coding.h tests/ChangeLog tests/automated/query-coding-tests.el
description:
Add #'query-coding-region tests; do the work necessary to get them running.

lisp/ChangeLog addition:

2008-12-28  Aidan Kehoe  <kehoea(a)parhasard.net>

	* coding.el (default-query-coding-region):
	Declare using defun*, so we can #'return-from to it on
	encountering a safe-charsets value of t. Comment out a few
	debug messages.
	(query-coding-region):
	Correct the docstring, it deals with a region, not a string.
	(unencodable-char-position):
	Correct the implementation for non-nil COUNT, special-case a zero
	value for count, treat it as one. Don't rely on dynamic scope when
	calling the main lambda.
	* unicode.el (unicode-query-coding-region):
	Comment out some debug messages here.
	* mule/mule-coding.el (8-bit-fixed-query-coding-region):
	Comment out some debug messages here.

	* code-init.el (raw-text):
	Add a safe-charsets property to this coding system.
	* mule/korean.el (iso-2022-int-1):
	* mule/korean.el (euc-kr):
	* mule/korean.el (iso-2022-kr):
	Add safe-charsets properties for these coding systems.
	* mule/japanese.el (iso-2022-jp):
	* mule/japanese.el (jis7):
	* mule/japanese.el (jis8):
	* mule/japanese.el (shift-jis):
	* mule/japanese.el (iso-2022-jp-1978-irv):
	* mule/japanese.el (euc-jp):
	Add safe-charsets properties for all these coding systems.
	* mule/iso-with-esc.el:
	Add safe-charsets properties to all the coding systems in
	here. Comment on the downside of a safe-charsets value of t for
	iso-latin-1-with-esc.
	* mule/hebrew.el (ctext-hebrew):
	Add a safe-charsets property for this coding system.
	* mule/devanagari.el (in-is13194-devanagari):
	Add a safe-charsets property for this coding system.
	* mule/chinese.el (cn-gb-2312):
	* mule/chinese.el (hz-gb-2312):
	* mule/chinese.el (big5):
	Add safe-charsets properties for these coding systems.
	* mule/latin.el (iso-8859-14):
	Add an implementation for this, using #'make-8-bit-coding-system.
	* mule/mule-coding.el (ctext):
	* mule/mule-coding.el (iso-2022-8bit-ss2):
	* mule/mule-coding.el (iso-2022-7bit-ss2):
	* mule/mule-coding.el (iso-2022-jp-2):
	* mule/mule-coding.el (iso-2022-7bit):
	* mule/mule-coding.el (iso-2022-8):
	* mule/mule-coding.el (escape-quoted):
	* mule/mule-coding.el (iso-2022-lock):
	Add safe-charsets properties for all these coding systems.

src/ChangeLog addition:

2008-12-28  Aidan Kehoe  <kehoea(a)parhasard.net>

	* file-coding.c (Fmake_coding_system):
	Document our use of the safe-chars and safe-charsets properties,
	and the differences compared to GNU.
	(make_coding_system_1): Don't drop the safe-chars and
	safe-charsets properties.
	(Fcoding_system_property): Return the safe-chars and safe-charsets
	properties when asked for them.
	* file-coding.h (CODING_SYSTEM_SAFE_CHARSETS):
	* coding-system-slots.h:
	Make the safe-chars and safe-charsets slots available in these
	headers.

tests/ChangeLog addition:

2008-12-28  Aidan Kehoe  <kehoea(a)parhasard.net>

	* automated/query-coding-tests.el:
	New file, testing the functionality of #'query-coding-region and
	#'query-coding-string.

diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/ChangeLog --- a/lisp/ChangeLog Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/ChangeLog Sun Dec 28 14:46:24 2008 +0000 ＠＠ -1,3 +1,57 ＠＠ 2008-08-09 Aidan Kehoe <kehoea＠parhasa +2008-12-28 Aidan Kehoe <kehoea(a)parhasard.net> + + * coding.el (default-query-coding-region): + Declare using defun*, so we can #'return-from to it on + encountering a safe-charsets value of t.
Comment out a few + debug messages. + (query-coding-region): + Correct the docstring, it deals with a region, not a string. + (unencodable-char-position): + Correct the implementation for non-nil COUNT, special-case a zero + value for count, treat it as one. Don't rely on dynamic scope when + calling the main lambda. + * unicode.el (unicode-query-coding-region): + Comment out some debug messages here. + * mule/mule-coding.el (8-bit-fixed-query-coding-region): + Comment out some debug messages here. + + * code-init.el (raw-text): + Add a safe-charsets property to this coding system. + * mule/korean.el (iso-2022-int-1): + * mule/korean.el (euc-kr): + * mule/korean.el (iso-2022-kr): + Add safe-charsets properties for these coding systems. + * mule/japanese.el (iso-2022-jp): + * mule/japanese.el (jis7): + * mule/japanese.el (jis8): + * mule/japanese.el (shift-jis): + * mule/japanese.el (iso-2022-jp-1978-irv): + * mule/japanese.el (euc-jp): + Add safe-charsets properties for all these coding systems. + * mule/iso-with-esc.el: + Add safe-charsets properties to all the coding systems in + here. Comment on the downside of a safe-charsets value of t for + iso-latin-1-with-esc. + * mule/hebrew.el (ctext-hebrew): + Add a safe-charsets property for this coding system. + * mule/devanagari.el (in-is13194-devanagari): + Add a safe-charsets property for this coding system. + * mule/chinese.el (cn-gb-2312): + * mule/chinese.el (hz-gb-2312): + * mule/chinese.el (big5): + Add safe-charsets properties for these coding systems. + * mule/latin.el (iso-8859-14): + Add an implementation for this, using #'make-8-bit-coding-system. 
+ * mule/mule-coding.el (ctext): + * mule/mule-coding.el (iso-2022-8bit-ss2): + * mule/mule-coding.el (iso-2022-7bit-ss2): + * mule/mule-coding.el (iso-2022-jp-2): + * mule/mule-coding.el (iso-2022-7bit): + * mule/mule-coding.el (iso-2022-8): + * mule/mule-coding.el (escape-quoted): + * mule/mule-coding.el (iso-2022-lock): + Add safe-charsets properties for all these coding systems. + 2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> * mule/mule-coding.el (make-8-bit-coding-system): diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/code-init.el --- a/lisp/code-init.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/code-init.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -394,4 +394,6 ＠＠ For more information, see `reset-coding- (reset-language-environment) +(coding-system-put 'raw-text 'safe-charsets '(ascii control-1 latin-iso8859-1)) + ;;; code-init.el ends here diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/coding.el --- a/lisp/coding.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/coding.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -300,8 +300,8 ＠＠ this function has been called, this will (extent-face extent)) (delete-extent extent))) buffer begin end)) -(defun default-query-coding-region (begin end coding-system - &optional buffer errorp highlightp) +(defun* default-query-coding-region (begin end coding-system + &optional buffer errorp highlightp) "The default `query-coding-region' implementation. Uses the `safe-charsets' and `safe-chars' coding system properties. ＠＠ -324,8 +324,11 ＠＠ addition, characters that can be safely (gethash safe-charsets default-query-coding-region-safe-charset-skip-chars-map)) (ranges (make-range-table)) - fail-range-start fail-range-end previous-fail char-after + fail-range-start fail-range-end char-after looking-at-arg failed extent) + ;; Coding systems with a value of t for safe-charsets support everything. 
+ (when (eq t safe-charsets) + (return-from default-query-coding-region (values t nil))) (unless skip-chars-arg (setq skip-chars-arg (puthash safe-charsets ＠＠ -355,9 +358,9 ＠＠ addition, characters that can be safely (goto-char begin buffer) (skip-chars-forward skip-chars-arg end buffer) (while (< (point buffer) end) - (message - "fail-range-start is %S, previous-fail %S, point is %S, end is %S" - fail-range-start previous-fail (point buffer) end) + ; (message + ; "fail-range-start is %S, point is %S, end is %S" + ; fail-range-start (point buffer) end) (setq char-after (char-after (point buffer) buffer) fail-range-start (point buffer)) (while (and ＠＠ -411,8 +414,8 ＠＠ region using `query-coding-warning-face' This function returns a list; the intention is that callers use `multiple-value-bind' or the related CL multiple value functions to deal -with it. The first element is `t' if the string can be encoded using -CODING-SYSTEM, or `nil' if not. The second element is `nil' if the string +with it. The first element is `t' if the region can be encoded using +CODING-SYSTEM, or `nil' if not. The second element is `nil' if the region can be encoded using CODING-SYSTEM; otherwise, it is a range table describing the positions of the unencodable characters. See `make-range-table'." ＠＠ -456,33 +459,42 ＠＠ If optional 5th argument STRING is non-n If optional 5th argument STRING is non-nil, it is a string to search for un-encodable characters. In that case, START and END are indexes in the string." - (flet ((thunk () - (multiple-value-bind (result ranges) - (query-coding-region start end coding-system) - (if result - ;; If query-coding-region thinks the entire region is - ;; encodable, result will be t, and the thunk should - ;; return nil, because there are no unencodable - ;; positions in the region. 
- nil - (if count - (block counted - (map-range-table - #'(lambda (begin end value) - (while (and (<= begin end) (<= begin count)) - (push begin result) - (incf begin)) - (if (> begin count) (return-from counted))) - ranges)) - (map-range-table - #'(lambda (begin end value) - (while (<= begin end) - (push begin result) - (incf begin))) ranges)) - result)))) + (let ((thunk + #'(lambda (start end coding-system &optional count) + (multiple-value-bind (result ranges) + (query-coding-region start end coding-system) + (if result + nil + (block worked-it-all-out + (if count + (map-range-table + #'(lambda (begin end value) + (while (and (< begin end) + (< (length result) count)) + (push begin result) + (incf begin)) + (when (= (length result) count) + (return-from worked-it-all-out result))) + ranges) + (map-range-table + #'(lambda (begin end value) + (return-from worked-it-all-out begin)) + ranges)) + (assert (not (null count)) t + "We should never reach this point with null COUNT.") + result)))))) + (check-argument-type #'integer-or-marker-p start) + (check-argument-type #'integer-or-marker-p end) + (check-coding-system coding-system) + (and count (check-argument-type #'natnump count) + ;; Special-case zero, sigh. + (if (zerop count) (setq count 1))) + (and string (check-argument-type #'stringp string)) (if string - (with-temp-buffer (insert string) (thunk)) - (thunk)))) + (with-temp-buffer + (insert string) + (funcall thunk start end coding-system count)) + (funcall thunk start end coding-system count)))) (defun encode-coding-char (char coding-system) "Encode CHAR by CODING-SYSTEM and return the resulting string. 
diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/chinese.el --- a/lisp/mule/chinese.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/chinese.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -157,6 +157,7 ＠＠ charset-g1 chinese-gb2312 charset-g2 chinese-sisheng charset-g3 t + safe-charsets (ascii chinese-gb2312 chinese-sisheng) mnemonic "Zh-GB/EUC" documentation "Chinese EUC (Extended Unix Code), the standard Chinese encoding on Unix. ＠＠ -190,6 +191,7 ＠＠ G2: Sisheng (PinYin - ZhuYin)" "Hz/ZW (Chinese)" '(mnemonic "Zh-GB/Hz" eol-type lf + safe-charsets (ascii chinese-gb2312) post-read-conversion post-read-decode-hz pre-write-conversion pre-write-encode-hz documentation "Hz/ZW 7-bit encoding for Chinese GB2312 (MIME:HZ-GB-2312)" ＠＠ -259,6 +261,7 ＠＠ Uses the GB2312 character set.")) 'big5 'big5 "Big5" '(mnemonic "Zh/Big5" + safe-charsets (ascii chinese-big5-1 chinese-big5-2) documentation "A non-modal encoding formed by five large Taiwanese companies \(hence \"Big5\") to produce a character set and encoding for diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/devanagari.el --- a/lisp/mule/devanagari.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/devanagari.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -50,6 +50,7 ＠＠ charset-g2 t charset-g3 t mnemonic "In-13194" + safe-charsets (ascii indian-is13194) documentation "8-bit encoding for ASCII (MSB=0) and IS13194-Devanagari (MSB=1)" safe-charsets (ascii indian-is13194) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/hebrew.el --- a/lisp/mule/hebrew.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/hebrew.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -92,6 +92,7 ＠＠ charset-g1 hebrew-iso8859-8 charset-g2 t charset-g3 t + safe-charsets (ascii hebrew-iso8859-8) mnemonic "CText/Hbrw" )) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 
lisp/mule/iso-with-esc.el --- a/lisp/mule/iso-with-esc.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/iso-with-esc.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -28,6 +28,10 ＠＠ ;;; Code: +;; It is not particularly reasonable that iso-latin-1-with-esc has a +;; value of t for the safe-charsets property. We discourage its use, +;; though, and this behaviour is compatible with GNU. + ;;;###autoload (define-coding-system-alias 'iso-latin-1-with-esc 'iso-2022-8) ＠＠ -38,6 +42,7 ＠＠ charset-g1 latin-iso8859-2 charset-g2 t charset-g3 t + safe-charsets (ascii latin-iso8859-2) mnemonic "MIME/Ltn-2")) ;;;###autoload ＠＠ -47,6 +52,7 ＠＠ charset-g1 latin-iso8859-3 charset-g2 t charset-g3 t + safe-charsets (ascii latin-iso8859-3) mnemonic "MIME/Ltn-3")) ;;;###autoload ＠＠ -56,6 +62,7 ＠＠ charset-g1 latin-iso8859-4 charset-g2 t charset-g3 t + safe-charsets (ascii latin-iso8859-4) mnemonic "MIME/Ltn-4")) ;;;###autoload ＠＠ -63,6 +70,7 ＠＠ 'iso-latin-9-with-esc 'iso2022 "ISO 4873 conforming 8-bit code (ASCII + Latin 9; aka Latin-1 with Euro)" '(mnemonic "MIME/Ltn-9" ; bletch + safe-charsets (ascii latin-iso8859-15) eol-type nil charset-g0 ascii charset-g1 latin-iso8859-15 ＠＠ -76,6 +84,7 ＠＠ charset-g1 latin-iso8859-9 charset-g2 t charset-g3 t + safe-charsets (ascii latin-iso8859-9) mnemonic "MIME/Ltn-5")) ;;;###autoload ＠＠ -86,6 +95,7 ＠＠ charset-g1 cyrillic-iso8859-5 charset-g2 t charset-g3 t + safe-charsets (ascii cyrillic-iso8859-5) mnemonic "ISO8/Cyr")) ;;;###autoload ＠＠ -97,6 +107,7 ＠＠ charset-g2 t charset-g3 t no-iso6429 t + safe-charsets (ascii hebrew-iso8859-8) mnemonic "MIME/Hbrw")) ;;;###autoload ＠＠ -106,6 +117,7 ＠＠ charset-g1 greek-iso8859-7 charset-g2 t charset-g3 t + safe-charsets (ascii greek-iso8859-7) mnemonic "Grk")) ;; ISO 8859-6 is such a useless character set that it seems a waste of ＠＠ -201,5 +213,6 ＠＠ charset-g2 t charset-g3 t no-iso6429 t + safe-charsets (ascii arabic-iso8859-6) mnemonic "MIME/Arbc")) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 
1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/japanese.el --- a/lisp/mule/japanese.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/japanese.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -195,6 +195,8 ＠＠ seven t input-charset-conversion ((latin-jisx0201 ascii) (japanese-jisx0208-1978 japanese-jisx0208)) + safe-charsets (ascii japanese-jisx0208-1978 japanese-jisx0208 + latin-jisx0201 japanese-jisx0212 katakana-jisx0201) mnemonic "MULE/7bit" documentation "Coding system used for communication with mail and news in Japan." ＠＠ -210,6 +212,7 ＠＠ lock-shift t input-charset-conversion ((latin-jisx0201 ascii) (japanese-jisx0208-1978 japanese-jisx0208)) + safe-charsets (latin-jisx0201 ascii japanese-jisx0208-1978 japanese-jisx0208) mnemonic "JIS7" documentation "Old JIS 7-bit encoding; mostly superseded by ISO-2022-JP. ＠＠ -224,6 +227,8 ＠＠ Uses locking-shift (SI/SO) to select hal short t input-charset-conversion ((latin-jisx0201 ascii) (japanese-jisx0208-1978 japanese-jisx0208)) + safe-charsets (latin-jisx0201 ascii japanese-jisx0208-1978 + japanese-jisx0208) mnemonic "JIS8" documentation "Old JIS 8-bit encoding; mostly superseded by ISO-2022-JP. ＠＠ -261,6 +266,8 ＠＠ Uses high bytes for half-width katakana. "Shift-JIS" '(mnemonic "Ja/SJIS" documentation "The standard Japanese encoding in MS Windows." + safe-charsets (ascii japanese-jisx0208 japanese-jisx0208-1978 + latin-jisx0201 katakana-jisx0201) )) ;; A former name? ＠＠ -286,6 +293,8 ＠＠ Uses high bytes for half-width katakana. seven t output-charset-conversion ((ascii latin-jisx0201) (japanese-jisx0208 japanese-jisx0208-1978)) + safe-charsets (ascii latin-jisx0201 japanese-jisx0208 + japanese-jisx0208-1978) documentation "This is a coding system used for old JIS terminals. It's an ISO 2022 based 7-bit encoding for Japanese JISX0208-1978 and JISX0201-Roman." 
＠＠ -314,6 +323,7 ＠＠ 2022 based 7-bit encoding for Japanese J charset-g1 japanese-jisx0208 charset-g2 katakana-jisx0201 charset-g3 japanese-jisx0212 + safe-charsets (ascii japanese-jisx0208 katakana-jisx0201 japanese-jisx0212) short t mnemonic "Ja/EUC" documentation diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/korean.el --- a/lisp/mule/korean.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/korean.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -57,6 +57,7 ＠＠ "ISO-2022-INT-1 (Korean)" '(charset-g0 ascii charset-g1 korean-ksc5601 + safe-charsets (ascii korean-ksc5601) short t seven t lock-shift t ＠＠ -92,6 +93,7 ＠＠ '(charset-g0 ascii charset-g1 korean-ksc5601 mnemonic "ko/EUC" + safe-charsets (ascii korean-ksc5601) documentation "Korean EUC (Extended Unix Code), the standard Korean encoding on Unix. This follows the same overall EUC principles (see the description under ＠＠ -122,6 +124,7 ＠＠ G1: Korean-KSC5601" force-g1-on-output t seven t lock-shift t + safe-charsets (ascii korean-ksc5601) mnemonic "Ko/7bit" documentation "Coding-System used for communication with mail in Korea." 
eol-type lf)) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/latin.el --- a/lisp/mule/latin.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/latin.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -630,6 +630,43 ＠＠ See also `iso-8859-2' and `window-1252' (#xDC #xFC) ;; U WITH DIAERESIS (#xDD #xFD) ;; Y WITH ACUTE (#xDE #xFE))) ;; Y WITH CIRCUMFLEX + +(make-8-bit-coding-system + 'iso-8859-14 + '((#xA1 ?\u1E02) ;; LATIN CAPITAL LETTER B WITH DOT ABOVE + (#xA2 ?\u1E03) ;; LATIN SMALL LETTER B WITH DOT ABOVE + (#xA4 ?\u010A) ;; LATIN CAPITAL LETTER C WITH DOT ABOVE + (#xA5 ?\u010B) ;; LATIN SMALL LETTER C WITH DOT ABOVE + (#xA6 ?\u1E0A) ;; LATIN CAPITAL LETTER D WITH DOT ABOVE + (#xA8 ?\u1E80) ;; LATIN CAPITAL LETTER W WITH GRAVE + (#xAA ?\u1E82) ;; LATIN CAPITAL LETTER W WITH ACUTE + (#xAB ?\u1E0B) ;; LATIN SMALL LETTER D WITH DOT ABOVE + (#xAC ?\u1EF2) ;; LATIN CAPITAL LETTER Y WITH GRAVE + (#xAF ?\u0178) ;; LATIN CAPITAL LETTER Y WITH DIAERESIS + (#xB0 ?\u1E1E) ;; LATIN CAPITAL LETTER F WITH DOT ABOVE + (#xB1 ?\u1E1F) ;; LATIN SMALL LETTER F WITH DOT ABOVE + (#xB2 ?\u0120) ;; LATIN CAPITAL LETTER G WITH DOT ABOVE + (#xB3 ?\u0121) ;; LATIN SMALL LETTER G WITH DOT ABOVE + (#xB4 ?\u1E40) ;; LATIN CAPITAL LETTER M WITH DOT ABOVE + (#xB5 ?\u1E41) ;; LATIN SMALL LETTER M WITH DOT ABOVE + (#xB7 ?\u1E56) ;; LATIN CAPITAL LETTER P WITH DOT ABOVE + (#xB8 ?\u1E81) ;; LATIN SMALL LETTER W WITH GRAVE + (#xB9 ?\u1E57) ;; LATIN SMALL LETTER P WITH DOT ABOVE + (#xBA ?\u1E83) ;; LATIN SMALL LETTER W WITH ACUTE + (#xBB ?\u1E60) ;; LATIN CAPITAL LETTER S WITH DOT ABOVE + (#xBC ?\u1EF3) ;; LATIN SMALL LETTER Y WITH GRAVE + (#xBD ?\u1E84) ;; LATIN CAPITAL LETTER W WITH DIAERESIS + (#xBE ?\u1E85) ;; LATIN SMALL LETTER W WITH DIAERESIS + (#xBF ?\u1E61) ;; LATIN SMALL LETTER S WITH DOT ABOVE + (#xD0 ?\u0174) ;; LATIN CAPITAL LETTER W WITH CIRCUMFLEX + (#xD7 ?\u1E6A) ;; LATIN CAPITAL LETTER T WITH DOT ABOVE + (#xDE ?\u0176) ;; LATIN CAPITAL LETTER 
Y WITH CIRCUMFLEX + (#xF0 ?\u0175) ;; LATIN SMALL LETTER W WITH CIRCUMFLEX + (#xF7 ?\u1E6B) ;; LATIN SMALL LETTER T WITH DOT ABOVE + (#xFE ?\u0177)) ;; LATIN SMALL LETTER Y WITH CIRCUMFLEX + "ISO-8859-14 (Latin-8)" + '(mnemonic "Latin 8" + aliases (iso-latin-8 latin-8))) ;; The syntax table code for ISO 8859-15 and ISO 8859-16 requires that the diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/mule-coding.el --- a/lisp/mule/mule-coding.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/mule-coding.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -104,6 +104,7 ＠＠ The allowable range of REGISTER is 0 thr '(charset-g0 ascii charset-g1 latin-iso8859-1 eol-type nil + safe-charsets t ;; Reasonable mnemonic "CText")) (make-coding-system ＠＠ -113,6 +114,9 ＠＠ The allowable range of REGISTER is 0 thr charset-g1 latin-iso8859-1 charset-g2 t ;; unspecified but can be used later. short t + safe-charsets (ascii katakana-jisx0201 japanese-jisx0208-1978 + japanese-jisx0208 japanese-jisx0212 japanese-jisx0213-1 + japanese-jisx0213-2) mnemonic "ISO8/SS" documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset" )) ＠＠ -124,6 +128,7 ＠＠ The allowable range of REGISTER is 0 thr charset-g2 t ;; unspecified but can be used later. seven t short t + safe-charsets t mnemonic "ISO7/SS" documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset" eol-type nil)) ＠＠ -136,6 +141,7 ＠＠ The allowable range of REGISTER is 0 thr charset-g2 t ;; unspecified but can be used later. seven t short t + safe-charsets t mnemonic "ISO7/SS" eol-type nil)) ＠＠ -145,6 +151,7 ＠＠ The allowable range of REGISTER is 0 thr '(charset-g0 ascii seven t short t + safe-charsets t mnemonic "ISO7" documentation "ISO-2022-based 7-bit encoding using only G0" )) ＠＠ -158,6 +165,7 ＠＠ The allowable range of REGISTER is 0 thr '(charset-g0 ascii charset-g1 latin-iso8859-1 short t + safe-charsets t mnemonic "ISO8" documentation "ISO-2022 eight-bit coding system. 
No single-shift or locking-shift." )) ＠＠ -169,6 +177,7 ＠＠ The allowable range of REGISTER is 0 thr charset-g1 latin-iso8859-1 eol-type lf escape-quoted t + safe-charsets t mnemonic "ESC/Quot" documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files." )) ＠＠ -180,6 +189,7 ＠＠ The allowable range of REGISTER is 0 thr charset-g1 t ;; unspecified but can be used later. seven t lock-shift t + safe-charsets t mnemonic "ISO7/Lock" documentation "ISO-2022 coding system using Locking-Shift for 96-charset." )) ＠＠ -574,14 +584,14 ＠＠ See that the documentation of `query-cod (goto-char begin buffer) (skip-chars-forward skip-chars-arg end buffer) (while (< (point buffer) end) - (message - "fail-range-start is %S, previous-fail %S, point is %S, end is %S" - fail-range-start previous-fail (point buffer) end) + ; (message + ; "fail-range-start is %S, previous-fail %S, point is %S, end is %S" + ; fail-range-start previous-fail (point buffer) end) (setq char-after (char-after (point buffer) buffer) fail-range-start (point buffer)) - (message "arguments are %S %S" - (< (point buffer) end) - (not (gethash (encode-char char-after 'ucs) from-unicode))) + ; (message "arguments are %S %S" + ; (< (point buffer) end) + ; (not (gethash (encode-char char-after 'ucs) from-unicode))) (while (and (< (point buffer) end) (not (gethash (encode-char char-after 'ucs) from-unicode))) ＠＠ -593,7 +603,7 ＠＠ See that the documentation of `query-cod ;; system; check the characters past it. (forward-char 1 buffer) ;; The character actually failed. 
- (message "past the move through, point now %S" (point buffer)) + ; (message "past the move through, point now %S" (point buffer)) (when errorp (error 'text-conversion-error (format "Cannot encode %s using coding system" ＠＠ -608,12 +618,12 ＠＠ See that the documentation of `query-cod (point-max buffer))) t ranges) (when highlightp - (message "highlighting") + ; (message "highlighting") (setq extent (make-extent fail-range-start fail-range-end buffer)) (set-extent-priority extent (+ mouse-highlight-priority 2)) (set-extent-face extent 'query-coding-warning-face)) (skip-chars-forward skip-chars-arg end buffer))) - (message "about to give the result, ranges %S" ranges) + ; (message "about to give the result, ranges %S" ranges) (if failed (values nil ranges) (values t nil))))) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/mule/thai-xtis.el --- a/lisp/mule/thai-xtis.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/mule/thai-xtis.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -355,6 +355,7 ＠＠ `(mnemonic "TIS620" decode ccl-decode-thai-xtis encode ccl-encode-thai-xtis + safe-charsets (ascii thai-xtis) documentation "external=tis620, internal=thai-xtis")) (coding-system-put 'tis-620 'category 'iso-8-1)) (make-coding-system diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 lisp/unicode.el --- a/lisp/unicode.el Sat Aug 09 13:15:09 2008 +0200 +++ b/lisp/unicode.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -626,7 +626,7 ＠＠ mapping from the error sequences to the (let* ((skip-chars-arg unicode-query-coding-skip-chars-arg) (ranges (make-range-table)) (looking-at-arg (concat "[" skip-chars-arg "]")) - fail-range-start fail-range-end previous-fail char-after failed + fail-range-start fail-range-end char-after failed extent) (save-excursion (when highlightp ＠＠ -638,8 +638,8 ＠＠ mapping from the error sequences to the (skip-chars-forward skip-chars-arg end buffer) (while (< (point buffer) end) ; (message -; 
"fail-range-start is %S, previous-fail %S, point is %S, end is %S" -; fail-range-start previous-fail (point buffer) end) +; "fail-range-start is %S, point is %S, end is %S" +; fail-range-start (point buffer) end) (setq char-after (char-after (point buffer) buffer) fail-range-start (point buffer)) (while (and ＠＠ -647,7 +647,6 ＠＠ mapping from the error sequences to the (not (looking-at looking-at-arg)) (= -1 (char-to-unicode char-after))) (forward-char 1 buffer) - (message "what?!?") (setq char-after (char-after (point buffer) buffer) failed t)) (if (= fail-range-start (point buffer)) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 src/ChangeLog --- a/src/ChangeLog Sat Aug 09 13:15:09 2008 +0200 +++ b/src/ChangeLog Sun Dec 28 14:46:24 2008 +0000 ＠＠ -1,3 +1,17 ＠＠ 2008-08-05 Aidan Kehoe <kehoea＠parhasa +2008-12-28 Aidan Kehoe <kehoea(a)parhasard.net> + + * file-coding.c (Fmake_coding_system): + Document our use of the safe-chars and safe-charsets properties, + and the differences compared to GNU. + (make_coding_system_1): Don't drop the safe-chars and + safe-charsets properties. + (Fcoding_system_property): Return the safe-chars and safe-charsets + properties when asked for them. + * file-coding.h (CODING_SYSTEM_SAFE_CHARSETS): + * coding-system-slots.h: + Make the safe-chars and safe-charsets slots available in these + headers. + 2008-08-05 Aidan Kehoe <kehoea(a)parhasard.net> * mule-charset.c (complex_vars_of_mule_charset): diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 src/coding-system-slots.h --- a/src/coding-system-slots.h Sat Aug 09 13:15:09 2008 +0200 +++ b/src/coding-system-slots.h Sun Dec 28 14:46:24 2008 +0000 ＠＠ -105,6 +105,10 ＠＠ Boston, MA 02111-1307, USA. */ coding system). 
*/ MARKED_SLOT (canonical) + MARKED_SLOT (safe_charsets) + + MARKED_SLOT (safe_chars) + #undef MARKED_SLOT #undef MARKED_SLOT_ARRAY #undef CODING_SYSTEM_SLOT_DECLARATION diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 src/file-coding.c --- a/src/file-coding.c Sat Aug 09 13:15:09 2008 +0200 +++ b/src/file-coding.c Sun Dec 28 14:46:24 2008 +0000 ＠＠ -1125,9 +1125,9 ＠＠ make_coding_system_1 (Lisp_Object name_o else if (EQ (key, Qtranslation_table_for_encode)) ; else if (EQ (key, Qsafe_chars)) - ; + CODING_SYSTEM_SAFE_CHARS (cs) = value; else if (EQ (key, Qsafe_charsets)) - ; + CODING_SYSTEM_SAFE_CHARSETS (cs) = value; else if (EQ (key, Qmime_charset)) ; else if (EQ (key, Qvalid_codes)) ＠＠ -1326,20 +1326,7 ＠＠ ignored: `translation-table-for-encode' The value is a translation table to be applied on encoding. This is not applicable to CCL-based coding systems. - -`safe-chars' - The value is a char table. If a character has non-nil value in it, - the character is safely supported by the coding system. This - overrides the specification of safe-charsets. - -`safe-charsets' - The value is a list of charsets safely supported by the coding - system. The value t means that all charsets Emacs handles are - supported. Even if some charset is not in this list, it doesn't - mean that the charset can't be encoded in the coding system; - it just means that some other receiver of text encoded - in the coding system won't be able to handle that charset. - + `mime-charset' The value is a symbol of which name is `MIME-charset' parameter of the coding system. ＠＠ -1350,7 +1337,27 ＠＠ ignored: In the former case, the integer value is a valid byte code. In the latter case, the integers specifies the range of valid byte codes. - +The following properties are used by `default-query-coding-region', +the default implementation of `query-coding-region'. 
This +implementation and these properties are not used by the Unicode coding +systems, nor by those CCL coding systems created with +`make-8-bit-coding-system'. + +`safe-chars' + The value is a char table. If a character has non-nil value in it, + the character is safely supported by the coding system. + Under XEmacs, for the moment, this is used in addition to the + `safe-charsets' property. It does not override it as it does + under GNU Emacs. #### We need to consider if we should keep this + behaviour. + +`safe-charsets' + The value is a list of charsets safely supported by the coding + system. For coding systems based on ISO 2022, XEmacs may try to + encode characters outside these character sets, but outside of + East Asia and East Asian coding systems, it is unlikely that + consumers of the data will understand XEmacs' encoding. + The value t means that all XEmacs character sets handles are supported. The following additional property is recognized if TYPE is `convert-eol': ＠＠ -1862,6 +1869,10 ＠＠ Return the PROP property of CODING-SYSTE return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system); else if (EQ (prop, Qpre_write_conversion)) return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system); + else if (EQ (prop, Qsafe_charsets)) + return XCODING_SYSTEM_SAFE_CHARSETS (coding_system); + else if (EQ (prop, Qsafe_chars)) + return XCODING_SYSTEM_SAFE_CHARS (coding_system); else { Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system), diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 src/file-coding.h --- a/src/file-coding.h Sat Aug 09 13:15:09 2008 +0200 +++ b/src/file-coding.h Sun Dec 28 14:46:24 2008 +0000 ＠＠ -583,6 +583,8 ＠＠ do { \ #define CODING_SYSTEM_AUTO_EOL_WRAPPER(codesys) ((codesys)->auto_eol_wrapper) #define CODING_SYSTEM_SUBSIDIARY_PARENT(codesys) ((codesys)->subsidiary_parent) #define CODING_SYSTEM_CANONICAL(codesys) ((codesys)->canonical) +#define CODING_SYSTEM_SAFE_CHARSETS(codesys) 
((codesys)->safe_charsets) +#define CODING_SYSTEM_SAFE_CHARS(codesys) ((codesys)->safe_chars) #define CODING_SYSTEM_CHAIN_CHAIN(codesys) \ (CODING_SYSTEM_TYPE_DATA (codesys, chain)->chain) ＠＠ -623,6 +625,10 ＠＠ do { \ CODING_SYSTEM_SUBSIDIARY_PARENT (XCODING_SYSTEM (codesys)) #define XCODING_SYSTEM_CANONICAL(codesys) \ CODING_SYSTEM_CANONICAL (XCODING_SYSTEM (codesys)) +#define XCODING_SYSTEM_SAFE_CHARSETS(codesys) \ + CODING_SYSTEM_SAFE_CHARSETS (XCODING_SYSTEM (codesys)) +#define XCODING_SYSTEM_SAFE_CHARS(codesys) \ + CODING_SYSTEM_SAFE_CHARS (XCODING_SYSTEM (codesys)) #define XCODING_SYSTEM_CHAIN_CHAIN(codesys) \ CODING_SYSTEM_CHAIN_CHAIN (XCODING_SYSTEM (codesys)) diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 tests/ChangeLog --- a/tests/ChangeLog Sat Aug 09 13:15:09 2008 +0200 +++ b/tests/ChangeLog Sun Dec 28 14:46:24 2008 +0000 ＠＠ -1,3 +1,9 ＠＠ 2008-05-21 Aidan Kehoe <kehoea＠parhasa +2008-12-28 Aidan Kehoe <kehoea(a)parhasard.net> + + * automated/query-coding-tests.el: + New file, testing the functionality of #'query-coding-region and + #'query-coding-string. + 2008-05-21 Aidan Kehoe <kehoea(a)parhasard.net> * automated/mule-tests.el (featurep): diff -r 84d618b355f5fccd22634d6415a59a9339281ad4 -r 1d74a1d115ee2c6484333b2d37dd5fce9fe05ad6 tests/automated/query-coding-tests.el --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/automated/query-coding-tests.el Sun Dec 28 14:46:24 2008 +0000 ＠＠ -0,0 +1,293 ＠＠ +;; Copyright (C) 2008 Free Software Foundation, Inc. -*- coding: iso-8859-1 -*- + +;; Author: Aidan Kehoe <kehoea(a)parhasard.net> +;; Maintainer: Aidan Kehoe <kehoea(a)parhasard.net> +;; Created: 2008 +;; Keywords: tests, query-coding-region + +;; This file is part of XEmacs. + +;; XEmacs is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. 
+ +;; XEmacs is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with XEmacs; see the file COPYING. If not, write to the Free +;; Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA +;; 02111-1307, USA. + +;;; Synched up with: Not in FSF. + +;;; Commentary: + +;; Test the query-coding-region and query-coding-string implementations for +;; some well-known coding systems. + +(require 'bytecomp) + +(defun q-c-debug (&rest aerger) + (let ((standard-output (get-buffer-create "query-coding-debug")) + (fmt (condition-case nil + (and (stringp (first aerger)) + (apply #'format aerger)) + (error nil)))) + (if fmt + (progn + (princ (apply #'format aerger)) + (terpri)) + (princ "--> ") + (let ((i 1)) + (dolist (sgra aerger) + (if (> i 1) (princ " ")) + (princ (format "%d. " i)) + (prin1 sgra) + (incf i)) + (terpri))))) + +;; Comment this out if debugging: +(defalias 'q-c-debug #'ignore) + +(when (featurep 'mule) + (let ((ascii-chars-string (apply #'string + (loop for i from #x0 to #x7f + collect (int-to-char i)))) + (latin-1-chars-string (apply #'string + (loop for i from #x0 to #xff + collect (int-to-char i)))) + unix-coding-system text-conversion-error-signalled) + (with-temp-buffer + (insert ascii-chars-string) + ;; First, check all the coding systems that are ASCII-transparent for + ;; ASCII-transparency in the check. + (dolist (coding-system + (delete-duplicates + (mapcar #'(lambda (coding-system) + (unless (coding-system-alias-p coding-system) + ;; We're only interested in the version with + ;; Unix line endings right now. 
+ (setq unix-coding-system + (subsidiary-coding-system + (coding-system-base coding-system) 'lf)) + (when (and + ;; ASCII-transparent + (equal ascii-chars-string + (encode-coding-string + ascii-chars-string + unix-coding-system)) + (not + (memq (coding-system-type + unix-coding-system) + '(undecided chain)))) + unix-coding-system))) + (coding-system-list nil)) +:test #'eq)) + (q-c-debug "looking at coding system %S" (coding-system-name + coding-system)) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) coding-system) + (q-c-debug "checking type, coding-system, q-c-s, q-c-t %S" + (list (coding-system-type coding-system) + coding-system query-coding-succeeded + query-coding-table)) + (unless (and (eq t query-coding-succeeded) + (null query-coding-table)) + (q-c-debug "(eq t query-coding-succeeded) %S, (\ +null query-coding-table) %S" (eq t query-coding-succeeded) + (null query-coding-table))) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (q-c-debug "testing the ASCII strings for %S" coding-system) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-string ascii-chars-string coding-system) + (unless (and (eq t query-coding-succeeded) + (null query-coding-table)) + (q-c-debug "(eq t query-coding-succeeded) %S, (\ +null query-coding-table) %S" (eq t query-coding-succeeded) + (null query-coding-table))) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table)))) + (q-c-debug "past the loop through the coding systems") + (delete-region (point-min) (point-max)) + ;; Check for success from the two Latin-1 coding systems + (insert latin-1-chars-string) + (q-c-debug "point is now %S" (point)) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (q-c-debug "point is now 
%S" (point)) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-string (buffer-string) 'iso-8859-1-unix) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (q-c-debug "point is now %S" (point)) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-string (buffer-string) 'iso-latin-1-with-esc-unix) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (q-c-debug "point is now %S" (point)) + ;; Make it fail, check that it fails correctly + (insert (decode-char 'ucs #x20AC)) ;; EURO SIGN + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) + (unless (and (null query-coding-succeeded) + (equal query-coding-table + #s(range-table type start-closed-end-open data + ((257 258) t)))) + (q-c-debug "dealing with %S" 'iso-8859-1-unix) + (q-c-debug "query-coding-succeeded not null, query-coding-table \ +%S" query-coding-table)) + (Assert (null query-coding-succeeded)) + (Assert (equal query-coding-table + #s(range-table type start-closed-end-open data + ((257 258) t))))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) + 'iso-latin-1-with-esc-unix) + ;; Stupidly, this succeeds. The behaviour is compatible with + ;; GNU, though, and we encourage people not to use + ;; iso-latin-1-with-esc-unix anyway: + + (unless (and query-coding-succeeded + (null query-coding-table)) + (q-c-debug "dealing with %S" 'iso-latin-1-with-esc-unix) + (q-c-debug "query-coding-succeeded %S, query-coding-table \ +%S" query-coding-succeeded query-coding-table)) + (Assert query-coding-succeeded) + (Assert (null query-coding-table))) + ;; Check that it errors correctly. 
+ (setq text-conversion-error-signalled nil) + (condition-case nil + (query-coding-region (point-min) (point-max) 'iso-8859-1-unix nil t) + (text-conversion-error + (setq text-conversion-error-signalled t))) + (Assert text-conversion-error-signalled) + (setq text-conversion-error-signalled nil) + (condition-case nil + (query-coding-region (point-min) (point-max) + 'iso-latin-1-with-esc-unix nil t) + (text-conversion-error + (setq text-conversion-error-signalled t))) + (Assert (null text-conversion-error-signalled)) + (delete-region (point-min) (point-max)) + (insert latin-1-chars-string) + (decode-coding-region (point-min) (point-max) 'windows-1252-unix) + (goto-char (point-max)) ;; #'decode-coding-region just messed up point. + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'windows-1252-unix) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (insert ?\x80) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'windows-1252-unix) + (unless (and (null query-coding-succeeded) + (equal query-coding-table + #s(range-table type start-closed-end-open data + ((257 258) t)))) + (q-c-debug "dealing with %S" 'windows-1252-unix) + (q-c-debug "query-coding-succeeded not null, query-coding-table \ +%S" query-coding-table)) + (Assert (null query-coding-succeeded)) + (Assert (equal query-coding-table + #s(range-table type start-closed-end-open data + ((257 258) t))))) + ;; Try a similar approach with koi8-o, the koi8 variant with + ;; support for Old Church Slavonic. 
+ (delete-region (point-min) (point-max)) + (insert latin-1-chars-string) + (decode-coding-region (point-min) (point-max) 'koi8-o-unix) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'koi8-o-unix) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'escape-quoted) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'windows-1252-unix) + (unless (and (null query-coding-succeeded) + (equal query-coding-table + #s(range-table type start-closed-end-open + data ((129 131) t (132 133) t + (139 140) t (141 146) t + (155 156) t (157 161) t + (162 170) t (173 176) t + (178 187) t (189 192) t + (193 257) t)))) + (q-c-debug "query-coding-succeeded not null, query-coding-table \ +%S" query-coding-table)) + (Assert (null query-coding-succeeded)) + (Assert (equal query-coding-table + #s(range-table type start-closed-end-open + data ((129 131) t (132 133) t (139 140) t + (141 146) t (155 156) t (157 161) t + (162 170) t (173 176) t (178 187) t + (189 192) t (193 257) t))))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) 'koi8-r-unix) + (Assert (null query-coding-succeeded)) + (Assert (equal query-coding-table + #s(range-table type start-closed-end-open + data ((129 154) t (155 161) t (162 164) t + (165 177) t (178 180) t + (181 192) t))))) + ;; Check that the Unicode coding systems handle characters + ;; without Unicode mappings. 
+ (delete-region (point-min) (point-max)) + (insert latin-1-chars-string) + (decode-coding-region (point-min) (point-max) 'greek-iso-8bit-with-esc) + (dolist (coding-system + '(utf-16-mac ucs-4-mac utf-16-little-endian-bom-dos ucs-4-dos + utf-16-little-endian-mac utf-16-bom-unix + utf-16-little-endian ucs-4 utf-16-dos + ucs-4-little-endian-dos utf-16-bom-mac utf-16-bom + utf-16-unix utf-32-unix utf-32-little-endian + utf-32-dos utf-32 utf-32-little-endian-dos utf-8-bom + utf-16-bom-dos ucs-4-unix + utf-16-little-endian-bom-unix utf-8-bom-mac + utf-32-little-endian-unix utf-16 + utf-16-little-endian-dos utf-16-little-endian-bom-mac + utf-8-bom-dos ucs-4-little-endian-mac utf-8-bom-unix + utf-32-little-endian-mac utf-8-dos utf-8-unix + utf-32-mac utf-8-mac utf-16-little-endian-unix + ucs-4-little-endian ucs-4-little-endian-unix utf-8 + utf-16-little-endian-bom)) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) (point-max) coding-system) + (Assert (null query-coding-succeeded)) + (Assert (equal query-coding-table + #s(range-table type start-closed-end-open data + ((173 174) t (209 210) t + (254 255) t))))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region (point-min) 173 coding-system) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region 174 209 coding-system) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + (multiple-value-bind (query-coding-succeeded query-coding-table) + (query-coding-region 210 254 coding-system) + (Assert (eq t query-coding-succeeded)) + (Assert (null query-coding-table))) + ;; Check that it errors correctly. 
+ (setq text-conversion-error-signalled nil) + (condition-case nil + (query-coding-region (point-min) (point-max) coding-system nil t) + (text-conversion-error + (setq text-conversion-error-signalled t))) + (Assert text-conversion-error-signalled) + (setq text-conversion-error-signalled nil) + (condition-case nil + (query-coding-region (point-min) 173 coding-system nil t) + (text-conversion-error + (setq text-conversion-error-signalled t))) + (Assert (null text-conversion-error-signalled)))))) _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches


commit: 2008-08-09 Aidan Kehoe <kehoea@parhasard.net> 16 years, 6 months

Aidan Kehoe

changeset: 4567:84d618b355f5fccd22634d6415a59a9339281ad4 user: Aidan Kehoe <kehoea(a)parhasard.net> date: Sat Aug 09 13:15:09 2008 +0200 files: lisp/ChangeLog lisp/mule/general-late.el lisp/mule/mule-coding.el description: 2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> * mule/mule-coding.el (make-8-bit-coding-system): * mule/general-late.el (posix-charset-to-coding-system-hash): Use #'skip-chars-quote as appropriate. diff -r 26aae3bacf99fd641cbc1626d3ee5a8572b28703 -r 84d618b355f5fccd22634d6415a59a9339281ad4 lisp/ChangeLog --- a/lisp/ChangeLog Sat Aug 09 13:11:06 2008 +0200 +++ b/lisp/ChangeLog Sat Aug 09 13:15:09 2008 +0200 ＠＠ -1,3 +1,9 ＠＠ 2008-08-09 Aidan Kehoe <kehoea＠parhasa +2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> + + * mule/mule-coding.el (make-8-bit-coding-system): + * mule/general-late.el (posix-charset-to-coding-system-hash): + Use #'skip-chars-quote as appropriate. + 2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> * subr.el (skip-chars-quote): New. ＠＠ -6,8 +12,6 ＠＠ 2008-08-09 Aidan Kehoe <kehoea＠parhasa #'skip-chars-backward. 2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> - - * subr.el (skip-chars-quote): New. * mule/cyril-util.el: Remove. Use the version in packages instead. diff -r 26aae3bacf99fd641cbc1626d3ee5a8572b28703 -r 84d618b355f5fccd22634d6415a59a9339281ad4 lisp/mule/general-late.el --- a/lisp/mule/general-late.el Sat Aug 09 13:11:06 2008 +0200 +++ b/lisp/mule/general-late.el Sat Aug 09 13:15:09 2008 +0200 ＠＠ -90,7 +90,7 ＠＠ (setq skip-chars-string (concat skip-chars-string (charset-skip-chars-string charset)))) - finally return skip-chars-string)))) + finally return (skip-chars-quote skip-chars-string))))) ;; At this point in the dump, all the charsets have been loaded. Now, load ;; their Unicode mappings. 
diff -r 26aae3bacf99fd641cbc1626d3ee5a8572b28703 -r 84d618b355f5fccd22634d6415a59a9339281ad4 lisp/mule/mule-coding.el --- a/lisp/mule/mule-coding.el Sat Aug 09 13:11:06 2008 +0200 +++ b/lisp/mule/mule-coding.el Sat Aug 09 13:15:09 2008 +0200 ＠＠ -699,7 +699,8 ＠＠ the code for tilde `~'. " (coding-system-put name 'category (make-8-bit-choose-category decode-table)) (coding-system-put name '8-bit-fixed-query-skip-chars - (apply #'string (append decode-table nil))) + (skip-chars-quote + (apply #'string (append decode-table nil)))) (coding-system-put name '8-bit-fixed-query-from-unicode encode-table) (coding-system-put name 'query-coding-function ＠＠ -786,7 +787,8 ＠＠ the code for tilde `~'. " (coding-system-put ',name 'category ',(make-8-bit-choose-category decode-table)) (coding-system-put ',name '8-bit-fixed-query-skip-chars - ',(apply #'string (append decode-table nil))) + ',(skip-chars-quote + (apply #'string (append decode-table nil)))) (coding-system-put ',name '8-bit-fixed-query-from-unicode encode-table) (coding-system-put ',name 'query-coding-function #'8-bit-fixed-query-coding-region) _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches


commit: Add #'skip-chars-quote to subr.el 16 years, 6 months

Aidan Kehoe

changeset: 4565:31344162cf9ada294e34a48c10daa1242b6e310e parent: 4493:89406c31b12521258c83389d19f80440a3a7ce48 user: Aidan Kehoe <kehoea(a)parhasard.net> date: Sat Aug 09 13:06:24 2008 +0200 files: lisp/ChangeLog lisp/subr.el description: Add #'skip-chars-quote to subr.el 2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> * subr.el (skip-chars-quote): New. Given STRING, return a string that means that all characters in STRING will be skipped when passed to #'skip-chars-forward, #'skip-chars-backward. diff -r 89406c31b12521258c83389d19f80440a3a7ce48 -r 31344162cf9ada294e34a48c10daa1242b6e310e lisp/ChangeLog --- a/lisp/ChangeLog Sat Aug 09 12:13:19 2008 +0200 +++ b/lisp/ChangeLog Sat Aug 09 13:06:24 2008 +0200 ＠＠ -1,4 +1,13 ＠＠ 2008-08-09 Aidan Kehoe <kehoea＠parhasa 2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> + + * subr.el (skip-chars-quote): New. + Given STRING, return a string that means that all characters in + STRING will be skipped when passed to #'skip-chars-forward, + #'skip-chars-backward. + +2008-08-09 Aidan Kehoe <kehoea(a)parhasard.net> + + * subr.el (skip-chars-quote): New. * mule/cyril-util.el: Remove. Use the version in packages instead. diff -r 89406c31b12521258c83389d19f80440a3a7ce48 -r 31344162cf9ada294e34a48c10daa1242b6e310e lisp/subr.el --- a/lisp/subr.el Sat Aug 09 12:13:19 2008 +0200 +++ b/lisp/subr.el Sat Aug 09 13:06:24 2008 +0200 ＠＠ -1722,4 +1722,25 ＠＠ one is kept." ;; define-mail-user-agent is in simple.el. +;; XEmacs; added. +(defun skip-chars-quote (string) + "Return a string that means all characters in STRING will be skipped, +if passed to `skip-chars-forward' or `skip-chars-backward'. + +Ranges and carets are not treated specially. This implementation is +in Lisp; do not use it in performance-critical code." 
+ (let ((list (delete-duplicates (string-to-list string) :test #'=))) + (when (equal list '((?- ?\[) (?\[ ?\-))) + (error 'invalid-argument + "Cannot create `skip-chars-forward' arg from string" + string)) + (when (memq ?\] list) + (setq list (cons ?\] (delq ?\] list)))) + (when (eq ?^ (car list)) + (setq list (nconc (cdr list) '(?^)))) + (when (memq ?- list) + (setq list (delq ?- list) + list (nconc list (list (second list) ?- (second list) ?-)))) + (apply #'string list))) + ;;; subr.el ends here _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches
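As a rough illustration of what the docstring in this changeset describes, the sketch below passes a string containing a hyphen through `skip-chars-quote' before handing it to `skip-chars-forward'. It assumes an XEmacs build that includes this changeset; the buffer contents and the string "a-c" are made up for the example.

```elisp
;; Sketch only: assumes `skip-chars-quote' from this changeset is available.
(with-temp-buffer
  (insert "a-cb")
  (goto-char (point-min))
  ;; Unquoted, "a-c" is read as the range a..c, which matches `b' but not `-'.
  ;; Quoted, the intent is that only the literal characters a, - and c are
  ;; skipped, so point should stop in front of `b'.
  (skip-chars-forward (skip-chars-quote "a-c"))
  (point))
```

This is the same pattern the 4567 changeset applies when it wraps the 8-bit-fixed-query-skip-chars value in `skip-chars-quote'.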


commit: Merge my change of 2008-05-14 to the query-coding-region code. 16 years, 6 months

Aidan Kehoe

changeset: 4559:bd1a68c34d44b7e3a1c38dd107903988cde896d5 user: Aidan Kehoe <kehoea(a)parhasard.net> date: Wed May 21 21:49:19 2008 +0200 files: lisp/ChangeLog lisp/mule/mule-coding.el description: Merge my change of 2008-05-14 to the query-coding-region code. 2008-05-21 Aidan Kehoe <kehoea(a)parhasard.net> * mule/mule-coding.el (make-8-bit-choose-category): Merge my change of 2008-05-14 to the query-coding-region code. diff -r d9fcb5442c9553e6b09f073cd32d803377989218 -r bd1a68c34d44b7e3a1c38dd107903988cde896d5 lisp/ChangeLog --- a/lisp/ChangeLog Wed May 21 21:47:42 2008 +0200 +++ b/lisp/ChangeLog Wed May 21 21:49:19 2008 +0200 ＠＠ -1,3 +1,8 ＠＠ 2008-05-14 Stephen J. Turnbull <stephe +2008-05-21 Aidan Kehoe <kehoea(a)parhasard.net> + + * mule/mule-coding.el (make-8-bit-choose-category): + Merge my change of 2008-05-14 to the query-coding-region code. + 2008-05-14 Stephen J. Turnbull <stephen(a)xemacs.org> * subr.el (add-to-list): Fix Aidan's last commit. diff -r d9fcb5442c9553e6b09f073cd32d803377989218 -r bd1a68c34d44b7e3a1c38dd107903988cde896d5 lisp/mule/mule-coding.el --- a/lisp/mule/mule-coding.el Wed May 21 21:47:42 2008 +0200 +++ b/lisp/mule/mule-coding.el Wed May 21 21:49:19 2008 +0200 ＠＠ -531,7 +531,7 ＠＠ disk to XEmacs characters for some fixed (check-argument-range (length decode-table) #x100 #x100) (loop named category - for i from #x80 to #xBF + for i from #x80 to #x9F do (unless (= i (aref decode-table i)) (return-from category 'no-conversion)) finally return 'iso-8-1)) _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches
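The effect of narrowing the loop to #x80..#x9F can be sketched outside XEmacs with a plain vector of integers standing in for the decode table. The names `example-identity-table' and `example-8-bit-category' are hypothetical and not part of mule-coding.el; the real function works on a table of characters and uses `loop'.

```elisp
;; Sketch only: hypothetical helpers, integers stand in for characters.
(defun example-identity-table ()
  "Return a 256-element vector where each index maps to itself."
  (let ((table (make-vector 256 0))
        (i 0))
    (while (< i 256)
      (aset table i i)
      (setq i (1+ i)))
    table))

(defun example-8-bit-category (decode-table)
  "Return `iso-8-1' if DECODE-TABLE maps #x80..#x9F to themselves.
Otherwise return `no-conversion'."
  (let ((i #x80)
        (result 'iso-8-1))
    ;; Only the range #x80..#x9F is examined, as in the changeset.
    (while (and (<= i #x9F) (eq result 'iso-8-1))
      (unless (= i (aref decode-table i))
        (setq result 'no-conversion))
      (setq i (1+ i)))
    result))
```

With this range, a table that remaps a position such as #xA5 is still classified as `iso-8-1', whereas under the previous upper bound of #xBF it would have fallen back to `no-conversion'.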


commit: Add #'query-coding-clear-highlights. 16 years, 6 months

Aidan Kehoe

changeset: 4555:20c32e48923530d9f7b78a89f925a4c6cb93051b user: Aidan Kehoe <kehoea(a)parhasard.net> date: Sun May 11 19:50:10 2008 +0200 files: lisp/ChangeLog lisp/coding.el description: Add #'query-coding-clear-highlights. 2008-05-11 Aidan Kehoe <kehoea(a)parhasard.net> * coding.el (query-coding-clear-highlights): New function--clear any face information added by `query-coding-region'. (default-query-coding-region): Use it. diff -r 4953b7353349ac3293d419461fee6a71cf63c115 -r 20c32e48923530d9f7b78a89f925a4c6cb93051b lisp/ChangeLog --- a/lisp/ChangeLog Sat May 03 13:09:06 2008 +0200 +++ b/lisp/ChangeLog Sun May 11 19:50:10 2008 +0200 ＠＠ -1,3 +1,10 ＠＠ 2008-04-13 Henry S. Thompson <ht(a)inf.ed +2008-05-11 Aidan Kehoe <kehoea(a)parhasard.net> + + * coding.el (query-coding-clear-highlights): + New function--clear any face information added by + `query-coding-region'. + (default-query-coding-region): Use it. + 2008-04-13 Henry S. Thompson <ht(a)inf.ed.ac.uk>, Mike Sperber <mike(a)xemacs.org> * window-xemacs.el (save-window-excursion/mapping, diff -r 4953b7353349ac3293d419461fee6a71cf63c115 -r 20c32e48923530d9f7b78a89f925a4c6cb93051b lisp/coding.el --- a/lisp/coding.el Sat May 03 13:09:06 2008 +0200 +++ b/lisp/coding.el Sun May 11 19:50:10 2008 +0200 ＠＠ -286,6 +286,20 ＠＠ if does not differ from the encoded stri #s(hash-table test equal data ()) "A map from list of charsets to `skip-chars-forward' arguments for them.") +(defsubst query-coding-clear-highlights (begin end &optional buffer) + "Remove extent faces added by `query-coding-region' between BEGIN and END. + +Optional argument BUFFER is the buffer to use, and defaults to the current +buffer. + +The HIGHLIGHTP argument to `query-coding-region' indicates that it should +display unencodable characters using `query-coding-warning-face'. After +this function has been called, this will no longer be the case. 
" + (map-extents #'(lambda (extent ignored-arg) + (when (eq 'query-coding-warning-face + (extent-face extent)) + (delete-extent extent))) buffer begin end)) + (defun default-query-coding-region (begin end coding-system &optional buffer errorp highlightp) "The default `query-coding-region' implementation. ＠＠ -319,10 +333,7 ＠＠ addition, characters that can be safely safe-charsets "") default-query-coding-region-safe-charset-skip-chars-map))) (when highlightp - (map-extents #'(lambda (extent ignored-arg) - (when (eq 'query-coding-warning-face - (extent-face extent)) - (delete-extent extent))) buffer begin end)) + (query-coding-clear-highlights begin end buffer)) (if (and (zerop (length skip-chars-arg)) (null safe-chars)) (progn ;; Uh-oh, nothing known about this coding system. Fail. ＠＠ -384,7 +395,7 ＠＠ addition, characters that can be safely (values nil ranges) (values t nil)))))) -(defsubst query-coding-region (start end coding-system &optional buffer +(defun query-coding-region (start end coding-system &optional buffer errorp highlight) "Work out whether CODING-SYSTEM can losslessly encode a region. _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches

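A minimal sketch of how the new function might be paired with `query-coding-region', assuming coding.el from this changeset is loaded and that a nil BUFFER argument means the current buffer (the docstring of `query-coding-clear-highlights' says so; for `query-coding-region' this is an assumption). The coding system name is taken from the tests elsewhere in this archive.

```elisp
;; Sketch only: assumes coding.el from this changeset is loaded.
;; Ask for unencodable characters in the current buffer to be highlighted
;; (sixth argument, HIGHLIGHT, non-nil) ...
(query-coding-region (point-min) (point-max) 'iso-8859-1-unix nil nil t)
;; ... and later remove the `query-coding-warning-face' extents again.
(query-coding-clear-highlights (point-min) (point-max))
```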

commit: Correct a docstring 16 years, 6 months

Aidan Kehoe

changeset: 4553:75654496fa0e8f757e275d6fe1e64c995636d438 user: Aidan Kehoe <kehoea(a)parhasard.net> date: Sat May 03 13:08:54 2008 +0200 files: lisp/coding.el description: Correct a docstring diff -r 9c1cfceab252316754e6e1e3ee74b1bf2d453443 -r 75654496fa0e8f757e275d6fe1e64c995636d438 lisp/coding.el --- a/lisp/coding.el Thu Mar 13 10:24:34 2008 +0100 +++ b/lisp/coding.el Sat May 03 13:08:54 2008 +0200 ＠＠ -398,7 +398,7 ＠＠ Optional argument HIGHLIGHT says to disp Optional argument HIGHLIGHT says to display unencodable characters in the region using `query-coding-warning-face'. It defaults to nil. -This function returns a list; the intention is that callers use use +This function returns a list; the intention is that callers use `multiple-value-bind' or the related CL multiple value functions to deal with it. The first element is `t' if the string can be encoded using CODING-SYSTEM, or `nil' if not. The second element is `nil' if the string _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches
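The calling convention the corrected docstring describes can be sketched as follows. This assumes an XEmacs with `query-coding-region' and `multiple-value-bind' available; the variable names and messages are illustrative only.

```elisp
;; Sketch only: assumes `query-coding-region' from coding.el is available.
(multiple-value-bind (succeeded ranges)
    (query-coding-region (point-min) (point-max) 'iso-8859-1-unix)
  (if succeeded
      ;; First value is t: the whole region can be encoded.
      (message "Region can be encoded")
    ;; First value is nil: the second value is a range table
    ;; describing the unencodable positions.
    (message "Unencodable ranges: %S" ranges)))
```

The tests in tests/automated/query-coding-tests.el use the same binding pattern before asserting on the two values.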


commit: Fix some bugs. 16 years, 6 months

Aidan Kehoe

changeset: 4551:6812571bfcb938954b94aa30c3013e8c02209dea user: Aidan Kehoe <kehoea(a)parhasard.net> date: Thu Mar 13 10:21:01 2008 +0100 files: lisp/coding.el lisp/mule/mule-coding.el lisp/unicode.el description: Fix some bugs. diff -r 1217f19ce196e19eacbf0bf10bb7a3c0e7031b9f -r 6812571bfcb938954b94aa30c3013e8c02209dea lisp/coding.el --- a/lisp/coding.el Mon Jan 21 22:54:43 2008 +0100 +++ b/lisp/coding.el Thu Mar 13 10:21:01 2008 +0100 ＠＠ -299,8 +299,13 ＠＠ addition, characters that can be safely (check-argument-type #'integer-or-marker-p begin) (check-argument-type #'integer-or-marker-p end) (let* ((safe-charsets - (coding-system-get coding-system 'safe-charsets)) - (safe-chars (coding-system-get coding-system 'safe-chars)) + (or (coding-system-get coding-system 'safe-charsets) + (coding-system-get (coding-system-base coding-system) + 'safe-charsets))) + (safe-chars + (or (coding-system-get coding-system 'safe-chars) + (coding-system-get (coding-system-base coding-system) + 'safe-chars))) (skip-chars-arg (gethash safe-charsets default-query-coding-region-safe-charset-skip-chars-map)) ＠＠ -313,6 +318,11 ＠＠ addition, characters that can be safely (mapconcat #'charset-skip-chars-string safe-charsets "") default-query-coding-region-safe-charset-skip-chars-map))) + (when highlightp + (map-extents #'(lambda (extent ignored-arg) + (when (eq 'query-coding-warning-face + (extent-face extent)) + (delete-extent extent))) buffer begin end)) (if (and (zerop (length skip-chars-arg)) (null safe-chars)) (progn ;; Uh-oh, nothing known about this coding system. Fail. 
diff -r 1217f19ce196e19eacbf0bf10bb7a3c0e7031b9f -r 6812571bfcb938954b94aa30c3013e8c02209dea lisp/mule/mule-coding.el --- a/lisp/mule/mule-coding.el Mon Jan 21 22:54:43 2008 +0100 +++ b/lisp/mule/mule-coding.el Thu Mar 13 10:21:01 2008 +0100 ＠＠ -553,15 +553,24 ＠＠ See that the documentation of `query-cod (check-argument-type #'integer-or-marker-p begin) (check-argument-type #'integer-or-marker-p end) (let ((from-unicode - (coding-system-get coding-system '8-bit-fixed-query-from-unicode)) + (or (coding-system-get coding-system '8-bit-fixed-query-from-unicode) + (coding-system-get (coding-system-base coding-system) + '8-bit-fixed-query-from-unicode))) (skip-chars-arg - (coding-system-get coding-system '8-bit-fixed-query-skip-chars)) + (or (coding-system-get coding-system '8-bit-fixed-query-skip-chars) + (coding-system-get (coding-system-base coding-system) + '8-bit-fixed-query-skip-chars))) (ranges (make-range-table)) char-after fail-range-start fail-range-end previous-fail extent failed) (check-type from-unicode hash-table) (check-type skip-chars-arg string) (save-excursion + (when highlightp + (map-extents #'(lambda (extent ignored-arg) + (when (eq 'query-coding-warning-face + (extent-face extent)) + (delete-extent extent))) buffer begin end)) (goto-char begin buffer) (skip-chars-forward skip-chars-arg end buffer) (while (< (point buffer) end) ＠＠ -588,7 +597,7 ＠＠ See that the documentation of `query-cod (when errorp (error 'text-conversion-error (format "Cannot encode %s using coding system" - (buffer-substring fail-range-start (point buffeR) + (buffer-substring fail-range-start (point buffer) buffer)) (coding-system-name coding-system))) (put-range-table fail-range-start ＠＠ -603,8 +612,8 ＠＠ See that the documentation of `query-cod (setq extent (make-extent fail-range-start fail-range-end buffer)) (set-extent-priority extent (+ mouse-highlight-priority 2)) (set-extent-face extent 'query-coding-warning-face)) - (skip-chars-forward skip-chars-arg end buffer)) - 
(message "about to give the result, ranges %S" ranges)) + (skip-chars-forward skip-chars-arg end buffer))) + (message "about to give the result, ranges %S" ranges) (if failed (values nil ranges) (values t nil))))) diff -r 1217f19ce196e19eacbf0bf10bb7a3c0e7031b9f -r 6812571bfcb938954b94aa30c3013e8c02209dea lisp/unicode.el --- a/lisp/unicode.el Mon Jan 21 22:54:43 2008 +0100 +++ b/lisp/unicode.el Thu Mar 13 10:21:01 2008 +0100 ＠＠ -624,15 +624,20 ＠＠ mapping from the error sequences to the (let* ((skip-chars-arg unicode-query-coding-skip-chars-arg) (ranges (make-range-table)) (looking-at-arg (concat "[" skip-chars-arg "]")) - fail-range-start fail-range-end previous-fail char-after - failed extent) + fail-range-start fail-range-end previous-fail char-after failed + extent) (save-excursion + (when highlightp + (map-extents #'(lambda (extent ignored-arg) + (when (eq 'query-coding-warning-face + (extent-face extent)) + (delete-extent extent))) buffer begin end)) (goto-char begin buffer) (skip-chars-forward skip-chars-arg end buffer) (while (< (point buffer) end) - (message - "fail-range-start is %S, previous-fail %S, point is %S, end is %S" - fail-range-start previous-fail (point buffer) end) +; (message +; "fail-range-start is %S, previous-fail %S, point is %S, end is %S" +; fail-range-start previous-fail (point buffer) end) (setq char-after (char-after (point buffer) buffer) fail-range-start (point buffer)) (while (and ＠＠ -646,7 +651,7 ＠＠ mapping from the error sequences to the (if (= fail-range-start (point buffer)) ;; The character can actually be encoded by the coding ;; system; check the characters past it. - (forward-char 1 buffer) + (forward-char 1 buffer) ;; Can't be encoded; note this. (when errorp (error 'text-conversion-error _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches


commit: First part of interactive checks that coding systems encode regions. 16 years, 6 months

Aidan Kehoe

changeset: 4549:68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 parent: 4402:e70cc8a90e9043075a0e40fe9587d2c931fb618e user: Aidan Kehoe <kehoea(a)parhasard.net> date: Mon Jan 21 22:51:21 2008 +0100 files: lisp/ChangeLog lisp/coding.el lisp/mule/general-late.el lisp/mule/mule-charset.el lisp/mule/mule-coding.el lisp/unicode.el description: First part of interactive checks that coding systems encode regions. 2008-01-21 Aidan Kehoe <kehoea(a)parhasard.net> * coding.el (decode-coding-string): (encode-coding-string): Accept GNU's NOCOPY argument for these. Todo; write compiler macros to use it. (query-coding-warning-face): New face, to show unencodable characters. (default-query-coding-region-safe-charset-skip-chars-map): New variable, a cache used by #'default-query-coding-region. (default-query-coding-region): Default implementation of #'query-coding-region, using the safe-charsets and safe-chars coding systemproperties. (query-coding-region): New function; can a given coding system encode a given region? (query-coding-string): New function; can a given coding system encode a given string? (unencodable-char-position): Function API taken from GNU; return the first unencodable position given a string and coding system. (encode-coding-char): Function API taken from GNU; return CHAR encoded using CODING-SYSTEM, or nil if CODING-SYSTEM would trash CHAR. ((unless (featurep 'mule)): Override the default query-coding-region implementation on non-Mule. * mule/mule-coding.el (make-8-bit-generate-helper): Eliminate a duplicate comment. (make-8-bit-choose-category): Simplify implementation. (8-bit-fixed-query-coding-region): Implementation of #'query-coding-region for coding systems created with #'make-8-bit-coding-system. (make-8-bit-coding-system): Initialise the #'query-coding-region implementation for these character sets. (make-8-bit-coding-system): Ditto for the compiler macro version of this function. 
* unicode.el (unicode-query-coding-skip-chars-arg): New variable, used by unicode-query-coding-region, initialised in mule/general-late.el. (unicode-query-coding-region): New function, the #'query-coding-region implementation for Unicode coding systems. Initialise the query-coding-function property for the Unicode coding systems to #'unicode-query-coding-region. * mule/mule-charset.el (charset-skip-chars-string): New function. Return a #'skip-chars-forward argument that skips all characters in CHARSET. (map-charset-chars): Function synced from GNU, modified to work with XEmacs. Map FUNC across the int value charset ranges of CHARSET. diff -r e70cc8a90e9043075a0e40fe9587d2c931fb618e -r 68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 lisp/ChangeLog --- a/lisp/ChangeLog Thu Jan 17 11:55:11 2008 +0100 +++ b/lisp/ChangeLog Mon Jan 21 22:51:21 2008 +0100 ＠＠ -1,3 +1,50 ＠＠ 2008-01-17 Mike Sperber <mike(a)xemacs.o +2008-01-21 Aidan Kehoe <kehoea(a)parhasard.net> + + * coding.el (decode-coding-string): + (encode-coding-string): Accept GNU's NOCOPY argument for + these. Todo; write compiler macros to use it. + (query-coding-warning-face): New face, to show unencodable + characters. + (default-query-coding-region-safe-charset-skip-chars-map): + New variable, a cache used by #'default-query-coding-region. + (default-query-coding-region): Default implementation of + #'query-coding-region, using the safe-charsets and safe-chars + coding systemproperties. + (query-coding-region): New function; can a given coding system + encode a given region? + (query-coding-string): New function; can a given coding system + encode a given string? + (unencodable-char-position): Function API taken from GNU; return + the first unencodable position given a string and coding system. + (encode-coding-char): Function API taken from GNU; return CHAR + encoded using CODING-SYSTEM, or nil if CODING-SYSTEM would trash + CHAR. 
+ ((unless (featurep 'mule)): Override the default + query-coding-region implementation on non-Mule. + * mule/mule-coding.el (make-8-bit-generate-helper): Eliminate a + duplicate comment. + (make-8-bit-choose-category): Simplify implementation. + (8-bit-fixed-query-coding-region): Implementation of + #'query-coding-region for coding systems created with + #'make-8-bit-coding-system. + (make-8-bit-coding-system): Initialise the #'query-coding-region + implementation for these character sets. + (make-8-bit-coding-system): Ditto for the compiler macro version + of this function. + * unicode.el (unicode-query-coding-skip-chars-arg): New variable, + used by unicode-query-coding-region, initialised in + mule/general-late.el. + (unicode-query-coding-region): New function, the + #'query-coding-region implementation for Unicode coding systems. + Initialise the query-coding-function property for the Unicode + coding systems to #'unicode-query-coding-region. + * mule/mule-charset.el (charset-skip-chars-string): New + function. Return a #'skip-chars-forward argument that skips all + characters in CHARSET. + (map-charset-chars): Function synced from GNU, modified to work + with XEmacs. Map FUNC across the int value charset ranges of + CHARSET. + 2008-01-17 Mike Sperber <mike(a)xemacs.org> * files.el (insert-directory): Bind `coding-system-for-read' to diff -r e70cc8a90e9043075a0e40fe9587d2c931fb618e -r 68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 lisp/coding.el --- a/lisp/coding.el Thu Jan 17 11:55:11 2008 +0100 +++ b/lisp/coding.el Mon Jan 21 22:51:21 2008 +0100 ＠＠ -125,15 +125,20 ＠＠ from a Lisp program, use `detect-coding- (interactive "r\nP") (princ (detect-coding-region start end))) -(defun decode-coding-string (str coding-system) +(defun decode-coding-string (str coding-system &optional nocopy) "Decode the string STR which is encoded in CODING-SYSTEM. -Does not modify STR. Returns the decoded string on successful conversion." +Normally does not modify STR. 
Returns the decoded string on +successful conversion. +Optional argument NOCOPY says that modifying STR and returning it is +allowed." (with-string-as-buffer-contents str (decode-coding-region (point-min) (point-max) coding-system))) -(defun encode-coding-string (str coding-system) +(defun encode-coding-string (str coding-system &optional nocopy) "Encode the string STR using CODING-SYSTEM. -Does not modify STR. Returns the encoded string on successful conversion." +Does not modify STR. Returns the encoded string on successful conversion. +Optional argument NOCOPY says that the original string may be returned +if does not differ from the encoded string. " (with-string-as-buffer-contents str (encode-coding-region (point-min) (point-max) coding-system))) ＠＠ -274,4 +279,204 ＠＠ Does not modify STR. Returns the encode (make-compatible-variable 'enable-multibyte-characters "Unimplemented") +;; Sure would be nice to be able to use defface here. +(copy-face 'highlight 'query-coding-warning-face) + +(defvar default-query-coding-region-safe-charset-skip-chars-map + #s(hash-table test equal data ()) + "A map from list of charsets to `skip-chars-forward' arguments for them.") + +(defun default-query-coding-region (begin end coding-system + &optional buffer errorp highlightp) + "The default `query-coding-region' implementation. + +Uses the `safe-charsets' and `safe-chars' coding system properties. +The former is a list of XEmacs character sets that can be safely +encoded by CODING-SYSTEM; the latter a char table describing, in +addition, characters that can be safely encoded by CODING-SYSTEM." 
+ (check-argument-type #'coding-system-p + (setq coding-system (find-coding-system coding-system))) + (check-argument-type #'integer-or-marker-p begin) + (check-argument-type #'integer-or-marker-p end) + (let* ((safe-charsets + (coding-system-get coding-system 'safe-charsets)) + (safe-chars (coding-system-get coding-system 'safe-chars)) + (skip-chars-arg + (gethash safe-charsets + default-query-coding-region-safe-charset-skip-chars-map)) + (ranges (make-range-table)) + fail-range-start fail-range-end previous-fail char-after + looking-at-arg failed extent) + (unless skip-chars-arg + (setq skip-chars-arg + (puthash safe-charsets + (mapconcat #'charset-skip-chars-string + safe-charsets "") + default-query-coding-region-safe-charset-skip-chars-map))) + (if (and (zerop (length skip-chars-arg)) (null safe-chars)) + (progn + ;; Uh-oh, nothing known about this coding system. Fail. + (when errorp + (error 'text-conversion-error + "Coding system doesn't say what it can encode" + (coding-system-name coding-system))) + (put-range-table begin end t ranges) + (when highlightp + (setq extent (make-extent begin end buffer)) + (set-extent-priority extent (+ mouse-highlight-priority 2)) + (set-extent-face extent 'query-coding-warning-face)) + (values nil ranges)) + (setq looking-at-arg (if (equal "" skip-chars-arg) + ;; Regexp that will never match. 
+ #r".\{0,0\}" + (concat "[" skip-chars-arg "]"))) + (save-excursion + (goto-char begin buffer) + (skip-chars-forward skip-chars-arg end buffer) + (while (< (point buffer) end) + (message + "fail-range-start is %S, previous-fail %S, point is %S, end is %S" + fail-range-start previous-fail (point buffer) end) + (setq char-after (char-after (point buffer) buffer) + fail-range-start (point buffer)) + (while (and + (< (point buffer) end) + (not (looking-at looking-at-arg)) + (or (not safe-chars) + (not (get-char-table char-after safe-chars)))) + (forward-char 1 buffer) + (setq char-after (char-after (point buffer) buffer) + failed t)) + (if (= fail-range-start (point buffer)) + ;; The character can actually be encoded by the coding + ;; system; check the characters past it. + (forward-char 1 buffer) + ;; Can't be encoded; note this. + (when errorp + (error 'text-conversion-error + (format "Cannot encode %s using coding system" + (buffer-substring fail-range-start (point buffer) + buffer)) + (coding-system-name coding-system))) + (put-range-table fail-range-start + ;; If char-after is non-nil, we're not at + ;; the end of the buffer. + (setq fail-range-end (if char-after + (point buffer) + (point-max buffer))) + t ranges) + (when highlightp + (setq extent (make-extent fail-range-start fail-range-end buffer)) + (set-extent-priority extent (+ mouse-highlight-priority 2)) + (set-extent-face extent 'query-coding-warning-face))) + (skip-chars-forward skip-chars-arg end buffer)) + (if failed + (values nil ranges) + (values t nil)))))) + +(defsubst query-coding-region (start end coding-system &optional buffer + errorp highlight) + "Work out whether CODING-SYSTEM can losslessly encode a region. + +START and END are the beginning and end of the region to check. +CODING-SYSTEM is the coding system to try. + +Optional argument BUFFER is the buffer to check, and defaults to the current +buffer. 
Optional argument ERRORP says to signal a `text-conversion-error' +if some character in the region cannot be encoded, and defaults to nil. + +Optional argument HIGHLIGHT says to display unencodable characters in the +region using `query-coding-warning-face'. It defaults to nil. + +This function returns a list; the intention is that callers use +`multiple-value-bind' or the related CL multiple value functions to deal +with it. The first element is `t' if the string can be encoded using +CODING-SYSTEM, or `nil' if not. The second element is `nil' if the string +can be encoded using CODING-SYSTEM; otherwise, it is a range table +describing the positions of the unencodable characters. See +`make-range-table'." + (funcall (or (coding-system-get coding-system 'query-coding-function) + #'default-query-coding-region) + start end coding-system buffer errorp highlight)) + +(defun query-coding-string (string coding-system &optional errorp highlight) + "Work out whether CODING-SYSTEM can losslessly encode STRING. +CODING-SYSTEM is the coding system to check. + +Optional argument ERRORP says to signal a `text-conversion-error' if some +character in the region cannot be encoded, and defaults to nil. + +Optional argument HIGHLIGHT says to display unencodable characters in the +region using `query-coding-warning-face'. It defaults to nil. + +This function returns a list; the intention is that callers use +`multiple-value-bind' or the related CL multiple value functions to deal +with it. The first element is `t' if the string can be encoded using +CODING-SYSTEM, or `nil' if not. The second element is `nil' if the string +can be encoded using CODING-SYSTEM; otherwise, it is a range table +describing the positions of the unencodable characters. See +`make-range-table'." + (with-temp-buffer + (insert string) + (query-coding-region (point-min) (point-max) coding-system (current-buffer) + ;; ### Will highlight work here? 
+ errorp highlight))) + +(defun unencodable-char-position (start end coding-system + &optional count string) + "Return position of first un-encodable character in a region. +START and END specify the region and CODING-SYSTEM specifies the +encoding to check. Return nil if CODING-SYSTEM does encode the region. + +If optional 4th argument COUNT is non-nil, it specifies at most how +many un-encodable characters to search. In this case, the value is a +list of positions. + +If optional 5th argument STRING is non-nil, it is a string to search +for un-encodable characters. In that case, START and END are indexes +in the string." + (flet ((thunk () + (multiple-value-bind (result ranges) + (query-coding-region start end coding-system) + (if result + ;; If query-coding-region thinks the entire region is + ;; encodable, result will be t, and the thunk should + ;; return nil, because there are no unencodable + ;; positions in the region. + nil + (if count + (block counted + (map-range-table + #'(lambda (begin end value) + (while (and (<= begin end) (<= begin count)) + (push begin result) + (incf begin)) + (if (> begin count) (return-from counted))) + ranges)) + (map-range-table + #'(lambda (begin end value) + (while (<= begin end) + (push begin result) + (incf begin))) ranges)) + result)))) + (if string + (with-temp-buffer (insert string) (thunk)) + (thunk)))) + +(defun encode-coding-char (char coding-system) + "Encode CHAR by CODING-SYSTEM and return the resulting string. +If CODING-SYSTEM can't safely encode CHAR, return nil." + (check-argument-type #'characterp char) + (multiple-value-bind (succeededp) + (query-coding-string char coding-system) + (when succeededp + (encode-coding-string char coding-system)))) + +(unless (featurep 'mule) + ;; If we're under non-Mule, every XEmacs character can be encoded + ;; with every XEmacs coding system. 
+ (fset #'default-query-coding-region + #'(lambda (&rest ignored) (values t nil))) + (unintern 'default-query-coding-region-safe-charset-skip-chars-map)) + ;;; coding.el ends here diff -r e70cc8a90e9043075a0e40fe9587d2c931fb618e -r 68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 lisp/mule/general-late.el --- a/lisp/mule/general-late.el Thu Jan 17 11:55:11 2008 +0100 +++ b/lisp/mule/general-late.el Mon Jan 21 22:51:21 2008 +0100 ＠＠ -63,7 +63,34 ＠＠ (decode-coding-string Installation-string Installation-file-coding-system) - Installation-string)) + Installation-string) + + ;; Convince the byte compiler that, really, this file can't be encoded + ;; as binary. Ugh. + system-type (symbol-value (intern "\u0073ystem-type")) + + unicode-query-coding-skip-chars-arg + (eval-when-compile + (when-fboundp #'map-charset-chars + (loop + for charset in (charset-list) + with skip-chars-string = "" + do + (block no-ucs-mapping + (map-charset-chars + #'(lambda (begin end) + (loop + while (/= end begin) + do + (when (= -1 (char-to-unicode begin)) + (setq this-charset-works nil) + (return-from no-ucs-mapping)) + (setq begin (int-to-char (1+ begin))))) + charset) + (setq skip-chars-string + (concat skip-chars-string + (charset-skip-chars-string charset)))) + finally return skip-chars-string)))) ;; At this point in the dump, all the charsets have been loaded. Now, load ;; their Unicode mappings. diff -r e70cc8a90e9043075a0e40fe9587d2c931fb618e -r 68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 lisp/mule/mule-charset.el --- a/lisp/mule/mule-charset.el Thu Jan 17 11:55:11 2008 +0100 +++ b/lisp/mule/mule-charset.el Mon Jan 21 22:51:21 2008 +0100 ＠＠ -116,6 +116,65 ＠＠ See `make-charset'." (defun charset-bytes (charset) "Useless in XEmacs, returns 1." 1) + +(defun charset-skip-chars-string (charset) + "Given CHARSET, return a string suitable for for `skip-chars-forward'. +Passing the string to `skip-chars-forward' will cause it to skip all +characters in CHARSET." 
+ (setq charset (get-charset charset)) + (cond + ;; Aargh, the general algorithm doesn't work for these charsets, because + ;; make-char strips the high bit. Hard code them. + ((eq (find-charset 'ascii) charset) "\x00-\x7f") + ((eq (find-charset 'control-1) charset) "\x80-\x9f") + (t + (let (charset-lower charset-upper row-upper row-lower) + (if (= 1 (charset-dimension charset)) + (condition-case args-out-of-range + (make-char charset #x100) + (args-out-of-range + (setq charset-lower (third args-out-of-range) + charset-upper (fourth args-out-of-range)) + (format "%c-%c" + (make-char charset charset-lower) + (make-char charset charset-upper)))) + (condition-case args-out-of-range + (make-char charset #x100 #x22) + (args-out-of-range + (setq row-lower (third args-out-of-range) + row-upper (fourth args-out-of-range)))) + (condition-case args-out-of-range + (make-char charset #x22 #x100) + (args-out-of-range + (setq charset-lower (third args-out-of-range) + charset-upper (fourth args-out-of-range)))) + (format "%c-%c" + (make-char charset row-lower charset-lower) + (make-char charset row-upper charset-upper))))))) +;; From GNU. +(defun map-charset-chars (func charset) + "Use FUNC to map over all characters in CHARSET for side effects. +FUNC is a function of two args, the start and end (inclusive) of a +character code range. Thus FUNC should iterate over [START, END]." 
+ (check-argument-type #'functionp func) + (check-argument-type #'charsetp (setq charset (find-charset charset))) + (let* ((dim (charset-dimension charset)) + (chars (charset-chars charset)) + (start (if (= chars 94) + 33 + 32))) + (if (= dim 1) + (cond + ((eq (find-charset 'ascii) charset) (funcall func ?\x00 ?\x7f)) + ((eq (find-charset 'control-1) charset) (funcall func ?\x80 ?\x9f)) + (t + (funcall func + (make-char charset start) + (make-char charset (+ start chars -1))))) + (dotimes (i chars) + (funcall func + (make-char charset (+ i start) start) + (make-char charset (+ i start) (+ start chars -1))))))) ;;;; Define setf methods for all settable Charset properties diff -r e70cc8a90e9043075a0e40fe9587d2c931fb618e -r 68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 lisp/mule/mule-coding.el --- a/lisp/mule/mule-coding.el Thu Jan 17 11:55:11 2008 +0100 +++ b/lisp/mule/mule-coding.el Mon Jan 21 22:51:21 2008 +0100 ＠＠ -238,8 +238,6 ＠＠ returns a list corresponding to such a c (if (r0 == ,(charset-id 'ascii)) (write r1) ((if (r0 == #xABAB) - ;; #xBFFE is a sentinel in the compiled - ;; program. ;; #xBFFE is a sentinel in the compiled ;; program. ((r0 = r1 & #x7F) ＠＠ -531,12 +529,85 ＠＠ disk to XEmacs characters for some fixed disk to XEmacs characters for some fixed-width 8-bit coding system. " (check-argument-type #'vectorp decode-table) (check-argument-range (length decode-table) #x100 #x100) - (block category - (loop - for i from #x80 to #xBF - do (unless (= i (aref decode-table i)) - (return-from category 'no-conversion))) - 'iso-8-1)) + (loop + named category + for i from #x80 to #xBF + do (unless (= i (aref decode-table i)) + (return-from category 'no-conversion)) + finally return 'iso-8-1)) + +(defun 8-bit-fixed-query-coding-region (begin end coding-system + &optional buffer errorp highlightp) + "The `query-coding-region' implementation for 8-bit-fixed coding systems. 
+ +Uses the `8-bit-fixed-query-from-unicode' and `8-bit-fixed-query-skip-chars' +coding system properties. The former is a hash table mapping from valid +Unicode code points to on-disk octets in the coding system; the latter a set +of characters as used by `skip-chars-forward'. Both of these properties are +generated automatically by `make-8-bit-coding-system'. + +See the documentation of `query-coding-region'; see also +`make-8-bit-coding-system'. " + (check-argument-type #'coding-system-p + (setq coding-system (find-coding-system coding-system))) + (check-argument-type #'integer-or-marker-p begin) + (check-argument-type #'integer-or-marker-p end) + (let ((from-unicode + (coding-system-get coding-system '8-bit-fixed-query-from-unicode)) + (skip-chars-arg + (coding-system-get coding-system '8-bit-fixed-query-skip-chars)) + (ranges (make-range-table)) + char-after fail-range-start fail-range-end previous-fail extent + failed) + (check-type from-unicode hash-table) + (check-type skip-chars-arg string) + (save-excursion + (goto-char begin buffer) + (skip-chars-forward skip-chars-arg end buffer) + (while (< (point buffer) end) + (message + "fail-range-start is %S, previous-fail %S, point is %S, end is %S" + fail-range-start previous-fail (point buffer) end) + (setq char-after (char-after (point buffer) buffer) + fail-range-start (point buffer)) + (message "arguments are %S %S" + (< (point buffer) end) + (not (gethash (encode-char char-after 'ucs) from-unicode))) + (while (and + (< (point buffer) end) + (not (gethash (encode-char char-after 'ucs) from-unicode))) + (forward-char 1 buffer) + (setq char-after (char-after (point buffer) buffer) + failed t)) + (if (= fail-range-start (point buffer)) + ;; The character can actually be encoded by the coding + ;; system; check the characters past it. + (forward-char 1 buffer) + ;; The character actually failed. 
+ (message "past the move through, point now %S" (point buffer)) + (when errorp + (error 'text-conversion-error + (format "Cannot encode %s using coding system" + (buffer-substring fail-range-start (point buffer) + buffer)) + (coding-system-name coding-system))) + (put-range-table fail-range-start + ;; If char-after is non-nil, we're not at + ;; the end of the buffer. + (setq fail-range-end (if char-after + (point buffer) + (point-max buffer))) + t ranges) + (when highlightp + (message "highlighting") + (setq extent (make-extent fail-range-start fail-range-end buffer)) + (set-extent-priority extent (+ mouse-highlight-priority 2)) + (set-extent-face extent 'query-coding-warning-face)) + (skip-chars-forward skip-chars-arg end buffer)) + (message "about to give the result, ranges %S" ranges)) + (if failed + (values nil ranges) + (values t nil))))) ;;;###autoload (defun make-8-bit-coding-system (name unicode-map &optional description props) ＠＠ -618,13 +689,27 ＠＠ the code for tilde `~'. " (coding-system-put name '8-bit-fixed t) (coding-system-put name 'category (make-8-bit-choose-category decode-table)) + (coding-system-put name '8-bit-fixed-query-skip-chars + (apply #'string (append decode-table nil))) + (coding-system-put name '8-bit-fixed-query-from-unicode encode-table) + + (coding-system-put name 'query-coding-function + #'8-bit-fixed-query-coding-region) + (coding-system-put (intern (format "%s-unix" name)) + 'query-coding-function + #'8-bit-fixed-query-coding-region) + (coding-system-put (intern (format "%s-dos" name)) + 'query-coding-function + #'8-bit-fixed-query-coding-region) + (coding-system-put (intern (format "%s-mac" name)) + 'query-coding-function + #'8-bit-fixed-query-coding-region) (loop for alias in aliases do (define-coding-system-alias alias name)) result)) (define-compiler-macro make-8-bit-coding-system (&whole form name unicode-map &optional description props) - ;; We provide the compiler macro (= macro that is expanded only on ;; compilation, and 
that can punt to a runtime version of the ;; associate function if necessary) not for reasons of speed, though ＠＠ -674,8 +759,9 ＠＠ the code for tilde `~'. " ;; (invalid-read-syntax "Multiply defined symbol label" 1) ;; ;; when the file is byte compiled. - (case-fold-search t)) - (define-translation-hash-table encode-table-sym ,encode-table) + (case-fold-search t) + (encode-table ,encode-table)) + (define-translation-hash-table encode-table-sym encode-table) (make-coding-system ',name 'ccl ,description (plist-put (plist-put ',props 'decode ＠＠ -688,8 +774,22 ＠＠ the code for tilde `~'. " (symbol-value 'encode-table-sym))) ',encode-program)))) (coding-system-put ',name '8-bit-fixed t) - (coding-system-put ',name 'category ', - (make-8-bit-choose-category decode-table)) + (coding-system-put ',name 'category + ',(make-8-bit-choose-category decode-table)) + (coding-system-put ',name '8-bit-fixed-query-skip-chars + ',(apply #'string (append decode-table nil))) + (coding-system-put ',name '8-bit-fixed-query-from-unicode encode-table) + (coding-system-put ',name 'query-coding-function + #'8-bit-fixed-query-coding-region) + (coding-system-put ',(intern (format "%s-unix" name)) + 'query-coding-function + #'8-bit-fixed-query-coding-region) + (coding-system-put ',(intern (format "%s-dos" name)) + 'query-coding-function + #'8-bit-fixed-query-coding-region) + (coding-system-put ',(intern (format "%s-mac" name)) + 'query-coding-function + #'8-bit-fixed-query-coding-region) ,(macroexpand `(loop for alias in ',aliases do (define-coding-system-alias alias ',name))) ＠＠ -703,4 +803,3 ＠＠ the code for tilde `~'. " '(mnemonic "Latin 1" documentation "The most used encoding of Western Europe and the Americas." 
aliases (iso-latin-1 latin-1))) - diff -r e70cc8a90e9043075a0e40fe9587d2c931fb618e -r 68d1ca56cffadada6a0026664ef87ae7a5b1a0b8 lisp/unicode.el --- a/lisp/unicode.el Thu Jan 17 11:55:11 2008 +0100 +++ b/lisp/unicode.el Mon Jan 21 22:51:21 2008 +0100 ＠＠ -611,6 +611,71 ＠＠ mapping from the error sequences to the (translate-region start finish table)) begin end buffer)) +(defvar unicode-query-coding-skip-chars-arg nil ;; Set in general-late.el + "Used by `unicode-query-coding-region' to skip chars with known mappings.") + +(defun unicode-query-coding-region (begin end coding-system + &optional buffer errorp highlightp) + "The `query-coding-region' implementation for Unicode coding systems." + (check-argument-type #'coding-system-p + (setq coding-system (find-coding-system coding-system))) + (check-argument-type #'integer-or-marker-p begin) + (check-argument-type #'integer-or-marker-p end) + (let* ((skip-chars-arg unicode-query-coding-skip-chars-arg) + (ranges (make-range-table)) + (looking-at-arg (concat "[" skip-chars-arg "]")) + fail-range-start fail-range-end previous-fail char-after + failed extent) + (save-excursion + (goto-char begin buffer) + (skip-chars-forward skip-chars-arg end buffer) + (while (< (point buffer) end) + (message + "fail-range-start is %S, previous-fail %S, point is %S, end is %S" + fail-range-start previous-fail (point buffer) end) + (setq char-after (char-after (point buffer) buffer) + fail-range-start (point buffer)) + (while (and + (< (point buffer) end) + (not (looking-at looking-at-arg)) + (= -1 (char-to-unicode char-after))) + (forward-char 1 buffer) + (message "what?!?") + (setq char-after (char-after (point buffer) buffer) + failed t)) + (if (= fail-range-start (point buffer)) + ;; The character can actually be encoded by the coding + ;; system; check the characters past it. + (forward-char 1 buffer) + ;; Can't be encoded; note this. 
+ (when errorp + (error 'text-conversion-error + (format "Cannot encode %s using coding system" + (buffer-substring fail-range-start (point buffer) + buffer)) + (coding-system-name coding-system))) + (put-range-table fail-range-start + ;; If char-after is non-nil, we're not at + ;; the end of the buffer. + (setq fail-range-end (if char-after + (point buffer) + (point-max buffer))) + t ranges) + (when highlightp + (setq extent (make-extent fail-range-start fail-range-end buffer)) + (set-extent-priority extent (+ mouse-highlight-priority 2)) + (set-extent-face extent 'query-coding-warning-face))) + (skip-chars-forward skip-chars-arg end buffer)) + (if failed + (values nil ranges) + (values t nil))))) + +(loop + for coding-system in (coding-system-list) + do (when (eq 'unicode (coding-system-type coding-system)) + (coding-system-put coding-system 'query-coding-function + #'unicode-query-coding-region))) + (unless (featurep 'mule) ;; We do this in such a roundabout way--instead of having the above defun ;; and defvar calls inside a (when (featurep 'mule) ...) form--to have _______________________________________________ XEmacs-Patches mailing list XEmacs-Patches(a)xemacs.org http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-patches


commit: Call XFreeFontPath appropriately in #'x-get-font-path. 16 years, 6 months

Aidan Kehoe

changeset: 4548:b0d2ace4aed140960eda2798889176526a6d4f8e
tag: tip
user: Aidan Kehoe <kehoea(a)parhasard.net>
date: Tue Dec 30 20:01:14 2008 +0000
files: src/ChangeLog src/device-x.c
description:
Call XFreeFontPath appropriately in #'x-get-font-path.

src/ChangeLog addition:

2008-12-30 Aidan Kehoe <kehoea(a)parhasard.net>

	* device-x.c (Fx_get_font_path):
	Free the font path once we're finished with it.

diff -r ab9e8f0fb295d0d0a41062759eff5d77be652c64 -r b0d2ace4aed140960eda2798889176526a6d4f8e src/ChangeLog
--- a/src/ChangeLog Tue Dec 30 16:09:59 2008 +0000
+++ b/src/ChangeLog Tue Dec 30 20:01:14 2008 +0000
＠＠ -1,3 +1,8 ＠＠ 2008-12-27 Aidan Kehoe <kehoea＠parhasa
+2008-12-30 Aidan Kehoe <kehoea(a)parhasard.net>
+
+	* device-x.c (Fx_get_font_path):
+	Free the font path once we're finished with it.
+
 2008-12-27 Aidan Kehoe <kehoea(a)parhasard.net>

	* doc.c (Fbuilt_in_symbol_file):
diff -r ab9e8f0fb295d0d0a41062759eff5d77be652c64 -r b0d2ace4aed140960eda2798889176526a6d4f8e src/device-x.c
--- a/src/device-x.c Tue Dec 30 16:09:59 2008 +0000
+++ b/src/device-x.c Tue Dec 30 20:01:14 2008 +0000
＠＠ -2060,6 +2060,8 ＠＠ See also `x-set-font-path'.
 Qfile_name), font_path);

+  XFreeFontPath ((char **)directories);
+
   return font_path;
 }


carbon2-commit: Improve glyph image format handling on Carbon. 16 years, 6 months

Aidan Kehoe

changeset: 4586:ce1f49c863bdc7c1806d7c72986cf2387d9ed224
tag: tip
user: Aidan Kehoe <kehoea(a)parhasard.net>
date: Tue Dec 30 19:04:00 2008 +0000
files: lisp/ChangeLog lisp/dumped-lisp.el lisp/glyphs.el
description:
Improve glyph image format handling on Carbon.

lisp/ChangeLog addition:

2008-12-30 Aidan Kehoe <kehoea(a)parhasard.net>

	* glyphs.el (init-glyphs):
	When setting the xemacs-logo and frame-icon-glyph glyph images,
	use 'win, not 'x for TAG-SET. This makes it work on Carbon (&c.)
	too.
	* carbon-glyphs.el: Remove, use the code on the model of the other
	platforms in glyphs.el instead.
	* dumped-lisp.el (preloaded-file-list): Remove carbon-glyphs.

diff -r 8cffd51dbd8f531d5a15d2f4b4e32ee2d6957d89 -r ce1f49c863bdc7c1806d7c72986cf2387d9ed224 lisp/ChangeLog
--- a/lisp/ChangeLog Tue Dec 30 18:47:13 2008 +0000
+++ b/lisp/ChangeLog Tue Dec 30 19:04:00 2008 +0000
＠＠ -1,3 +1,13 ＠＠ 2008-12-30 Aidan Kehoe <kehoea＠parhasa
+2008-12-30 Aidan Kehoe <kehoea(a)parhasard.net>
+
+	* glyphs.el (init-glyphs):
+	When setting the xemacs-logo and frame-icon-glyph glyph images,
+	use 'win, not 'x for TAG-SET. This makes it work on Carbon (&c.)
+	too.
+	* carbon-glyphs.el: Remove, use the code on the model of the other
+	platforms in glyphs.el instead.
+	* dumped-lisp.el (preloaded-file-list): Remove carbon-glyphs.
+
 2008-12-30 Aidan Kehoe <kehoea(a)parhasard.net>

	* make-docfile.el:
diff -r 8cffd51dbd8f531d5a15d2f4b4e32ee2d6957d89 -r ce1f49c863bdc7c1806d7c72986cf2387d9ed224 lisp/dumped-lisp.el
--- a/lisp/dumped-lisp.el Tue Dec 30 18:47:13 2008 +0000
+++ b/lisp/dumped-lisp.el Tue Dec 30 19:04:00 2008 +0000
＠＠ -273,8 +273,7 ＠＠ in dumped-lisp.el and is not itself list
 (when (featurep 'tty) "tty-init")
 ;; preload the Carbon code.
 (when (featurep 'carbon)
-  '("carbon-glyphs"
-    "carbon-toolbar"
+  '("carbon-toolbar"
     "carbon-init"))
 ;;; Formerly in tooltalk/tooltalk-load.el
 ;; Moved to tooltalk package
diff -r 8cffd51dbd8f531d5a15d2f4b4e32ee2d6957d89 -r ce1f49c863bdc7c1806d7c72986cf2387d9ed224 lisp/glyphs.el
--- a/lisp/glyphs.el Tue Dec 30 18:47:13 2008 +0000
+++ b/lisp/glyphs.el Tue Dec 30 19:04:00 2008 +0000
＠＠ -1266,15 +1266,15 ＠＠ If unspecified in a particular domain, `
 (if emacs-beta-version
     "xemacs-beta.xpm"
   "xemacs.xpm"))
- 'global 'x))
+ 'global 'win))
 (cond ((featurep 'xpm)
        (set-glyph-image frame-icon-glyph
                         (concat "../etc/" "xemacs-icon.xpm")
-                        'global 'x))
+                        'global 'win))
       ((featurep 'x)
        (set-glyph-image frame-icon-glyph
                         (concat "../etc/" "xemacs-icon2.xbm")
-                        'global 'x)))
+                        'global 'win)))
 (if (featurep 'tty)
     (set-glyph-image xemacs-logo


