On the nineteenth of December, Aidan Kehoe wrote:
> My strategy w.r.t. syntax tables was wrong in that patch. What I
> should have done was copied the syntax from the corresponding Latin 1
> character; and the same should be done for most of the characters in
> european.el.
The patch below does that (sorry, no ChangeLog yet). I intend to move the
content of european.el to latin.el, as the comment suggests, before
committing it.
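The idea in the quoted paragraph, taking a character's syntax from its Latin-1 counterpart rather than choosing it by hand, can be sketched roughly as follows. This is an illustration only: the helper name `sketch-copy-latin-1-syntax` is hypothetical and not part of the patch, and the code assumes a Mule build in which `make-char` accepts the charset names used in cyrillic.el.

```elisp
;; Sketch only; assumes a Mule-enabled build where `make-char' accepts
;; charset names such as 'latin-iso8859-1 and 'cyrillic-iso8859-5.
(defun sketch-copy-latin-1-syntax (charset code &optional latin-1-code)
  "Give CODE in CHARSET the syntax class of LATIN-1-CODE in Latin-1.
LATIN-1-CODE defaults to CODE.  Hypothetical helper, not part of the patch."
  (modify-syntax-entry
   (make-char charset code)
   ;; `char-syntax' reads the current buffer's syntax table here.
   (string (char-syntax (make-char 'latin-iso8859-1 (or latin-1-code code))))
   (standard-syntax-table)))

;; SECTION SIGN sits at #xFD in ISO 8859-5 but at #xA7 in Latin-1.
(sketch-copy-latin-1-syntax 'cyrillic-iso8859-5 #xFD #xA7)
```

Keeping the two code points as separate arguments covers the cases where a character exists in both sets but at different positions.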
Okay, the below moves the content of european.el to latin.el and deletes the
former. It also deals with sundry other language and Mule-related issues: it
updates the Unicode mapping for ISO 8859-7, moves dump-time loading of
Unicode tables to mule/general-late.el, enables broken case support for
Turkish, makes language environment docstrings for auto-generated language
environments more accurate, sets the native-coding-system language
environment property for many more languages, removes the IPA language
environment, and fixes a bug where the 'type property of Unicode coding
systems, as read through coding-system-property, was being overridden.
I’m flying to Ireland in the morning; I’ll get to committing this, in the
event that there are no objections, on the 29th or so. There’s a bzip2’d
version of the patch at the end, to forestall any character encoding
problems with the version included in this mail.
etc/ChangeLog addition:
2006-12-21 Aidan Kehoe <kehoea(a)parhasard.net>
* unicode/unicode-consortium/8859-7.TXT:
Update the mapping to the 2003 version of ISO 8859-7.
lisp/ChangeLog addition:
2006-12-21 Aidan Kehoe <kehoea(a)parhasard.net>
* mule/cyrillic.el:
* mule/cyrillic.el (iso-8859-5):
* mule/cyrillic.el (cyrillic-koi8-r-encode-table):
Add syntax, case support for Cyrillic; make some parentheses more
Lispy.
* mule/european.el:
Content moved to latin.el, file deleted.
* mule/general-late.el:
If Unicode tables are to be loaded at dump time, do it here, not
in loadup.el.
* mule/greek.el:
Add syntax, case support for Greek.
* mule/latin.el:
Move the content of european.el here. Change the case table
mappings to use hexadecimal codes, to make cross reference to the
standards easier. In all cases, take character syntax from similar
characters in Latin-1, rather than deciding separately what
syntax they should take. Add (incomplete) support for case with
Turkish. Remove description of the character sets used from the
language environments' doc strings, since now that we create
variant language environments on the fly, such descriptions will
often be inaccurate. Set the native-coding-system language info
property while setting the other coding-system properties of the
language.
* mule/misc-lang.el (ipa):
Remove the language environment. The International Phonetic
_Alphabet_ is not a language, it's inane to have a corresponding
language environment in XEmacs.
* mule/mule-cmds.el (create-variant-language-environment):
Also modify the coding-priority when creating a new language
environment; document that.
* mule/mule-cmds.el (get-language-environment-from-locale):
Recognise that the 'native-coding-system language-info property
can be a list, interpret it correctly when it is one.
2006-12-21 Aidan Kehoe <kehoea(a)parhasard.net>
* coding.el (coding-system-category):
Use the new 'unicode-type property for finding what sort of
Unicode coding system subtype a coding system is, instead of the
overshadowed 'type property.
* dumped-lisp.el (preloaded-file-list):
mule/european.el has been removed.
* loadup.el (really-early-error-handler):
Unicode tables loaded at dump time are now in
mule/general-late.el.
* simple.el (count-lines):
Add some backslashes to parentheses in docstrings to help
fontification along.
* simple.el (what-cursor-position):
Wrap a line to fit in 80 characters.
* unicode.el:
Use the 'unicode-type property, not 'type, for setting the Unicode
coding-system subtype.
src/ChangeLog addition:
2006-12-21 Aidan Kehoe <kehoea(a)parhasard.net>
* file-coding.c:
Update the make-coding-system docstring to reflect unicode-type.
* general-slots.h:
New symbol, unicode-type, since 'type was being overridden when
accessing a coding system's Unicode subtype.
* intl-win32.c:
Backslash a few parentheses, to help fontification along.
* intl-win32.c (complex_vars_of_intl_win32):
Use the 'unicode-type symbol, not 'type, when creating the
Microsoft Unicode coding system.
* unicode.c (unicode_putprop):
* unicode.c (unicode_getprop):
* unicode.c (unicode_print):
Using 'type as the property name when working out what Unicode
subtype a given coding system is was broken, since there's a
general coding system property called 'type. Change the former to
use 'unicode-type instead.
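The name clash behind the 'type / 'unicode-type change can be illustrated with an ordinary property list. This is a sketch under a loud assumption: a plain plist stands in for a coding system's properties, which is not how XEmacs stores them internally, and every `sketch-` name is hypothetical.

```elisp
;; Sketch only: a plain property list stands in for a coding system's
;; properties; this is not XEmacs's internal representation.
(defun sketch-lookup (props name)
  "Return the first value stored under NAME in the property list PROPS."
  (plist-get props name))

;; With one shared name, the general `type' entry is found first and the
;; Unicode subtype stored under the same name is never reached.
(defvar sketch-shadowed '(type unicode type utf-16))
(sketch-lookup sketch-shadowed 'type)

;; With a distinct name, both the general type and the subtype are reachable.
(defvar sketch-separate '(type unicode unicode-type utf-16))
(sketch-lookup sketch-separate 'unicode-type)
```

Giving the subtype its own property name removes the ambiguity without touching the general 'type property that other code relies on.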
XEmacs Trunk source patch:
Diff command: cvs -q diff -Nu
Files affected: src/unicode.c src/intl-win32.c src/general-slots.h
src/file-coding.c lisp/mule/mule-cmds.el lisp/mule/misc-lang.el
lisp/mule/latin.el lisp/mule/greek.el lisp/mule/general-late.el
lisp/mule/european.el lisp/mule/cyrillic.el lisp/unicode.el lisp/simple.el
lisp/loadup.el lisp/dumped-lisp.el lisp/coding.el
etc/unicode/unicode-consortium/8859-7.TXT
Index: etc/unicode/unicode-consortium/8859-7.TXT
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/etc/unicode/unicode-consortium/8859-7.TXT,v
retrieving revision 1.2
diff -u -u -r1.2 8859-7.TXT
--- etc/unicode/unicode-consortium/8859-7.TXT 2002/03/13 08:51:40 1.2
+++ etc/unicode/unicode-consortium/8859-7.TXT 2006/12/21 22:56:51
@@ -1,12 +1,12 @@
#
-# Name: ISO 8859-7:1987 to Unicode
-# Unicode version: 3.0
-# Table version: 1.0
+# Name: ISO 8859-7:2003 to Unicode
+# Unicode version: 4.0
+# Table version: 2.0
# Table format: Format A
-# Date: 1999 July 27
+# Date: 2003-Nov-12
# Authors: Ken Whistler <kenw(a)sybase.com>
#
-# Copyright (c) 1991-1999 Unicode, Inc. All Rights reserved.
+# Copyright (c) 1991-2003 Unicode, Inc. All Rights reserved.
#
# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
# No claims are made as to fitness for any particular purpose. No
@@ -25,10 +25,11 @@
# General notes:
#
# This table contains the data the Unicode Consortium has on how
-# ISO 8859-7:1987 characters map into Unicode.
+# ISO 8859-7:2003 characters map into Unicode.
#
# ISO 8859-7:1987 is equivalent to ISO-IR-126, ELOT 928,
-# and ECMA 118.
+# and ECMA 118. ISO 8859-7:2003 adds two currency signs
+# and one other character not in the earlier standard.
#
# Format: Three tab-separated columns
# Column #1 is the ISO 8859-7 code (in hex as 0xXX)
@@ -43,12 +44,14 @@
# Remap 0xA1 to U+2018 (instead of 0x02BD) to match text of 8859-7
# Remap 0xA2 to U+2019 (instead of 0x02BC) to match text of 8859-7
#
+# 2.0 version updates 1.0 version by adding mappings for the
+# three newly added characters 0xA4, 0xA5, 0xAA.
+#
# Updated versions of this file may be found in:
-# <ftp://ftp.unicode.org/Public/MAPPINGS/>
+# <http://www.unicode.org/Public/MAPPINGS/>
#
-# Any comments or problems, contact <errata(a)unicode.org>
-# Please note that <errata(a)unicode.org> is an archival address;
-# notices will be checked, but do not expect an immediate response.
+# Any comments or problems, contact the Unicode Consortium at:
+# <http://www.unicode.org/reporting.html>
#
0x00 0x0000 # NULL
0x01 0x0001 # START OF HEADING
@@ -214,10 +217,13 @@
0xA1 0x2018 # LEFT SINGLE QUOTATION MARK
0xA2 0x2019 # RIGHT SINGLE QUOTATION MARK
0xA3 0x00A3 # POUND SIGN
+0xA4 0x20AC # EURO SIGN
+0xA5 0x20AF # DRACHMA SIGN
0xA6 0x00A6 # BROKEN BAR
0xA7 0x00A7 # SECTION SIGN
0xA8 0x00A8 # DIAERESIS
0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x037A # GREEK YPOGEGRAMMENI
0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0xAC 0x00AC # NOT SIGN
0xAD 0x00AD # SOFT HYPHEN
Index: lisp/coding.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/coding.el,v
retrieving revision 1.10
diff -u -u -r1.10 coding.el
--- lisp/coding.el 2002/09/01 06:41:41 1.10
+++ lisp/coding.el 2006/12/21 22:56:51
@@ -200,7 +200,7 @@
(case (coding-system-type coding-system)
(no-conversion 'no-conversion)
(shift-jis 'shift-jis)
- (unicode (case (coding-system-property coding-system 'type)
+ (unicode (case (coding-system-property coding-system 'unicode-type)
(utf-8 (let ((bom (coding-system-property coding-system
'need-bom)))
(cond (bom 'utf-8-bom)
Index: lisp/dumped-lisp.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/dumped-lisp.el,v
retrieving revision 1.60
diff -u -u -r1.60 dumped-lisp.el
--- lisp/dumped-lisp.el 2006/11/28 21:20:23 1.60
+++ lisp/dumped-lisp.el 2006/12/21 22:56:52
@@ -205,7 +205,6 @@
"mule/cyrillic"
"mule/english"
"mule/ethiopic"
- "mule/european"
"mule/greek"
"mule/hebrew"
"mule/indian"
Index: lisp/loadup.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/loadup.el,v
retrieving revision 1.32
diff -u -u -r1.32 loadup.el
--- lisp/loadup.el 2006/07/16 12:23:58 1.32
+++ lisp/loadup.el 2006/12/21 22:56:52
@@ -171,13 +171,6 @@
(defun toolbar-specifier-p (obj) "No toolbar support." nil))
(fmakunbound 'pureload))
- ;; We cannot do this in mule-cmds.el because not all the
- ;; appropriate charsets are loaded yet.
- (when (and (featurep 'mule)
- load-unicode-tables-at-dump-time)
- (let ((data-directory (expand-file-name "etc" source-directory)))
- (load-unicode-tables)))
-
(packages-load-package-dumped-lisps late-package-load-path)
)) ;; end of call-with-condition-handler
Index: lisp/simple.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/simple.el,v
retrieving revision 1.57
diff -u -u -r1.57 simple.el
--- lisp/simple.el 2006/12/06 21:28:48 1.57
+++ lisp/simple.el 2006/12/21 22:56:54
@@ -761,7 +761,7 @@
NOTE: The expression to return the current line number is not obvious:
-(1+ (count-lines 1 (point-at-bol)))
+\(1+ \(count-lines 1 \(point-at-bol)))
See also `line-number'."
(save-excursion
@@ -821,7 +821,8 @@
percent narrowed-details col hscroll)
(message "Char: %s (%s %s) point=%d of %d(%d%%)%s column %d %s"
(text-char-description char) unicode-string
- (mapconcat (lambda (arg) (format "%S" arg)) (split-char char) " ")
+ (mapconcat (lambda (arg) (format "%S" arg))
+ (split-char char) " ")
pos total
percent narrowed-details col hscroll)))))
Index: lisp/unicode.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/unicode.el,v
retrieving revision 1.18
diff -u -u -r1.18 unicode.el
--- lisp/unicode.el 2006/11/07 18:51:22 1.18
+++ lisp/unicode.el 2006/12/21 22:56:55
@@ -148,7 +148,7 @@
'utf-16 'unicode
"UTF-16"
'(mnemonic "UTF-16"
- documentation
+ documentation
"UTF-16 Unicode encoding -- the standard (almost-) fixed-width
two-byte encoding, with surrogates. It will be fixed-width if all
characters are in the BMP (Basic Multilingual Plane -- first 65536
@@ -156,7 +156,7 @@
0x10FFFF (a little more than 1,000,000). Unicode and ISO guarantee
never to encode any characters outside this range -- all the rest are
for private, corporate or internal use."
- type utf-16))
+ unicode-type utf-16))
(define-coding-system-alias 'utf-16-be 'utf-16)
@@ -164,7 +164,7 @@
'utf-16-bom 'unicode
"UTF-16 w/BOM"
'(mnemonic "UTF16-BOM"
- documentation
+ documentation
"UTF-16 Unicode encoding with byte order mark (BOM) at the beginning.
The BOM is Unicode character U+FEFF -- i.e. the first two bytes are
0xFE and 0xFF, respectively, or reversed in a little-endian
@@ -184,7 +184,7 @@
This coding system will insert a BOM at the beginning of a stream when
writing and strip it off when reading."
- type utf-16
+ unicode-type utf-16
need-bom t))
(make-coding-system
@@ -194,7 +194,7 @@
documentation
"Little-endian version of UTF-16 Unicode encoding.
See `utf-16' coding system."
- type utf-16
+ unicode-type utf-16
little-endian t))
(define-coding-system-alias 'utf-16-le 'utf-16-little-endian)
@@ -207,7 +207,7 @@
"Little-endian version of UTF-16 Unicode encoding, with byte order mark.
Standard encoding for representing Unicode under MS Windows. See
`utf-16-bom' coding system."
- type utf-16
+ unicode-type utf-16
little-endian t
need-bom t))
@@ -217,7 +217,7 @@
'(mnemonic "UCS4"
documentation
"UCS-4 Unicode encoding -- fully fixed-width four-byte encoding."
- type ucs-4))
+ unicode-type ucs-4))
(make-coding-system
'ucs-4-little-endian 'unicode
@@ -227,15 +227,15 @@
;; #### I don't think this is permitted by ISO 10646, only Unicode.
;; Call it UTF-32 instead?
"Little-endian version of UCS-4 Unicode encoding. See `ucs-4' coding
system."
- type ucs-4
+ unicode-type ucs-4
little-endian t))
(make-coding-system
'utf-8 'unicode
"UTF-8"
'(mnemonic "UTF8"
- documentation
- "UTF-8 Unicode encoding -- ASCII-compatible 8-bit variable-width encoding
+ documentation "
+UTF-8 Unicode encoding -- ASCII-compatible 8-bit variable-width encoding
sharing the following principles with the Mule-internal encoding:
-- All ASCII characters (codepoints 0 through 127) are represented
@@ -256,7 +256,7 @@
-- Given only the leading byte, you know how many following bytes
are present.
"
- type utf-8))
+ unicode-type utf-8))
(make-coding-system
'utf-8-bom 'unicode
@@ -265,7 +265,7 @@
documentation
"UTF-8 Unicode encoding, with byte order mark.
Standard encoding for representing UTF-8 under MS Windows."
- type utf-8
+ unicode-type utf-8
little-endian t
need-bom t))
@@ -344,4 +344,4 @@
; For more information, see Appendix A.1 of The Unicode Standard 2.0, or
; wherever it is in v3.0."
-; type utf-7))
+; unicode-type utf-7))
Index: lisp/mule/cyrillic.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/mule/cyrillic.el,v
retrieving revision 1.12
diff -u -u -r1.12 cyrillic.el
--- lisp/mule/cyrillic.el 2006/12/17 13:41:49 1.12
+++ lisp/mule/cyrillic.el 2006/12/21 22:56:55
@@ -29,16 +29,19 @@
;; The character set ISO8859-5 is supported. KOI-8 and ALTERNATIVNYJ are
;; converted to ISO8859-5 internally.
-;; Windows-1251 support deleted because XEmacs has automatic support.
+;; [Windows-1251 support deleted because XEmacs has automatic support.]
-;;; Code:
-
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-;;; CYRILLIC
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; #### We only have automatic support on Windows; that needs to be put
+;; back. Also, the Russian Wikipedia articles on KOI-8 list several other
+;; related encodings--KOI8-U (Ukrainian), KOI8-RU (simultaneous support for
+;; Russian, Belorussian, and Ukrainian), KOI8-C (for languages of the
+;; Caucasus), KOI8-O (Old Church Slavonic)--and it would be nice to have
+;; them. Beyond that, we're currently trashing lots of code points with
+;; KOI-8 R; it would be nice to leverage the Unicode support to not do that.
-;; ISO-8859-5
+;;; Code:
+;; Case table:
(loop
for (upper lower)
in '((#xcf #xef) ; YA
@@ -94,14 +97,22 @@
case-table))
;; The default character syntax is now word. Pay attention to the
-;; exceptions in ISO-8859-5.
-(dolist (code '(#xAD ;; SOFT HYPHEN
- #xF0 ;; NUMERO SIGN
- #xFD)) ;; SECTION SIGN
- (modify-syntax-entry (make-char 'cyrillic-iso8859-5 code) "."))
-
-;; NO-BREAK SPACE
-(modify-syntax-entry (make-char 'cyrillic-iso8859-5 #xA0) " ")
+;; exceptions in ISO-8859-5, copying them from ISO-8859-1.
+(loop
+ for (latin-1 cyrillic)
+ in '((#xAD #xAD) ;; SOFT HYPHEN
+ (#xA7 #xFD) ;; SECTION SIGN
+ (#xA0 #xA0)) ;; NO BREAK SPACE
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'cyrillic-iso8859-5 cyrillic)
+ (string (char-syntax (make-char 'latin-iso8859-1 latin-1)))
+ syntax-table))
+
+;; Take NUMERO SIGN's syntax from #.
+(modify-syntax-entry (make-char 'cyrillic-iso8859-5 #xF0)
+ (string (char-syntax ?\# (standard-syntax-table)))
+ (standard-syntax-table))
(make-coding-system
'iso-8859-5 'iso2022
@@ -110,8 +121,7 @@
charset-g1 cyrillic-iso8859-5
charset-g2 t
charset-g3 t
- mnemonic "ISO8/Cyr"
- ))
+ mnemonic "ISO8/Cyr"))
(set-language-info-alist
"Cyrillic-ISO" '((charset cyrillic-iso8859-5)
@@ -155,12 +165,10 @@
(let* ((ch (aref cyrillic-koi8-r-decode-table i))
(split (split-char ch)))
(cond ((eq (car split) 'cyrillic-iso8859-5)
- (aset table (logior (nth 1 split) 128) i)
- )
+ (aset table (logior (nth 1 split) 128) i))
((eq ch 32))
((eq (car split) 'ascii)
- (aset table ch i)
- )))
+ (aset table ch i))))
(setq i (1+ i)))
table)
"Cyrillic KOI8-R encoding table.")
Index: lisp/mule/european.el
===================================================================
RCS file: european.el
diff -N european.el
--- /tmp/cvsHAAijaOVg Thu Dec 21 23:57:04 2006
+++ /dev/null Thu Dec 21 23:56:58 2006
@@ -1,541 +0,0 @@
-;;; european.el --- European languages -*- coding: iso-2022-7bit; -*-
-
-;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
-;; Licensed to the Free Software Foundation.
-;; Copyright (C) 1997 MORIOKA Tomohiko
-;; Copyright (C) 2001 Ben Wing.
-;; Copyright (C) 2002, 2005 Free Software Foundation
-
-;; Keywords: multilingual, European
-
-;; This file is part of XEmacs.
-
-;; XEmacs is free software; you can redistribute it and/or modify it
-;; under the terms of the GNU General Public License as published by
-;; the Free Software Foundation; either version 2, or (at your option)
-;; any later version.
-
-;; XEmacs is distributed in the hope that it will be useful, but
-;; WITHOUT ANY WARRANTY; without even the implied warranty of
-;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-;; General Public License for more details.
-
-;; You should have received a copy of the GNU General Public License
-;; along with XEmacs; see the file COPYING. If not, write to the Free
-;; Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
-;; 02111-1307, USA.
-
-;;; Commentary:
-
-;; For Roman-alphabet-using Europeans, eight coded character sets,
-;; ISO8859-1,2,3,4,9,14,15,16 are supported.
-
-;; #### latin.el would be a better name for this file.
-
-;;; Code:
-; (make-charset 'latin-iso8859-1
-; "Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100"
-; '(dimension
-; 1
-; registry "ISO8859-1"
-; chars 96
-; columns 1
-; direction l2r
-; final ?A
-; graphic 1
-; short-name "RHP of Latin-1"
-; long-name "RHP of Latin-1 (ISO 8859-1): ISO-IR-100"
-; ))
-
-; (make-charset 'latin-iso8859-2
-; "Right-Hand Part of Latin Alphabet 2 (ISO/IEC 8859-2): ISO-IR-101"
-; '(dimension
-; 1
-; registry "ISO8859-2"
-; chars 96
-; columns 1
-; direction l2r
-; final ?B
-; graphic 1
-; short-name "RHP of Latin-2"
-; long-name "RHP of Latin-2 (ISO 8859-2): ISO-IR-101"
-; ))
-
-; (make-charset 'latin-iso8859-3
-; "Right-Hand Part of Latin Alphabet 3 (ISO/IEC 8859-3): ISO-IR-109"
-; '(dimension
-; 1
-; registry "ISO8859-3"
-; chars 96
-; columns 1
-; direction l2r
-; final ?C
-; graphic 1
-; short-name "RHP of Latin-3"
-; long-name "RHP of Latin-3 (ISO 8859-3): ISO-IR-109"
-; ))
-
-; (make-charset 'latin-iso8859-4
-; "Right-Hand Part of Latin Alphabet 4 (ISO/IEC 8859-4): ISO-IR-110"
-; '(dimension
-; 1
-; registry "ISO8859-4"
-; chars 96
-; columns 1
-; direction l2r
-; final ?D
-; graphic 1
-; short-name "RHP of Latin-4"
-; long-name "RHP of Latin-4 (ISO 8859-4): ISO-IR-110"
-; ))
-
-; (make-charset 'latin-iso8859-9
-; "Right-Hand Part of Latin Alphabet 5 (ISO/IEC 8859-9): ISO-IR-148"
-; '(dimension
-; 1
-; registry "ISO8859-9"
-; chars 96
-; columns 1
-; direction l2r
-; final ?M
-; graphic 1
-; short-name "RHP of Latin-5"
-; long-name "RHP of Latin-5 (ISO 8859-9): ISO-IR-148"
-; ))
-
-; (make-charset 'latin-iso8859-15
-; "Right-Hand Part of Latin Alphabet 9 (ISO/IEC 8859-15): ISO-IR-203"
-; '(dimension
-; 1
-; registry "ISO8859-15"
-; chars 96
-; columns 1
-; direction l2r
-; final ?b
-; graphic 1
-; short-name "RHP of Latin-9"
-; long-name "RHP of Latin-9 (ISO 8859-15): ISO-IR-203"
-; ))
-
-(make-charset 'latin-iso8859-14
- "Right-Hand Part of Latin Alphabet 8 (ISO/IEC 8859-14)"
- '(dimension
- 1
- registries ["ISO8859-14"]
- chars 96
- columns 1
- direction l2r
- final ?_
- graphic 1
- short-name "RHP of Latin-8"
- long-name "RHP of Latin-8 (ISO 8859-14)"
- ))
-
-(make-charset 'latin-iso8859-16
- "Right-Hand Part of Latin Alphabet 10 (ISO/IEC 8859-16)"
- '(dimension
- 1
- registries ["ISO8859-16"]
- chars 96
- columns 1
- direction l2r
- final ?f ; octet 06/06; cf ISO-IR 226
- graphic 1
- short-name "RHP of Latin-10"
- long-name "RHP of Latin-10 (ISO 8859-16)"
- ))
-
-;; Latin-1 is dealt with in iso8859-1.el, which see.
-
-;; ISO 8859-14.
-;;
-;; Initialise all characters to word syntax.
-(loop for c from #xa0 to #xff
- do (modify-syntax-entry (make-char 'latin-iso8859-14 c) "w"))
-
-;; Now, the exceptions. There's just punctuation in this character set.
-(dolist (code '(#xa0 ;; NO BREAK SPACE
- #xa3 ;; POUND SIGN
- #xa7 ;; SECTION SIGN
- #xa9 ;; COPYRIGHT
- #xad ;; SOFT HYPHEN
- #xae ;; REGISTERED
- #xb6)) ;; PILCROW SIGN
- (modify-syntax-entry (make-char 'latin-iso8859-14 code) "_"))
-;; end of ISO 8859-14.
-
-;; ISO 8859-16.
-;;
-;; Initialise all of iso-8859-16 to word syntax.
-(loop for c from #xa0 to #xff
- do (modify-syntax-entry (make-char 'latin-iso8859-16 c) "w"))
-
-;; And then do the exceptions. First, the punctuation (following the model
-;; of Latin-1):
-(dolist (code '(#xa0 ;; NO BREAK SPACE
- #xa4 ;; EURO SIGN
- #xa7 ;; SECTION SIGN
- #xa9 ;; COPYRIGHT
- #xad ;; SOFT HYPHEN
- #xb0 ;; DEGREE
- #xb1 ;; PLUS-MINUS SIGN
- #xb6 ;; PILCROW SIGN
- #xb7)) ;; MIDDLE DOT
- (modify-syntax-entry (make-char 'latin-iso8859-16 code) "_"))
-
-;; Mark the DOUBLE LOW-9 QUOTATION MARK and its closing character as
-;; quotation marks.
-(modify-syntax-entry (make-char 'latin-iso8859-16 #xa5) "\"")
-(modify-syntax-entry (make-char 'latin-iso8859-16 #xb5) "\"")
-
-;; For some crazy reason--well, in truth, probably because Jamie never used
-;; them in anger--the guillemets have open- and close-parenthesis syntax in
-;; Latin 1. We will probably change that in the future; for the moment, I'm
-;; preserving it.
-(modify-syntax-entry (make-char 'latin-iso8859-16 #xab)
- (format "(%c" (make-char 'latin-iso8859-16 #xbb)))
-(modify-syntax-entry (make-char 'latin-iso8859-16 #xbb)
- (format ")%c" (make-char 'latin-iso8859-16 #xab)))
-
-;; end of ISO 8859-16.
-
-;; ISO 8859-15.
-;;
-;; Based on Latin-1 and differences therefrom.
-;;
-;; First, initialise the syntax from the corresponding Latin-1 characters.
-(loop for c from #xa0 to #xff
- do (modify-syntax-entry
- (make-char 'latin-iso8859-15 c)
- (string (char-syntax (make-char 'latin-iso8859-1 c)))))
-;; Now, the exceptions
-(loop for c in '(?,b&(B ?,b((B ?,b4(B ?,b8(B ?,b<(B ?,b=(B
?,b>(B)
- do (modify-syntax-entry c "w"))
-
-;; Again, perpetuating insanity with the guillemets.
-(modify-syntax-entry (make-char 'latin-iso8859-16 #xab)
- (format "(%c" (make-char 'latin-iso8859-16 #xbb)))
-(modify-syntax-entry (make-char 'latin-iso8859-16 #xbb)
- (format ")%c" (make-char 'latin-iso8859-16 #xab)))
-;; end of ISO 8859-15.
-
-;; For syntax of Latin-2
-(loop for c in '(?,B!(B ?,B#(B ?,B%(B ?,B&(B ?,B)(B ?,B*(B ?,B+(B
?,B,(B ?,B.(B ?,B/(B ?,B1(B ?,B3(B ?,B5(B ?,B6(B ?,B9(B ?,B:(B ?,B;(B
?,B<(B)
- do (modify-syntax-entry c "w"))
-
-(loop for c from 62 to 126
- do (modify-syntax-entry (make-char 'latin-iso8859-2 c) "w"))
-
-(modify-syntax-entry (make-char 'latin-iso8859-2 32) "w") ; no-break space
-(modify-syntax-entry ?,BW(B ".")
-(modify-syntax-entry ?,Bw(B ".")
-
-;; For syntax of Latin-3
-(loop for c in '(?,C!(B ?,C&(B ?,C)(B ?,C*(B ?,C+(B ?,C,(B ?,C/(B
?,C1(B ?,C5(B ?,C6(B ?,C:(B ?,C;(B ?,C<(B ?,C?(B)
- do (modify-syntax-entry c "w"))
-
-(loop for c from 64 to 126
- do (modify-syntax-entry (make-char 'latin-iso8859-3 c) "w"))
-
-(modify-syntax-entry (make-char 'latin-iso8859-3 32) "w") ; no-break space
-(modify-syntax-entry ?,CW(B ".")
-(modify-syntax-entry ?,Cw(B ".")
-
-;; For syntax of Latin-4
-(loop for c in '(?,D!(B ?,D"(B ?,D#(B ?,D%(B ?,D&(B ?,D)(B
?,D*(B ?,D+(B ?,D,(B ?,D.(B ?,D1(B ?,D3(B ?,D5(B ?,D6(B ?,D9(B ?,D:(B
?,D;(B ?,D<(B ?,D=(B ?,D>(B ?,D?(B)
- do (modify-syntax-entry c "w"))
-
-(loop for c from 64 to 126
- do (modify-syntax-entry (make-char 'latin-iso8859-4 c) "w"))
-
-(modify-syntax-entry (make-char 'latin-iso8859-4 32) "w") ; no-break space
-(modify-syntax-entry ?,DW(B ".")
-(modify-syntax-entry ?,Dw(B ".")
-
-
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-;;; EUROPEANS
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-
-
-;; Latin-1 (ISO-8859-1)
-
-;; (make-coding-system
-;; 'iso-latin-1 2 ?1
-;; "ISO 2022 based 8-bit encoding for Latin-1 (MIME:ISO-8859-1)"
-;; '(ascii latin-iso8859-1 nil nil
-;; nil nil nil nil nil nil nil nil nil nil nil nil t)
-;; '((safe-charsets ascii latin-iso8859-1)
-;; (mime-charset . iso-8859-1)))
-
-;; (define-coding-system-alias 'iso-8859-1 'iso-latin-1)
-;; (define-coding-system-alias 'latin-1 'iso-latin-1)
-
-;; (make-coding-system
-;; 'compound-text 2 ?1
-;; "ISO 2022 based encoding used in inter client communication of X"
-;; '((ascii t) (latin-iso8859-1 t) nil nil
-;; nil ascii-eol ascii-cntl nil nil nil nil nil nil nil nil nil t)
-;; '((safe-charsets . t)))
-
-;; (define-coding-system-alias 'ctext 'compound-text)
-
-;; "Hello, Hej, Tere, Hei, Bonjour, Gr,A|_(B Gott, Ciao, ,A!(BHola!"
-
-
-;; Latin-9 (ISO-8859-15)
-;; Latin-1 plus Euro, plus a few accented characters
-
-;; (make-charset 'latin-iso8859-15
-;; "Latin-9, aka Latin-1 with Euro etc"
-;; '(short-name "Latin 9"
-;; long-name "Latin-9 (typically GR of ISO 8859/15)"
-;; registry "iso8859-15"
-;; dimension 1
-;; columns 1
-;; chars 96
-;; final ?b ; ISO-IR-203
-;; graphic 1
-;; direction l2r))
-
-(make-coding-system
- 'iso-8859-15 'iso2022
- "ISO 4873 conforming 8-bit code (ASCII + Latin 9; aka Latin-1 with Euro)"
- `(mnemonic "MIME/Ltn-9" ; bletch
- eol-type nil
- charset-g0 ascii
- charset-g1 latin-iso8859-15
- charset-g2 t
- charset-g3 t
- ))
-
-
-;; Latin-2 (ISO-8859-2)
-
-;; (make-coding-system
-;; 'iso-latin-2 2 ?2
-;; "ISO 2022 based 8-bit encoding (MIME:ISO-8859-2)"
-;; '(ascii latin-iso8859-2 nil nil
-;; nil nil nil nil nil nil nil)
-;; '((safe-charsets ascii latin-iso8859-2)
-;; (mime-charset . iso-8859-2)))
-
-;; (define-coding-system-alias 'iso-8859-2 'iso-latin-2)
-;; (define-coding-system-alias 'latin-2 'iso-latin-2)
-
-(make-coding-system
- 'iso-8859-2 'iso2022 "ISO-8859-2 (Latin-2)"
- '(charset-g0 ascii
- charset-g1 latin-iso8859-2
- charset-g2 t
- charset-g3 t
- mnemonic "MIME/Ltn-2"
- ))
-
-(provide 'romanian)
-
-;; Czech support originally from czech.el
-;; Author: Milan Zamazal <pdm(a)zamazal.org>
-;; Maintainer (FSF): Pavel Jan,Am(Bk <Pavel(a)Janik.cz>
-;; Maintainer (for XEmacs): David Sauer <davids(a)penguin.cz>
-
-(provide 'czech)
-
-;; Slovak support originally from slovak.el
-;; Authors: Tibor ,B)(Bimko <tibor.simko(a)fmph.uniba.sk>,
-;; Milan Zamazal <pdm(a)fi.muni.cz>
-;; Maintainer: Milan Zamazal <pdm(a)fi.muni.cz>
-
-(provide 'slovenian)
-
-
-;; Latin-3 (ISO-8859-3)
-
-;; (make-coding-system
-;; 'iso-latin-3 2 ?3
-;; "ISO 2022 based 8-bit encoding (MIME:ISO-8859-3)"
-;; '(ascii latin-iso8859-3 nil nil
-;; nil nil nil nil nil nil nil)
-;; '((safe-charsets ascii latin-iso8859-3)
-;; (mime-charset . iso-8859-3)))
-
-;; (define-coding-system-alias 'iso-8859-3 'iso-latin-3)
-;; (define-coding-system-alias 'latin-3 'iso-latin-3)
-
-(make-coding-system
- 'iso-8859-3 'iso2022 "ISO-8859-3 (Latin-3)"
- '(charset-g0 ascii
- charset-g1 latin-iso8859-3
- charset-g2 t
- charset-g3 t
- mnemonic "MIME/Ltn-3"
- ))
-
-
-;; Latin-4 (ISO-8859-4)
-
-;; (make-coding-system
-;; 'iso-latin-4 2 ?4
-;; "ISO 2022 based 8-bit encoding (MIME:ISO-8859-4)"
-;; '(ascii latin-iso8859-4 nil nil
-;; nil nil nil nil nil nil nil)
-;; '((safe-charsets ascii latin-iso8859-4)
-;; (mime-charset . iso-8895-4)))
-
-;; (define-coding-system-alias 'iso-8859-4 'iso-latin-4)
-;; (define-coding-system-alias 'latin-4 'iso-latin-4)
-
-(make-coding-system
- 'iso-8859-4 'iso2022 "ISO-8859-4 (Latin-4)"
- '(charset-g0 ascii
- charset-g1 latin-iso8859-4
- charset-g2 t
- charset-g3 t
- mnemonic "MIME/Ltn-4"
- ))
-
-
-;; Latin-5 (ISO-8859-9)
-
-;; (make-coding-system
-;; 'iso-latin-5 2 ?9
-;; "ISO 2022 based 8-bit encoding (MIME:ISO-8859-9)"
-;; '(ascii latin-iso8859-9 nil nil
-;; nil nil nil nil nil nil nil)
-;; '((safe-charsets ascii latin-iso8859-9)
-;; (mime-charset . iso-8859-9)))
-
-;; (define-coding-system-alias 'iso-8859-9 'iso-latin-5)
-;; (define-coding-system-alias 'latin-5 'iso-latin-5)
-
-(make-coding-system
- 'iso-8859-9 'iso2022 "ISO-8859-9 (Latin-5)"
- '(charset-g0 ascii
- charset-g1 latin-iso8859-9
- charset-g2 t
- charset-g3 t
- mnemonic "MIME/Ltn-5"
- ))
-
-;; Add a coding system for ISO 8859-16.
-(make-coding-system
- 'iso-8859-16 'iso2022 "MIME ISO-8859-16"
- '(charset-g0 ascii
- charset-g1 latin-iso8859-16
- charset-g2 t ; grrr
- charset-g3 t ; grrr
- mnemonic "MIME/Ltn-10"))
-
-(loop for ((charset codesys default-input nice-charset-1 nice-charset-2
- supported-langs ;; a list if the doc string is replaced
- ;; entirely
- )
- langenvs) in
- '(
- ((latin-iso8859-1 iso-8859-1 "latin-1-prefix" "Latin-1"
"ISO-8859-1"
-" Danish, Dutch, English, Faeroese, Finnish, French, German, Icelandic,
- Irish, Italian, Norwegian, Portuguese, Spanish, and Swedish.")
- (("Danish" "da")
- ("Dutch" "nl" "TUTORIAL.nl")
- ("Faeroese")
- ("Finnish" "fi")
- ("French" "fr" "TUTORIAL.fr" "Bonjour, ,Ag(Ba
va?")
- ("German" "de" "TUTORIAL.de" "\
-German (Deutsch Nord) Guten Tag
-German (Deutsch S,A|(Bd) Gr,A|_(B Gott"
- "german-postfix")
- ("Icelandic" "is")
- ("Irish" "ga")
- ("Italian" "it")
- ("Norwegian" "no" "TUTORIAL.no")
- ("Portuguese" "pt" nil "Bem-vindo! Tudo bem?")
- ("Spanish" "es" "TUTORIAL.es"
",A!(BHola!")
- ("Swedish" "sv" "TUTORIAL.se" "Hej!")))
- ((latin-iso8859-15 iso-8859-15 "latin-1-prefix" ;; #### FIXME
- "Latin-9" "ISO-8859-15"
- ("\
-This language environment is a generic one for Latin-9 (ISO-8859-15)
-character set which supports the Euro sign and the following languages
-(they use the Latin-1 character set by default):
- Danish, Dutch, English, Faeroese, Finnish, French, German, Icelandic,
- Irish, Italian, Norwegian, Portuguese, Spanish, and Swedish.
-Each also has its own specific language environment."))
- ())
- ((latin-iso8859-2 iso-8859-2 "latin-2-prefix" "Latin-2"
"ISO-8859-2"
-" Albanian, Czech, English, German, Hungarian, Polish, Romanian,
- Serbian, Croatian, Slovak, Slovene, Sorbian (upper and lower),
- and Swedish.")
- (("Albanian" nil)
- ("Croatian" ("hrvatski" "hr")
"TUTORIAL.hr")
- ("Czech" ("cs" "cz") "TUTORIAL.cs"
"P,Bx(Bejeme v,Ba(Bm hezk,B}(B den!"
- "latin-2-postfix")
- ("Hungarian" ("hungarian" "hu"))
- ("Polish" "po" "TUTORIAL.pl")
- ("Romanian" "ro" "TUTORIAL.ro" "Bun,Bc(B ziua,
bine a,B~(Bi venit!"
- "latin-2-postfix")
- ("Serbian" "sr")
- ("Slovak" "sk" "TUTORIAL.sk" "Prajeme V,Ba(Bm
pr,Bm(Bjemn,B}(B de,Br(B!"
- ;; !!#### FSF "slovak"
- "latin-2-postfix")
- ("Slovenian" "sl" "TUTORIAL.sl" ",B.(Belimo vam uspe,B9(Ben dan!"
- "latin-2-postfix")
- ("Sorbian" nil)))
- ((latin-iso8859-3 iso-8859-3 "latin-3-prefix" "Latin-3" "ISO-8859-3"
-" Afrikaans, Catalan, Dutch, English, Esperanto, French, Galician,
- German, Italian, Maltese, Spanish, and Turkish.")
- (("Afrikaans" "af")
- ("Catalan" ("catalan" "ca"))
- ("Esperanto")
- ("Galician")
- ("Maltese")))
- ((latin-iso8859-4 iso-8859-4 "latin-4-prefix" "Latin-4" "ISO-8859-4"
-" Danish, English, Estonian, Finnish, German, Greenlandic, Lappish,
- Latvian, Lithuanian, and Norwegian.")
- (("Estonian" "et")
- ("Greenlandic")
- ("Lappish")
- ("Latvian" "lv")
- ("Lithuanian" "li")))
- ((latin-iso8859-5 iso-8859-9 "latin-5-prefix" "Latin-5" "ISO-8859-9")
- (("Turkish" "tr"))))
- do
- (set-language-info-alist
- nice-charset-1
- `((charset ascii ,charset)
- (coding-system ,codesys)
- (coding-priority ,codesys)
- (documentation . ,(if (listp supported-langs) (car supported-langs)
- (format "\
-This language environment is a generic one for %s (%s)
-character set which supports the following languages (not all of them may
-use this character set by default):
-%s
-Each also has its own specific language environment."
- nice-charset-1 nice-charset-2
- supported-langs))))
- '("European"))
- (loop for (name locale tutorial sample-text input-method) in langenvs
- do
- (set-language-info-alist
- name
- `((charset ascii ,charset)
- (coding-system ,codesys)
- (coding-priority ,codesys)
- ,@(if locale `((locale . ,locale)))
- ,@(if tutorial `((tutorial . ,tutorial)))
- ,@(if sample-text `((sample-text . ,sample-text)))
- (input-method . ,(or input-method default-input))
- (documentation . ,(format "\
-This language environment supports %s using the Latin-1 (ISO-8859-1)
-character set. Languages supported by Latin-1 are Danish, Dutch, English,
-Faeroese, Finnish, French, German, Icelandic, Irish, Italian, Norwegian,
-Portuguese, Spanish, and Swedish. The various language environments for
-these languages are similar to the Latin-1 environment, but typically have
-their own locale specified (for subprocesses and for selection of the
-correct language environment at startup), and may have their own tutorials
-and/or a different input method."
- name)))
- '("European"))
- ))
-
-;;; european.el ends here
Index: lisp/mule/general-late.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/mule/general-late.el,v
retrieving revision 1.1
diff -u -u -r1.1 general-late.el
--- lisp/mule/general-late.el 2006/11/28 21:20:30 1.1
+++ lisp/mule/general-late.el 2006/12/21 22:56:55
@@ -52,4 +52,10 @@
(cons (assoc "English" language-info-alist)
(remassoc "English" language-info-alist)))
+;; At this point in the dump, all the charsets have been loaded. Now, load
+;; their Unicode mappings.
+(if load-unicode-tables-at-dump-time
+ (let ((data-directory (expand-file-name "etc" source-directory)))
+ (load-unicode-tables)))
+
;;; general-late.el ends here
Index: lisp/mule/greek.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/mule/greek.el,v
retrieving revision 1.6
diff -u -u -r1.6 greek.el
--- lisp/mule/greek.el 2006/12/17 13:23:50 1.6
+++ lisp/mule/greek.el 2006/12/21 22:56:55
@@ -4,7 +4,7 @@
;; Licensed to the Free Software Foundation.
;; Copyright (C) 1997 MORIOKA Tomohiko
-;; Keywords: multilingual, Greek
+;; Keywords: multilingual, Greek, dumped
;; This file is part of XEmacs.
@@ -29,6 +29,7 @@
;;; Code:
+;; Case table:
(loop
for (upper lower)
in '((#xdb #xfb) ;; UPSILON WITH DIALYTIKA
@@ -75,32 +76,48 @@
(put-case-table-pair (make-char 'greek-iso8859-7 upper)
(make-char 'greek-iso8859-7 lower) case-table))
-;; Now, syntax.
-(dolist (code '(#xA1 ;; LEFT SINGLE QUOTATION MARK
- #xA2 ;; RIGHT SINGLE QUOTATION MARK
- #xA3 ;; POUND SIGN
- #xA6 ;; BROKEN BAR
- #xA7 ;; SECTION SIGN
- #xA8 ;; DIAERESIS
- #xA9 ;; COPYRIGHT SIGN
- #xAB ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
- #xAC ;; NOT SIGN
- #xAD ;; SOFT HYPHEN
- #xAF ;; HORIZONTAL BAR
- #xB0 ;; DEGREE SIGN
- #xB1 ;; PLUS-MINUS SIGN
- #xB7 ;; MIDDLE DOT
- #xBB)) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
- (modify-syntax-entry (make-char 'greek-iso8859-7 code) "."))
-
-;; NO-BREAK SPACE
-(modify-syntax-entry (make-char 'greek-iso8859-7 #xA0) " ")
-
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-;;; GREEK
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Now, syntax. Copy from appropriate characters in Latin 1.
-
+;; This code requires that the guillemets not have parenthesis syntax.
+
+(assert (not (memq (char-syntax (make-char 'latin-iso8859-1 #xAB)) '(?\( ?\))))
+ t "This code assumes \xAB does not have parenthesis syntax. ")
+
+(assert (not (memq (char-syntax (make-char 'latin-iso8859-1 #xBB)) '(?\( ?\))))
+ t "This code assumes \xBB does not have parenthesis syntax. ")
+
+(loop
+ for (greek latin-1)
+ in '((#xA0 #xA0) ;; NO BREAK SPACE
+ (#xA1 #xAB) ;; LEFT SINGLE QUOTATION MARK, LEFT DOUBLE ANGLE QUOTE
+ (#xA2 #xBB) ;; RIGHT SINGLE QUOTATION MARK, RIGHT DOUBLE ANGLE QUOTE
+ (#xA3 #xA3) ;; POUND SIGN
+ (#xA4 #xA3) ;; EURO SIGN, POUND SIGN
+ (#xA5 #xA3) ;; DRACHMA SIGN, POUND SIGN
+ (#xA6 #xA6) ;; BROKEN BAR
+ (#xA7 #xA7) ;; SECTION SIGN
+ (#xA8 #xA8) ;; DIAERESIS
+ (#xA9 #xA9) ;; COPYRIGHT SIGN
+ (#xAA #xB4) ;; GREEK YPOGEGRAMMENI (iota subscript), ACUTE ACCENT
+ (#xAB #xAB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xAC #xAC) ;; NOT SIGN
+ (#xAD #xAD) ;; SOFT HYPHEN
+ (#xAF #xA6) ;; HORIZONTAL BAR, BROKEN BAR
+ (#xB0 #xB0) ;; DEGREE SIGN
+ (#xB1 #xB1) ;; PLUS-MINUS SIGN
+ (#xB2 #xB2) ;; SUPERSCRIPT TWO
+ (#xB3 #xB3) ;; SUPERSCRIPT THREE
+ (#xB4 #xB4) ;; GREEK TONOS, ACUTE ACCENT
+ (#xB5 #xB4) ;; GREEK DIALYTIKA TONOS, ACUTE ACCENT
+ (#xB7 #xB7) ;; MIDDLE DOT
+ (#xBB #xBB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (#xBD #xBD)) ;; VULGAR FRACTION ONE HALF
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'greek-iso8859-7 greek)
+ (string (char-syntax (make-char 'latin-iso8859-1 latin-1)))
+ syntax-table))
+
(make-coding-system
'iso-8859-7 'iso2022 "ISO-8859-7 (Greek)"
'(charset-g0 ascii
Index: lisp/mule/latin.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/mule/latin.el,v
retrieving revision 1.4
diff -u -u -r1.4 latin.el
--- lisp/mule/latin.el 2005/05/10 17:03:00 1.4
+++ lisp/mule/latin.el 2006/12/21 22:56:55
@@ -1,10 +1,12 @@
-;;; latin.el --- Support for Latin charsets. -*- coding: iso-2022-7bit; -*-
+;;; latin.el --- Roman-alphabet languages -*- coding: iso-2022-7bit; -*-
-;; Copyright (C) 2001, 2005 Free Software Foundation, Inc.
+;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
+;; Licensed to the Free Software Foundation.
+;; Copyright (C) 1997 MORIOKA Tomohiko
+;; Copyright (C) 2001 Ben Wing.
+;; Copyright (C) 2002, 2005, 2006 Free Software Foundation
-;; Author: Hrvoje Niksic <hniksic(a)xemacs.org>
-;; Maintainer: XEmacs Development Team
-;; Keywords: multilingual, European, dumped
+;; Keywords: multilingual, latin, dumped
;; This file is part of XEmacs.
@@ -25,12 +27,9 @@
;;; Commentary:
-;; This file is meant to provide support for Latin character sets.
-;; The place for that used to be `european.el', but I am hesitant to
-;; change that file, as it is full of old cruft that I hope to phase
-;; out. Currently this file provides only the case table setup.
+;; For Roman-alphabet-using Europeans, eight coded character sets,
+;; ISO8859-1,2,3,4,9,14,15,16 are supported.
-
;;; Code:
;; Case table setup. We set up all the case tables using
@@ -39,344 +38,691 @@
;; updated by Erik Naggum.
(defun setup-case-pairs (charset pairs)
- (let ((tbl (standard-case-table)))
- (loop for (uc lc) in pairs do
- (put-case-table-pair (make-char charset uc) (make-char charset lc) tbl))))
-
-;; Latin 1.
-
-(setup-case-pairs
- 'latin-iso8859-1
- '((192 224) ;latin letter a with grave
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (195 227) ;latin letter a with tilde
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter a with ring above
- (198 230) ;latin letter ae
- (199 231) ;latin letter c with cedilla
- (200 232) ;latin letter e with grave
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with circumflex
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter i with grave
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter i with diaeresis
- (208 240) ;latin letter eth
- (209 241) ;latin letter n with tilde
- (210 242) ;latin letter o with grave
- (211 243) ;latin letter o with acute
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter o with tilde
- (214 246) ;latin letter o with diaeresis
- (216 248) ;latin letter o with stroke
- (217 249) ;latin letter u with grave
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with circumflex
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter y with acute
- (222 254) ;latin letter thorn
- ))
+ (loop
+ for (uc lc) in pairs
+ with table = (standard-case-table)
+ do (put-case-table-pair
+ (make-char charset uc) (make-char charset lc) table)))
+
+;; Latin-1's case is dealt with in iso8859-1.el, which see. Its syntax is
+;; initialised in syntax.c:complex_vars_of_syntax.
-;; Latin 2.
+
+;; Latin-2 (ISO-8859-2). Central Europe; Czech, Slovak, Hungarian, Polish,
+;; Croatian, other languages.
+;;
+;; (Yes, it really is Central European. German written in Latin 2 and using
+;; only Umlaute and the sharp S in its non-ASCII repertoire is bit-for-bit
+;; identical with the same text in Latin-1.)
+
+;; The default character syntax is now word. Pay attention to the
+;; exceptions in ISO-8859-2, copying them from ISO-8859-1.
+(loop
+ for (latin-2 latin-1)
+ in '((#xA0 #xA0) ;; NO BREAK SPACE
+ (#xA2 #xB4) ;; BREVE, ACUTE ACCENT
+ (#xA4 #xA4) ;; CURRENCY SIGN
+ (#xA7 #xA7) ;; SECTION SIGN
+ (#xA8 #xA8) ;; DIAERESIS
+ (#xAD #xAD) ;; SOFT HYPHEN
+ (#xB0 #xB0) ;; DEGREE SIGN
+ (#xB2 #xB4) ;; OGONEK, ACUTE ACCENT
+ (#xB4 #xB4) ;; ACUTE ACCENT
+ (#xB7 #xB4) ;; CARON, ACUTE ACCENT
+ (#xB8 #xB8) ;; CEDILLA
+ (#xBD #xB4) ;; DOUBLE ACUTE ACCENT, ACUTE ACCENT
+ (#xD7 #xD7) ;; MULTIPLICATION SIGN
+ (#xF7 #xF7) ;; DIVISION SIGN
+ (#xFF #xB4)) ;; DOT ABOVE, ACUTE ACCENT
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'latin-iso8859-2 latin-2)
+ (string (char-syntax (make-char 'latin-iso8859-1 latin-1)))
+ syntax-table))
+;; Case.
(setup-case-pairs
'latin-iso8859-2
- '((161 177) ;latin letter a with ogonek
- (163 179) ;latin letter l with stroke
- (165 181) ;latin letter l with caron
- (166 182) ;latin letter s with acute
- (169 185) ;latin letter s with caron
- (170 186) ;latin letter s with cedilla
- (171 187) ;latin letter t with caron
- (172 188) ;latin letter z with acute
- (174 190) ;latin letter z with caron
- (175 191) ;latin letter z with dot above
- (192 224) ;latin letter r with acute
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (195 227) ;latin letter a with breve
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter l with acute
- (198 230) ;latin letter c with acute
- (199 231) ;latin letter c with cedilla
- (200 232) ;latin letter c with caron
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with ogonek
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter e with caron
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter d with caron
- (208 240) ;latin letter d with stroke
- (209 241) ;latin letter n with acute
- (210 242) ;latin letter n with caron
- (211 243) ;latin letter o with acute
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter o with double acute
- (214 246) ;latin letter o with diaeresis
- (216 248) ;latin letter r with caron
- (217 249) ;latin letter u with ring above
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with double acute
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter y with acute
- (222 254) ;latin letter t with cedilla
- ))
+ '((#xA1 #xB1) ;; A WITH OGONEK
+ (#xA3 #xB3) ;; L WITH STROKE
+ (#xA5 #xB5) ;; L WITH CARON
+ (#xA6 #xB6) ;; S WITH ACUTE
+ (#xA9 #xB9) ;; S WITH CARON
+ (#xAA #xBA) ;; S WITH CEDILLA
+ (#xAB #xBB) ;; T WITH CARON
+ (#xAC #xBC) ;; Z WITH ACUTE
+ (#xAE #xBE) ;; Z WITH CARON
+ (#xAF #xBF) ;; Z WITH DOT ABOVE
+ (#xC0 #xE0) ;; R WITH ACUTE
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC3 #xE3) ;; A WITH BREVE
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; L WITH ACUTE
+ (#xC6 #xE6) ;; C WITH ACUTE
+ (#xC7 #xE7) ;; C WITH CEDILLA
+ (#xC8 #xE8) ;; C WITH CARON
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCA #xEA) ;; E WITH OGONEK
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCC #xEC) ;; E WITH CARON
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xCF #xEF) ;; D WITH CARON
+ (#xD0 #xF0) ;; D WITH STROKE
+ (#xD1 #xF1) ;; N WITH ACUTE
+ (#xD2 #xF2) ;; N WITH CARON
+ (#xD3 #xF3) ;; O WITH ACUTE
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; O WITH DOUBLE ACUTE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD8 #xF8) ;; R WITH CARON
+ (#xD9 #xF9) ;; U WITH RING ABOVE
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH DOUBLE ACUTE
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDD #xFD) ;; Y WITH ACUTE
+ (#xDE #xFE))) ;; T WITH CEDILLA
+
+(make-coding-system
+ 'iso-8859-2 'iso2022 "ISO-8859-2 (Latin-2)"
+ '(charset-g0 ascii
+ charset-g1 latin-iso8859-2
+ charset-g2 t
+ charset-g3 t
+ mnemonic "MIME/Ltn-2"))
+
+
+;;
+;; Latin-3 (ISO-8859-3). Esperanto, Maltese and Turkish. Obsolescent.
-;; Latin 3.
+;; Initialise the non-word syntax codes in ISO-8859-3, copying them from
+;; ISO-8859-1.
+(loop
+ for (latin-3 latin-1)
+ in '((#xA0 #xA0) ;; NO BREAK SPACE
+ (#xA2 #xB4) ;; BREVE, ACUTE ACCENT
+ (#xA3 #xA3) ;; POUND SIGN
+ (#xA4 #xA4) ;; CURRENCY SIGN
+ (#xA7 #xA7) ;; SECTION SIGN
+ (#xA8 #xA8) ;; DIAERESIS
+ (#xAD #xAD) ;; SOFT HYPHEN
+ (#xB0 #xB0) ;; DEGREE SIGN
+ (#xB2 #xB2) ;; SUPERSCRIPT TWO
+ (#xB3 #xB3) ;; SUPERSCRIPT THREE
+ (#xB4 #xB4) ;; ACUTE ACCENT
+ (#xB5 #xB5) ;; MICRO SIGN
+ (#xB7 #xB7) ;; MIDDLE DOT
+ (#xB8 #xB8) ;; CEDILLA
+ (#xBD #xBD) ;; VULGAR FRACTION ONE HALF
+ (#xD7 #xD7) ;; MULTIPLICATION SIGN
+ (#xF7 #xF7) ;; DIVISION SIGN
+ (#xFF #xB4)) ;; DOT ABOVE, ACUTE ACCENT
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'latin-iso8859-3 latin-3)
+ (string (char-syntax (make-char 'latin-iso8859-1 latin-1)))
+ syntax-table))
+;; Case.
(setup-case-pairs
'latin-iso8859-3
- '((161 177) ;latin letter h with stroke
- (166 182) ;latin letter h with circumflex
- (170 186) ;latin letter s with cedilla
- (171 187) ;latin letter g with breve
- (172 188) ;latin letter j with circumflex
- (175 191) ;latin letter z with dot above
- (192 224) ;latin letter a with grave
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter c with dot above
- (198 230) ;latin letter c with circumflex
- (199 231) ;latin letter c with cedilla
- (200 232) ;latin letter e with grave
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with circumflex
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter i with grave
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter i with diaeresis
- (209 241) ;latin letter n with tilde
- (210 242) ;latin letter o with grave
- (211 243) ;latin letter o with acute
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter g with dot above
- (214 246) ;latin letter o with diaeresis
- (216 248) ;latin letter g with circumflex
- (217 249) ;latin letter u with grave
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with circumflex
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter u with breve
- (222 254) ;latin letter s with circumflex
- ))
+ '((#xA1 #xB1) ;; H WITH STROKE
+ (#xA6 #xB6) ;; H WITH CIRCUMFLEX
+ (#xAA #xBA) ;; S WITH CEDILLA
+ (#xAB #xBB) ;; G WITH BREVE
+ (#xAC #xBC) ;; J WITH CIRCUMFLEX
+ (#xAF #xBF) ;; Z WITH DOT ABOVE
+ (#xC0 #xE0) ;; A WITH GRAVE
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; C WITH DOT ABOVE
+ (#xC6 #xE6) ;; C WITH CIRCUMFLEX
+ (#xC7 #xE7) ;; C WITH CEDILLA
+ (#xC8 #xE8) ;; E WITH GRAVE
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCA #xEA) ;; E WITH CIRCUMFLEX
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCC #xEC) ;; I WITH GRAVE
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xCF #xEF) ;; I WITH DIAERESIS
+ (#xD1 #xF1) ;; N WITH TILDE
+ (#xD2 #xF2) ;; O WITH GRAVE
+ (#xD3 #xF3) ;; O WITH ACUTE
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; G WITH DOT ABOVE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD8 #xF8) ;; G WITH CIRCUMFLEX
+ (#xD9 #xF9) ;; U WITH GRAVE
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH CIRCUMFLEX
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDD #xFD) ;; U WITH BREVE
+ (#xDE #xFE))) ;; S WITH CIRCUMFLEX
+
+(make-coding-system
+ 'iso-8859-3 'iso2022 "ISO-8859-3 (Latin-3)"
+ '(charset-g0 ascii
+ charset-g1 latin-iso8859-3
+ charset-g2 t
+ charset-g3 t
+ mnemonic "MIME/Ltn-3"))
+
+
+;; Latin-4 (ISO-8859-4)
-;; Latin 4.
+;; Estonian, Latvian, Lithuanian, Greenlandic, and Sami. Obsolescent.
+;; The default character syntax is now word. Pay attention to the
+;; exceptions in ISO-8859-4, copying them from ISO-8859-1.
+(loop
+ for (latin-4 latin-1)
+ in '((#xA0 #xA0) ;; NO BREAK SPACE
+ (#xA4 #xA4) ;; CURRENCY SIGN
+ (#xA7 #xA7) ;; SECTION SIGN
+ (#xA8 #xA8) ;; DIAERESIS
+ (#xAD #xAD) ;; SOFT HYPHEN
+ (#xB0 #xB0) ;; DEGREE SIGN
+ (#xB2 #xB4) ;; OGONEK, ACUTE ACCENT
+ (#xB4 #xB4) ;; ACUTE ACCENT
+ (#xB7 #xB4) ;; CARON, ACUTE ACCENT
+ (#xB8 #xB8) ;; CEDILLA
+ (#xD7 #xD7) ;; MULTIPLICATION SIGN
+ (#xF7 #xF7) ;; DIVISION SIGN
+ (#xFF #xB4)) ;; DOT ABOVE, ACUTE ACCENT
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'latin-iso8859-4 latin-4)
+ (string (char-syntax (make-char 'latin-iso8859-1 latin-1)))
+ syntax-table))
+
+;; Case.
(setup-case-pairs
'latin-iso8859-4
- '((161 177) ;latin letter a with ogonek
- (163 179) ;latin letter r with cedilla
- (165 181) ;latin letter i with tilde
- (166 182) ;latin letter l with cedilla
- (169 185) ;latin letter s with caron
- (170 186) ;latin letter e with macron
- (171 187) ;latin letter g with cedilla
- (172 188) ;latin letter t with stroke
- (174 190) ;latin letter z with caron
- (189 191) ;eng
- (192 224) ;latin letter a with macron
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (195 227) ;latin letter a with tilde
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter a with ring above
- (198 230) ;latin letter ae
- (199 231) ;latin letter i with ogonek
- (200 232) ;latin letter c with caron
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with ogonek
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter e with dot above
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter i with macron
- (208 240) ;latin letter d with stroke
- (209 241) ;latin letter n with cedilla
- (210 242) ;latin letter o with macron
- (211 243) ;latin letter k with cedilla
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter o with tilde
- (214 246) ;latin letter o with diaeresis
- (216 248) ;latin letter o with stroke
- (217 249) ;latin letter u with ogonek
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with circumflex
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter u with tilde
- (222 254) ;latin letter u with macron
- ))
-
-;; Latin 5. Currently unsupported.
-
-;(setup-case-pairs
-; 'latin-iso8859-5
-; '((192 224) ;latin letter a with grave
-; (193 225) ;latin letter a with acute
-; (194 226) ;latin letter a with circumflex
-; (195 227) ;latin letter a with tilde
-; (196 228) ;latin letter a with diaeresis
-; (197 229) ;latin letter a with ring above
-; (198 230) ;latin letter ae
-; (199 231) ;latin letter c with cedilla
-; (200 232) ;latin letter e with grave
-; (201 233) ;latin letter e with acute
-; (203 235) ;latin letter e with diaeresis
-; (205 237) ;latin letter i with acute
-; (206 238) ;latin letter i with circumflex
-; (208 240) ;latin letter g with breve
-; (209 241) ;latin letter n with tilde
-; (210 242) ;latin letter o with grave
-; (211 243) ;latin letter o with acute
-; (212 244) ;latin letter o with circumflex
-; (213 245) ;latin letter o with tilde
-; (214 246) ;latin letter o with diaeresis
-; (216 248) ;latin letter o with stroke
-; (217 249) ;latin letter u with grave
-; (218 250) ;latin letter u with acute
-; (219 251) ;latin letter u with circumflex
-; (220 252) ;latin letter u with diaeresis
-; (222 254) ;latin letter s with cedilla
-; ))
+ '((#xA1 #xB1) ;; A WITH OGONEK
+ (#xA3 #xB3) ;; R WITH CEDILLA
+ (#xA5 #xB5) ;; I WITH TILDE
+ (#xA6 #xB6) ;; L WITH CEDILLA
+ (#xA9 #xB9) ;; S WITH CARON
+ (#xAA #xBA) ;; E WITH MACRON
+ (#xAB #xBB) ;; G WITH CEDILLA
+ (#xAC #xBC) ;; T WITH STROKE
+ (#xAE #xBE) ;; Z WITH CARON
+ (#xBD #xBF) ;; ENG
+ (#xC0 #xE0) ;; A WITH MACRON
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC3 #xE3) ;; A WITH TILDE
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; A WITH RING ABOVE
+ (#xC6 #xE6) ;; AE
+ (#xC7 #xE7) ;; I WITH OGONEK
+ (#xC8 #xE8) ;; C WITH CARON
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCA #xEA) ;; E WITH OGONEK
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCC #xEC) ;; E WITH DOT ABOVE
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xCF #xEF) ;; I WITH MACRON
+ (#xD0 #xF0) ;; D WITH STROKE
+ (#xD1 #xF1) ;; N WITH CEDILLA
+ (#xD2 #xF2) ;; O WITH MACRON
+ (#xD3 #xF3) ;; K WITH CEDILLA
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; O WITH TILDE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD8 #xF8) ;; O WITH STROKE
+ (#xD9 #xF9) ;; U WITH OGONEK
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH CIRCUMFLEX
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDD #xFD) ;; U WITH TILDE
+ (#xDE #xFE))) ;; U WITH MACRON
+
+(make-coding-system
+ 'iso-8859-4 'iso2022 "ISO-8859-4 (Latin-4)"
+ '(charset-g0 ascii
+ charset-g1 latin-iso8859-4
+ charset-g2 t
+ charset-g3 t
+ mnemonic "MIME/Ltn-4"))
-;; Latin 9.
-(setup-case-pairs
- 'latin-iso8859-15
- '((166 168) ;latin letter s with caron *
- (180 184) ;latin letter z with caron *
- (188 189) ;latin ligature oe *
- (190 255) ;latin letter y with diaeresis *
- (192 224) ;latin letter a with grave
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (195 227) ;latin letter a with tilde
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter a with ring above
- (198 230) ;latin letter ae
- (199 231) ;latin letter c with cedilla
- (200 232) ;latin letter e with grave
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with circumflex
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter i with grave
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter i with diaeresis
- (208 240) ;latin letter eth
- (209 241) ;latin letter n with tilde
- (210 242) ;latin letter o with grave
- (211 243) ;latin letter o with acute
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter o with tilde
- (214 246) ;latin letter o with diaeresis
- (216 248) ;latin letter o with stroke
- (217 249) ;latin letter u with grave
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with circumflex
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter y with acute
- (222 254) ;latin letter thorn
- ))
+
+;; Latin-8 (ISO 8859-14) Celtic.
+
+;; Never widely used. Current-orthography Gaelic, both Irish and Scots, is
+;; easily written with Latin-1. Wikipedia says the same about Welsh.
-;; ISO 8859-14, not in FSF, our mapping.
+(make-charset 'latin-iso8859-14
+ "Right-Hand Part of Latin Alphabet 8 (ISO/IEC 8859-14)"
+ '(dimension 1
+ registries ["ISO8859-14"]
+ chars 96
+ columns 1
+ direction l2r
+ final ?_
+ graphic 1
+ short-name "RHP of Latin-8"
+ long-name "RHP of Latin-8 (ISO 8859-14)"))
+
+;;
+;; Character syntax defaults to word. The exceptions here are shared with Latin-1.
+(dolist (code '(#xa0 ;; NO BREAK SPACE
+ #xa3 ;; POUND SIGN
+ #xa7 ;; SECTION SIGN
+ #xa9 ;; COPYRIGHT
+ #xad ;; SOFT HYPHEN
+ #xae ;; REGISTERED
+ #xb6)) ;; PILCROW SIGN
+ (modify-syntax-entry (make-char 'latin-iso8859-14 code)
+ (string (char-syntax (make-char 'latin-iso8859-1 code)))
+ (standard-syntax-table)))
+;; Case.
(setup-case-pairs
'latin-iso8859-14
- '((161 162) ;latin letter b with dot above
- (164 165) ;latin letter c with dot above
- (166 171) ;latin letter d with dot above
- (168 184) ;latin letter w with grave
- (170 186) ;latin letter w with acute
- (172 188) ;latin letter y with grave
- (175 255) ;latin letter y with diaeresis
- (176 177) ;latin letter f with dot above
- (178 179) ;latin letter g with dot above
- (180 181) ;latin letter m with dot above
- (183 185) ;latin letter p with dot above
- (187 191) ;latin letter s with dot above
- (189 190) ;latin letter w with diaeresis
- (192 224) ;latin letter a with grave
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (195 227) ;latin letter a with tilde
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter a with ring above
- (198 230) ;latin letter ae
- (199 231) ;latin letter c with cedilla
- (200 232) ;latin letter e with grave
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with circumflex
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter i with grave
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter i with diaeresis
- (208 240) ;latin letter w with circumflex
- (209 241) ;latin letter n with tilde
- (210 242) ;latin letter o with grave
- (211 243) ;latin letter o with acute
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter o with tilde
- (214 246) ;latin letter o with diaeresis
- (215 247) ;latin letter t with dot above
- (216 248) ;latin letter o with stroke
- (217 249) ;latin letter u with grave
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with circumflex
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter y with acute
- (222 254) ;latin letter y with circumflex
- ))
+ '((#xA1 #xA2) ;; B WITH DOT ABOVE
+ (#xA4 #xA5) ;; C WITH DOT ABOVE
+ (#xA6 #xAB) ;; D WITH DOT ABOVE
+ (#xA8 #xB8) ;; W WITH GRAVE
+ (#xAA #xBA) ;; W WITH ACUTE
+ (#xAC #xBC) ;; Y WITH GRAVE
+ (#xAF #xFF) ;; Y WITH DIAERESIS
+ (#xB0 #xB1) ;; F WITH DOT ABOVE
+ (#xB2 #xB3) ;; G WITH DOT ABOVE
+ (#xB4 #xB5) ;; M WITH DOT ABOVE
+ (#xB7 #xB9) ;; P WITH DOT ABOVE
+ (#xBB #xBF) ;; S WITH DOT ABOVE
+ (#xBD #xBE) ;; W WITH DIAERESIS
+ (#xC0 #xE0) ;; A WITH GRAVE
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC3 #xE3) ;; A WITH TILDE
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; A WITH RING ABOVE
+ (#xC6 #xE6) ;; AE
+ (#xC7 #xE7) ;; C WITH CEDILLA
+ (#xC8 #xE8) ;; E WITH GRAVE
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCA #xEA) ;; E WITH CIRCUMFLEX
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCC #xEC) ;; I WITH GRAVE
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xCF #xEF) ;; I WITH DIAERESIS
+ (#xD0 #xF0) ;; W WITH CIRCUMFLEX
+ (#xD1 #xF1) ;; N WITH TILDE
+ (#xD2 #xF2) ;; O WITH GRAVE
+ (#xD3 #xF3) ;; O WITH ACUTE
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; O WITH TILDE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD7 #xF7) ;; T WITH DOT ABOVE
+ (#xD8 #xF8) ;; O WITH STROKE
+ (#xD9 #xF9) ;; U WITH GRAVE
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH CIRCUMFLEX
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDD #xFD) ;; Y WITH ACUTE
+ (#xDE #xFE))) ;; Y WITH CIRCUMFLEX
+
+
+;; The syntax table code for ISO 8859-15 and ISO 8859-16 requires that the
+;; guillemets not have parenthesis syntax, which they used to have in the
+;; past. See syntax.c:complex_vars_of_syntax.
+(assert (not (memq (char-syntax (make-char 'latin-iso8859-1 #xAB)) '(?\( ?\))))
+ t "This code assumes \xAB does not have parenthesis syntax. ")
+
+(assert (not (memq (char-syntax (make-char 'latin-iso8859-1 #xBB)) '(?\( ?\))))
+ t "This code assumes \xBB does not have parenthesis syntax. ")
+
+
+;; Latin-9 (ISO-8859-15)
+;;
+;; Latin-1 plus Euro, plus a few accented characters for the sake of correct
+;; Finnish and French orthography. Only ever widely used on Unix.
+
+;;
+;; Based on Latin-1 and differences therefrom.
+;;
+;; First, initialise the syntax from the corresponding Latin-1 characters.
+(loop
+ for c from #xa0 to #xff
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'latin-iso8859-15 c)
+ (string (char-syntax (make-char 'latin-iso8859-1 c)))
+ syntax-table))
+
+;; Now, the exceptions. The Euro sign retains the syntax of CURRENCY SIGN.
+(loop
+ for c in '(?,b&(B ?,b((B ?,b4(B ?,b8(B ?,b<(B ?,b=(B ?,b>(B)
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry c "w" syntax-table))
-;; ISO 8859-16, not in FSF, our mapping.
+;; Case.
(setup-case-pairs
+ 'latin-iso8859-15
+ '((#xA6 #xA8) ;; S WITH CARON *
+ (#xB4 #xB8) ;; Z WITH CARON *
+ (#xBC #xBD) ;; LATIN LIGATURE OE *
+ (#xBE #xFF) ;; Y WITH DIAERESIS *
+ (#xC0 #xE0) ;; A WITH GRAVE
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC3 #xE3) ;; A WITH TILDE
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; A WITH RING ABOVE
+ (#xC6 #xE6) ;; AE
+ (#xC7 #xE7) ;; C WITH CEDILLA
+ (#xC8 #xE8) ;; E WITH GRAVE
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCA #xEA) ;; E WITH CIRCUMFLEX
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCC #xEC) ;; I WITH GRAVE
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xCF #xEF) ;; I WITH DIAERESIS
+ (#xD0 #xF0) ;; ETH
+ (#xD1 #xF1) ;; N WITH TILDE
+ (#xD2 #xF2) ;; O WITH GRAVE
+ (#xD3 #xF3) ;; O WITH ACUTE
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; O WITH TILDE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD8 #xF8) ;; O WITH STROKE
+ (#xD9 #xF9) ;; U WITH GRAVE
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH CIRCUMFLEX
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDD #xFD) ;; Y WITH ACUTE
+ (#xDE #xFE))) ;; THORN
+
+(make-coding-system
+ 'iso-8859-15 'iso2022
+ "ISO 4873 conforming 8-bit code (ASCII + Latin 9; aka Latin-1 with Euro)"
+ `(mnemonic "MIME/Ltn-9" ; bletch
+ eol-type nil
+ charset-g0 ascii
+ charset-g1 latin-iso8859-15
+ charset-g2 t
+ charset-g3 t))
+
+;; end of ISO 8859-15.
+
+;;
+;; Latin-10 (ISO 8859-16).
+;;
+;; "South-Eastern European." Not, to my knowledge, ever widely used.
+
+(make-charset 'latin-iso8859-16
+ "Right-Hand Part of Latin Alphabet 10 (ISO/IEC 8859-16)"
+ '(dimension 1
+ registries ["ISO8859-16"]
+ chars 96
+ columns 1
+ direction l2r
+ final ?f ; octet 06/06; cf ISO-IR 226
+ graphic 1
+ short-name "RHP of Latin-10"
+ long-name "RHP of Latin-10 (ISO 8859-16)"))
+
+;; Copy over the non-word syntax this charset has in common with Latin 1.
+(dolist (code '(#xa0 ;; NO BREAK SPACE
+ #xa7 ;; SECTION SIGN
+ #xa9 ;; COPYRIGHT
+ #xab ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+ #xad ;; SOFT HYPHEN
+ #xb0 ;; DEGREE
+ #xb1 ;; PLUS-MINUS SIGN
+ #xb6 ;; PILCROW SIGN
+ #xb7 ;; MIDDLE DOT
+ #xbb)) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+ (modify-syntax-entry (make-char 'latin-iso8859-16 code)
+ (string (char-syntax (make-char 'latin-iso8859-1 code)))
+ (standard-syntax-table)))
+
+;; EURO SIGN. Take its syntax from the pound sign.
+(modify-syntax-entry (make-char 'latin-iso8859-16 #xa4)
+ (string (char-syntax (make-char 'latin-iso8859-1 #xa3)))
+ (standard-syntax-table))
+
+;; Take DOUBLE LOW-9 QUOTATION MARK's syntax from that of LEFT-POINTING
+;; DOUBLE ANGLE QUOTATION MARK.
+(modify-syntax-entry (make-char 'latin-iso8859-16 #xa5)
+ (string (char-syntax (make-char 'latin-iso8859-1 #xab)))
+ (standard-syntax-table))
+
+;; Take RIGHT DOUBLE QUOTATION MARK's syntax from that of RIGHT-POINTING
+;; DOUBLE ANGLE QUOTATION MARK.
+(modify-syntax-entry (make-char 'latin-iso8859-16 #xb5)
+ (string (char-syntax (make-char 'latin-iso8859-1 #xbb)))
+ (standard-syntax-table))
+
+;; Case.
+(setup-case-pairs
'latin-iso8859-16
- '((161 162) ;latin letter a with ogonek
- (163 179) ;latin letter l with stroke
- (166 168) ;latin letter s with caron
- (170 186) ;latin letter s with comma below
- (172 174) ;latin letter z with acute
- (175 191) ;latin letter z with dot above
- (178 185) ;latin letter c with caron
- (180 184) ;latin letter z with caron
- (190 255) ;latin letter y with diaeresis
- (192 224) ;latin letter a with grave
- (193 225) ;latin letter a with acute
- (194 226) ;latin letter a with circumflex
- (195 227) ;latin letter a with breve
- (196 228) ;latin letter a with diaeresis
- (197 229) ;latin letter c with acute
- (198 230) ;latin letter ae
- (199 231) ;latin letter c with cedilla
- (200 232) ;latin letter e with grave
- (201 233) ;latin letter e with acute
- (202 234) ;latin letter e with circumflex
- (203 235) ;latin letter e with diaeresis
- (204 236) ;latin letter i with grave
- (205 237) ;latin letter i with acute
- (206 238) ;latin letter i with circumflex
- (207 239) ;latin letter i with diaeresis
- (208 240) ;latin letter d with stroke
- (209 241) ;latin letter n with acute
- (210 242) ;latin letter o with grave
- (211 243) ;latin letter o with acute
- (212 244) ;latin letter o with circumflex
- (213 245) ;latin letter o with double acute
- (214 246) ;latin letter o with diaeresis
- (215 247) ;latin letter s with acute
- (216 248) ;latin letter u with double acute
- (217 249) ;latin letter u with grave
- (218 250) ;latin letter u with acute
- (219 251) ;latin letter u with circumflex
- (220 252) ;latin letter u with diaeresis
- (221 253) ;latin letter e with ogonek
- (222 254) ;latin letter t with comma below
- ))
+ '((#xA1 #xA2) ;; A WITH OGONEK
+ (#xA3 #xB3) ;; L WITH STROKE
+ (#xA6 #xA8) ;; S WITH CARON
+ (#xAA #xBA) ;; S WITH COMMA BELOW
+ (#xAC #xAE) ;; Z WITH ACUTE
+ (#xAF #xBF) ;; Z WITH DOT ABOVE
+ (#xB2 #xB9) ;; C WITH CARON
+ (#xB4 #xB8) ;; Z WITH CARON
+ (#xBE #xFF) ;; Y WITH DIAERESIS
+ (#xC0 #xE0) ;; A WITH GRAVE
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC3 #xE3) ;; A WITH BREVE
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; C WITH ACUTE
+ (#xC6 #xE6) ;; AE
+ (#xC7 #xE7) ;; C WITH CEDILLA
+ (#xC8 #xE8) ;; E WITH GRAVE
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCA #xEA) ;; E WITH CIRCUMFLEX
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCC #xEC) ;; I WITH GRAVE
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xCF #xEF) ;; I WITH DIAERESIS
+ (#xD0 #xF0) ;; D WITH STROKE
+ (#xD1 #xF1) ;; N WITH ACUTE
+ (#xD2 #xF2) ;; O WITH GRAVE
+ (#xD3 #xF3) ;; O WITH ACUTE
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; O WITH DOUBLE ACUTE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD7 #xF7) ;; S WITH ACUTE
+ (#xD8 #xF8) ;; U WITH DOUBLE ACUTE
+ (#xD9 #xF9) ;; U WITH GRAVE
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH CIRCUMFLEX
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDD #xFD) ;; E WITH OGONEK
+ (#xDE #xFE))) ;; T WITH COMMA BELOW
+
+;; Add a coding system for ISO 8859-16.
+(make-coding-system
+ 'iso-8859-16 'iso2022 "MIME ISO-8859-16"
+ '(charset-g0 ascii
+ charset-g1 latin-iso8859-16
+ charset-g2 t ; grrr
+ charset-g3 t ; grrr
+ mnemonic "MIME/Ltn-10"))
+;; end of ISO 8859-16.
-;; This is our utility function; we don't want it in the dumped XEmacs.
+(provide 'romanian)
+
+;; Czech support originally from czech.el
+;; Author: Milan Zamazal <pdm(a)zamazal.org>
+;; Maintainer (FSF): Pavel Janík <Pavel(a)Janik.cz>
+;; Maintainer (for XEmacs): David Sauer <davids(a)penguin.cz>
+
+(provide 'czech)
+
+;; Slovak support originally from slovak.el
+;; Authors: Tibor Šimko <tibor.simko(a)fmph.uniba.sk>,
+;; Milan Zamazal <pdm(a)fi.muni.cz>
+;; Maintainer: Milan Zamazal <pdm(a)fi.muni.cz>
+
+(provide 'slovenian)
+
+;; Latin-5 (ISO-8859-9)
+
+;; Turkish (more generally Turkic.) This is identical to Latin-1, with the
+;; exception that the Icelandic-specific letters have been replaced by
+;; Turkish-specific letters. As such, we can simply copy the Latin-1 syntax
+;; table.
+
+(loop
+ for i from #xA0 to #xFF
+ with syntax-table = (standard-syntax-table)
+ do (modify-syntax-entry
+ (make-char 'latin-iso8859-9 i)
+ (string (char-syntax (make-char 'latin-iso8859-1 i)))
+ syntax-table))
+
+;; Case. #### Bug: this doesn't handle I WITH DOT ABOVE.
+(setup-case-pairs
+ 'latin-iso8859-9
+ '((#xC0 #xE0) ;; A WITH GRAVE
+ (#xC1 #xE1) ;; A WITH ACUTE
+ (#xC2 #xE2) ;; A WITH CIRCUMFLEX
+ (#xC3 #xE3) ;; A WITH TILDE
+ (#xC4 #xE4) ;; A WITH DIAERESIS
+ (#xC5 #xE5) ;; A WITH RING ABOVE
+ (#xC6 #xE6) ;; AE
+ (#xC7 #xE7) ;; C WITH CEDILLA
+ (#xC8 #xE8) ;; E WITH GRAVE
+ (#xC9 #xE9) ;; E WITH ACUTE
+ (#xCB #xEB) ;; E WITH DIAERESIS
+ (#xCD #xED) ;; I WITH ACUTE
+ (#xCE #xEE) ;; I WITH CIRCUMFLEX
+ (#xD0 #xF0) ;; G WITH BREVE
+ (#xD1 #xF1) ;; N WITH TILDE
+ (#xD2 #xF2) ;; O WITH GRAVE
+ (#xD3 #xF3) ;; O WITH ACUTE
+ (#xD4 #xF4) ;; O WITH CIRCUMFLEX
+ (#xD5 #xF5) ;; O WITH TILDE
+ (#xD6 #xF6) ;; O WITH DIAERESIS
+ (#xD8 #xF8) ;; O WITH STROKE
+ (#xD9 #xF9) ;; U WITH GRAVE
+ (#xDA #xFA) ;; U WITH ACUTE
+ (#xDB #xFB) ;; U WITH CIRCUMFLEX
+ (#xDC #xFC) ;; U WITH DIAERESIS
+ (#xDE #xFE))) ;; S WITH CEDILLA
+
+(make-coding-system
+ 'iso-8859-9 'iso2022 "ISO-8859-9 (Latin-5)"
+ '(charset-g0 ascii
+ charset-g1 latin-iso8859-9
+ charset-g2 t
+ charset-g3 t
+ mnemonic "MIME/Ltn-5"))
+
+;; end of ISO-8859-9
+
+;; This is a utility function; we don't want it in the dumped XEmacs.
+
(fmakunbound 'setup-case-pairs)
+
+
+;; Language environments.
+(loop
+ for ((charset codesys default-input nice-charset-1 nice-charset-2
+ ;; supported-langs is a list if the doc string is replaced
+ ;; entirely
+ supported-langs)
+ langenvs) in
+ '(((latin-iso8859-1 iso-8859-1 "latin-1-prefix" "Latin-1" "ISO-8859-1"
+" Danish, Dutch, English, Faeroese, Finnish, French, German, Icelandic,
+ Irish, Italian, Norwegian, Portuguese, Spanish, and Swedish.")
+ (("Danish" "da")
+ ("Dutch" "nl" "TUTORIAL.nl")
+ ("Faeroese")
+ ("Finnish" "fi")
+ ("French" "fr" "TUTORIAL.fr" "Bonjour, ça va?")
+ ("German" "de" "TUTORIAL.de" "\
+German (Deutsch Nord) Guten Tag
+German (Deutsch Süd) Grüß Gott"
+ "german-postfix")
+ ("Icelandic" "is")
+ ("Irish" "ga")
+ ("Italian" "it")
+ ("Norwegian" "no" "TUTORIAL.no")
+ ("Portuguese" "pt" nil "Bem-vindo! Tudo bem?")
+ ("Spanish" "es" "TUTORIAL.es" "¡Hola!")
+ ("Swedish" "sv" "TUTORIAL.se" "Hej!")))
+ ((latin-iso8859-15 iso-8859-15 "latin-1-prefix" ;; #### FIXME
+ "Latin-9" "ISO-8859-15")
+ ())
+ ((latin-iso8859-2 iso-8859-2 "latin-2-prefix" "Latin-2" "ISO-8859-2"
+" Albanian, Czech, English, German, Hungarian, Polish, Romanian,
+ Serbian, Croatian, Slovak, Slovene, Sorbian (upper and lower),
+ and Swedish.") ;; " added because fontification got screwed up, CVS-20061203.
+ (("Albanian" nil)
+ ("Croatian" ("hrvatski" "hr") "TUTORIAL.hr")
+ ("Czech" ("cs" "cz") "TUTORIAL.cs" "Přejeme vám hezký den!"
+ "latin-2-postfix")
+ ("Hungarian" ("hungarian" "hu"))
+ ("Polish" "pl" "TUTORIAL.pl")
+ ("Romanian" "ro" "TUTORIAL.ro" "Bună ziua, bine aţi venit!"
+ "latin-2-postfix")
+ ("Serbian" "sr")
+ ("Slovak" "sk" "TUTORIAL.sk" "Prajeme Vám príjemný deň!"
+ "latin-2-postfix")
+ ("Slovenian" "sl" "TUTORIAL.sl" "Želimo vam uspešen dan!"
+ "latin-2-postfix")
+ ("Sorbian" nil)))
+ ((latin-iso8859-3 iso-8859-3 "latin-3-prefix" "Latin-3" "ISO-8859-3"
+" Afrikaans, Catalan, Dutch, English, Esperanto, French, Galician,
+ German, Italian, Maltese, Spanish, and Turkish.")
+ (("Afrikaans" "af")
+ ("Catalan" ("catalan" "ca"))
+ ("Esperanto")
+ ("Galician")
+ ("Maltese")))
+ ((latin-iso8859-4 iso-8859-4 "latin-4-prefix" "Latin-4" "ISO-8859-4"
+" Danish, English, Estonian, Finnish, German, Greenlandic, Lappish,
+ Latvian, Lithuanian, and Norwegian.")
+ (("Estonian" "et")
+ ("Greenlandic")
+ ("Lappish")
+ ("Latvian" "lv")
+ ("Lithuanian" "lt")))
+ ((latin-iso8859-9 iso-8859-9 "latin-5-prefix" "Latin-5" "ISO-8859-9")
+ (("Turkish" "tr"))))
+ do
+ (set-language-info-alist
+ nice-charset-1
+ `((charset ascii ,charset)
+ (coding-system ,codesys)
+ (coding-priority ,codesys)
+ (native-coding-system ,codesys)
+ (documentation . ,(if (listp supported-langs) (car supported-langs)
+ (format "\
+Generic language environment for %s (%s)." nice-charset-1 nice-charset-2))))
+ '("European"))
+ (loop for (name locale tutorial sample-text input-method) in langenvs
+ do
+ (set-language-info-alist
+ name
+ `((charset ascii ,charset)
+ (coding-system ,codesys)
+ (coding-priority ,codesys)
+ (native-coding-system ,codesys)
+ ,@(if locale `((locale . ,locale)))
+ ,@(if tutorial `((tutorial . ,tutorial)))
+ ,@(if sample-text `((sample-text . ,sample-text)))
+ (input-method . ,(or input-method default-input))
+ (documentation . ,(format "\
+This language environment supports %s. " name)))
+ '("European"))))
+
+;;; latin.el ends here
Index: lisp/mule/misc-lang.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/mule/misc-lang.el,v
retrieving revision 1.6
diff -u -u -r1.6 misc-lang.el
--- lisp/mule/misc-lang.el 2006/11/05 22:31:38 1.6
+++ lisp/mule/misc-lang.el 2006/12/21 22:56:55
@@ -41,16 +41,6 @@
final ?0
graphic 1
short-name "IPA"
- long-name "IPA"
- ))
-
-(set-language-info-alist
- "IPA" '((charset . (ipa))
- (coding-priority iso-2022-7bit)
- (coding-system iso-2022-7bit)
- (input-method . "ipa")
- (documentation . "\
-IPA is International Phonetic Alphabet for English, French, German
-and Italian.")))
+ long-name "IPA"))
;;; misc-lang.el ends here
Index: lisp/mule/mule-cmds.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/mule/mule-cmds.el,v
retrieving revision 1.29
diff -u -u -r1.29 mule-cmds.el
--- lisp/mule/mule-cmds.el 2006/12/11 12:40:02 1.29
+++ lisp/mule/mule-cmds.el 2006/12/21 22:56:56
@@ -1068,6 +1068,9 @@
`native-coding-system' and `coding-system'. The name of the new language
environment is the name of the old language environment, followed by
CODING-SYSTEM in parentheses. Returns the name of the new language
+environment.
+
+This function also modifies the `coding-priority' of a language
environment. "
(check-coding-system coding-system)
(if (symbolp langenv) (setq langenv (symbol-name langenv)))
@@ -1083,10 +1086,15 @@
(format "%s (%s)" (car langenv)
(upcase (symbol-name (coding-system-name coding-system)))))
(destructive-plist-to-alist
- (plist-put (plist-put (alist-to-plist (cdr langenv)) 'native-coding-system
- coding-system) 'coding-system
- (cons coding-system
- (cdr (assoc 'coding-system (cdr langenv))))))))
+ (plist-put
+ (plist-put
+ (plist-put (alist-to-plist (cdr langenv))
+ 'native-coding-system
+ coding-system)
+ 'coding-system (cons coding-system
+ (cdr (assoc 'coding-system (cdr langenv)))))
+ 'coding-priority (cons (coding-system-category coding-system)
+ (cdr (assq 'coding-priority (cdr langenv))))))))
(defun get-language-environment-from-locale (locale)
"Convert LOCALE into a language environment.
@@ -1099,7 +1107,7 @@
(desired-coding-system
(and charset (gethash (replace-in-string charset "[^a-z0-9]" "")
posix-charset-to-coding-system-hash)))
- lang locs)
+ lang locs given-coding-system)
(dolist (langcons language-info-alist)
(setq lang (car langcons)
locs (get-language-info lang 'locale))
@@ -1114,10 +1122,14 @@
locale))
(if (or (null desired-coding-system)
(and desired-coding-system
- (eq desired-coding-system
- (get-language-info
- lang
- 'native-coding-system))))
+ (or (eq desired-coding-system
+ (setq given-coding-system
+ (get-language-info
+ lang
+ 'native-coding-system)))
+ (and (listp given-coding-system)
+ (memq desired-coding-system
+ given-coding-system)))))
(return-from langenv lang)
(return-from langenv
(create-variant-language-environment
Index: src/file-coding.c
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/src/file-coding.c,v
retrieving revision 1.55
diff -u -u -r1.55 file-coding.c
--- src/file-coding.c 2006/05/25 08:04:57 1.55
+++ src/file-coding.c 2006/12/21 22:56:58
@@ -1310,7 +1310,7 @@
The following additional properties are recognized if TYPE is `unicode':
-`type'
+`unicode-type'
One of `utf-16', `utf-8', `ucs-4', or `utf-7' (the latter is not
yet implemented). `utf-16' is the basic two-byte encoding;
`ucs-4' is the four-byte encoding; `utf-8' is an ASCII-compatible
Index: src/general-slots.h
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/src/general-slots.h,v
retrieving revision 1.18
diff -u -u -r1.18 general-slots.h
--- src/general-slots.h 2006/11/05 22:31:44 1.18
+++ src/general-slots.h 2006/12/21 22:56:58
@@ -289,6 +289,7 @@
SYMBOL (Qundefined);
SYMBOL (Qunimplemented);
SYMBOL (Qunicode_registries);
+SYMBOL (Qunicode_type);
SYMBOL (Quser_default);
SYMBOL_KEYWORD (Q_value);
SYMBOL (Qvalue_assoc);
Index: src/intl-win32.c
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/src/intl-win32.c,v
retrieving revision 1.17
diff -u -u -r1.17 intl-win32.c
--- src/intl-win32.c 2006/11/01 20:25:50 1.17
+++ src/intl-win32.c 2006/12/21 22:56:59
@@ -783,7 +783,7 @@
If SUBLANG is omitted, "SUBLANG_DEFAULT" is used.
Recognized language names are
-(some may not be recognized if the compiler is older than VC++ 6.0)
+\(some may not be recognized if the compiler is older than VC++ 6.0)
"AFRIKAANS"
"ALBANIAN"
@@ -858,7 +858,7 @@
"VIETNAMESE"
Recognized sub-language names are
-(some may not be recognized if the compiler is older than VC++ 6.0)
+\(some may not be recognized if the compiler is older than VC++ 6.0)
"ARABIC_ALGERIA"
"ARABIC_BAHRAIN"
@@ -2358,7 +2358,7 @@
"This encoding is equivalent to standard UTF16, little-endian."
),
Qmnemonic, build_string ("MSW-U")),
- list4 (Qtype, Qutf_16,
+ list4 (Qunicode_type, Qutf_16,
Qlittle_endian, Qt)));
#ifdef MULE
Index: src/unicode.c
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/src/unicode.c,v
retrieving revision 1.35
diff -u -u -r1.35 unicode.c
--- src/unicode.c 2006/11/05 22:31:46 1.35
+++ src/unicode.c 2006/12/21 22:57:00
@@ -2438,7 +2438,7 @@
static int
unicode_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value)
{
- if (EQ (key, Qtype))
+ if (EQ (key, Qunicode_type))
{
enum unicode_type type;
@@ -2467,7 +2467,7 @@
static Lisp_Object
unicode_getprop (Lisp_Object coding_system, Lisp_Object prop)
{
- if (EQ (prop, Qtype))
+ if (EQ (prop, Qunicode_type))
{
switch (XCODING_SYSTEM_UNICODE_TYPE (coding_system))
{
@@ -2489,7 +2489,8 @@
unicode_print (Lisp_Object cs, Lisp_Object printcharfun,
int UNUSED (escapeflag))
{
- write_fmt_string_lisp (printcharfun, "(%s", 1, unicode_getprop (cs, Qtype));
+ write_fmt_string_lisp (printcharfun, "(%s", 1,
+ unicode_getprop (cs, Qunicode_type));
if (XCODING_SYSTEM_UNICODE_LITTLE_ENDIAN (cs))
write_c_string (printcharfun, ", little-endian");
if (XCODING_SYSTEM_UNICODE_NEED_BOM (cs))
--
When I was in the scouts, the leader told me to pitch a tent. I couldn't
find any pitch, so I used creosote.