>>>> "Hrvoje" == Hrvoje Niksic <hniksic(a)xemacs.org> writes:
Hrvoje> I'm a bit hazy on the concept of tracking "language". How
Hrvoje> is that supposed to work, exactly?
For you, *much* better than current Mule. :-)
Hrvoje> I mean, a word processor can do it because it has a chance
Hrvoje> to save its markup when saving the document. Emacs works,
Hrvoje> in most cases, with bare characters,
Bare character case. For almost all users in most cases documents
will be monolingual, and in the first go-round we will disambiguate
German from Croatian in the same way that we disambiguate ISO 8859-15
from ISO 8859-2 now---query the environment and if necessary the user.
The second stage of development is to use the classifier techniques
Ben has advocated to autodetect the language.
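As a rough illustration of the "query the environment" step, here is a
minimal Python sketch that guesses a language from the POSIX locale
variables. The function name is made up for this example, and nothing
here is XEmacs code; it only shows the lookup order and the fallback to
asking the user.

```python
import os


def language_from_environment(environ=None):
    """Guess an ISO 639 language code from POSIX locale variables.

    Returns None when the environment gives no usable hint, in which
    case the caller would have to query the user instead.
    """
    env = os.environ if environ is None else environ
    value = ""
    # POSIX precedence: the first of these that is set and non-empty wins.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            break
    if not value:
        return None
    # Locale names look like language[_territory][.codeset][@modifier].
    base = value.split(".")[0].split("@")[0]
    if base in ("C", "POSIX"):
        return None  # these locales carry no language information
    return base.split("_")[0].lower() or None
```

With `LANG=hr_HR.ISO-8859-2` this yields `hr`; with `LANG=C` it yields
`None`, which is the "if necessary the user" case.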
For casually multilingual documents, we'll just use a single priority
list of fonts and go down the list until we have a font that has the
character. We should also provide a mark region---invoke menu---select
language UI. For plaintext Unicode documents, we should offer to save
with Plane 14 language tags. These MUST be ignored by conforming
applications that don't handle them, but of course they'll screw up
byte counts and digest hashes, so we'll have to allow them to be
disabled. Not to mention that the odds of applications conforming to
something so esoteric aren't great.
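To make the byte-count point concrete, here is a minimal Python sketch
of Plane 14 tagging. It relies only on the Unicode layout: U+E0001 is
LANGUAGE TAG, and the tag characters U+E0020..U+E007E mirror printable
ASCII at an offset of 0xE0000. The function names are invented for this
example.

```python
LANGUAGE_TAG = 0xE0001  # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000    # tag character = ASCII code point + 0xE0000


def encode_language_tag(lang):
    """Return the Plane 14 tag sequence for a language code such as 'hr'."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("language code must be printable ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(c)) for c in lang)


def strip_tags(text):
    """Drop every character in the Plane 14 tag block (U+E0000..U+E007F)."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


plain = "dobar dan"
tagged = encode_language_tag("hr") + plain
# Each tag character takes four bytes in UTF-8, so the three-character
# tag for "hr" adds twelve bytes that a digest would also see.
size_plain = len(plain.encode("utf-8"))
size_tagged = len(tagged.encode("utf-8"))
```

An application that ignores the tags sees the same text as
`strip_tags(tagged)`, but the stored file is no longer byte-identical
to the untagged one, hence the need for an off switch.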
Serious multilingual users will have their own ways of marking
language in stored documents; we need to provide an API to make it
easy to parse those and mark up the buffer.
Hrvoje> or with charset (not language) annotations, as is the
Hrvoje> case with coding cookies or with Gnus processing MIME
Hrvoje> messages.
Charset -> language heuristics are easy and moderately reliable. If
that fails, fontconfig (for example) provides an API which queries the
font for "can you handle this repertoire of characters". Then the
worst that can happen with either heuristic is that a font the user
thinks looks pretty nice for some other language will shadow the font
that the user thinks is optimal for this language.
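A minimal Python sketch of both heuristics, under loud assumptions: the
charset table is a small illustrative sample, the font names are
hypothetical, and each font's coverage is modelled as a plain set of
code points standing in for whatever a real fontconfig query would
report. No actual fontconfig calls are made.

```python
# Illustrative sample only: charset -> plausible languages, likeliest first.
CHARSET_LANGUAGES = {
    "iso-8859-2": ["hr", "cs", "hu", "pl"],
    "iso-8859-15": ["de", "fr"],
    "euc-jp": ["ja"],
    "koi8-r": ["ru"],
}


def guess_languages(charset):
    """Return candidate languages for a charset name, or [] if unknown."""
    return CHARSET_LANGUAGES.get(charset.lower(), [])


def pick_font(char, priority_list):
    """Walk the priority list; return the first font covering char."""
    for name, coverage in priority_list:
        if ord(char) in coverage:
            return name
    return None  # no font in the list has this character


# Hypothetical fonts: a Japanese font that also covers ASCII, listed
# ahead of a Latin font covering U+0020..U+017F.
fonts = [
    ("jp-font", set(range(0x20, 0x7F)) | {0x3042}),
    ("latin-font", set(range(0x20, 0x180))),
]
```

Here `pick_font("a", fonts)` returns `jp-font` even though the user may
prefer `latin-font` for Latin text, which is exactly the shadowing case
described above; `pick_font("\u010d", fonts)` falls through to
`latin-font` because only that font covers the character.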
We can also provide a Mule and/or MIME charset->font mapping facility
if needed.
--
School of Systems and Information Engineering
http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.