On the twenty-third of July, Stephen J. Turnbull wrote:
Aidan Kehoe writes:
> > And what does
> > (encode-coding-string (make-char 'japanese-jisx0208 48 108)
> > 'koi8-r)
> > do?
> The right thing; it returns a string consisting of a tilde.
That's what our coding systems currently do, but that's the wrong
thing; it should throw an error, with the current state of the
encoding process available to condition-case.
That is the right thing, some of the time. But I do not want my TTY XEmacs
tied up with an error message (that I can’t see, because trying to display
it leads to another, and another ...) just because some character in the
selected window cannot be encoded using UTF-8, my console-coding-system. And
none of the code out there is prepared to handle these errors, and won’t be,
since GNU have taken the safe-charsets approach. We need a separate API.
> It seems to me that an API like
> (query-coding-region START END CODING-SYSTEM &optional BUFFER)
> returning, say, a list of buffer offsets and lengths, is the most
> appropriate general way to implement a UI for warning that a given coding
> system will not encode a given buffer.
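For illustration only, here is a minimal sketch of what such a query function might look like. It is written against GNU Emacs rather than XEmacs and assumes GNU Emacs's built-in unencodable-char-position. The name my-query-coding-region and the (OFFSET . LENGTH) return format are hypothetical, chosen to match the signature proposed above; this is not an existing XEmacs API.

```elisp
;; Sketch only. `my-query-coding-region' is a hypothetical name; the
;; work is done by GNU Emacs's `unencodable-char-position'.
(defun my-query-coding-region (start end coding-system &optional buffer)
  "Return a list of (OFFSET . LENGTH) runs that CODING-SYSTEM cannot encode.
START and END delimit the region in BUFFER (default: the current buffer).
Return nil if the whole region can be encoded."
  (with-current-buffer (or buffer (current-buffer))
    (let ((positions
           ;; With a COUNT argument the function returns a list of
           ;; positions; sort it so the run-merging below is safe.
           (sort (unencodable-char-position
                  start end coding-system (max 1 (- end start)))
                 #'<))
          runs)
      (dolist (pos positions)
        (if (and runs (= pos (+ (caar runs) (cdar runs))))
            ;; POS directly follows the latest run: extend that run.
            (setcdr (car runs) (1+ (cdar runs)))
          ;; Otherwise start a new run of length 1.
          (push (cons pos 1) runs)))
      (nreverse runs))))
```

A UI could call (my-query-coding-region (point-min) (point-max) 'koi8-r) before saving and highlight each returned run, without any error being signalled during encoding itself.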
Well, since this shouldn't actually be happening :-) (and in practice
is fairly unusual even for most European users, I believe),
No, it’s not: http://mid.gmane.org/f2g834$sds$1＠sea.gmane.org . I could
trawl the lists some more if you want.
I think use of a well-designed exception mechanism is to be preferred
to explicit tests (that most code will fail to do) in the long run.
That exception mechanism can’t be turned on by default, for the sake of TTYs
and for the sake of conversion in redisplay, which happens for every
character set on Win32, XFT and on OS X. We don’t require that every
character in every character set have a Unicode mapping.
> > Is there a reason why this technique should be restricted to
> > systems currently implemented in CCL, or could/should we replace
> > all ISO 8859 coding systems with this stuff?
> Well, latin-unity deals with that problem for the 8859 coding systems,
> and in a way that’s compatible with 21.4, so I don’t necessarily see
> any reason to change that.
Yeah, except latin-unity sucks for a lot of reasons you're aware of.
Including performance, not to mention UI, and charset coverage.
Well, my main objection to it is that it’s not turned on by default in the
appropriate locales. I’m happy with its UI and performance; its charset
coverage could be better, sure.
It would be appropriate to move iso-8859-7 to being this kind of coding
system, I think, since the Greeks don’t want ISO-2022 encoding either, and
they will benefit from the unification of the punctuation characters. I’m
also inclined to add MacRoman, MacGreek and MacCyrillic, and perhaps EBCDIC.
> > > These coding systems are much faster than that
> > I don't think it's worth worrying about speed of coding systems
> > until somebody complains. AFAIK nobody's complained about the *speed* of
> > mule-ucs, so I doubt they'll complain about this either.
> Spoken like a true Lisper :-) .
Please, I really don't need to deal with this kind of humor. I mentioned
coding systems and Mule-UCS, I meant coding systems and Mule-UCS. XEmacs
21.5 has lots of speed issues in redisplay and font-lock. To the best of
my knowledge, however, the only coding-related code with efficiency
problems at present is latin-unity.
Do you know anything to the contrary?
Mule-UCS was faster than the 21.5 utf-8 implementation for me in
practice--I’ve just double-checked this impression. Hrvoje had a big problem
with its memory usage, which is also an aspect of performance.
These generated coding systems have two vectors full of immediate values,
ballpark 2.5K, and a hash table full of 256 immediate values, ballpark 2K,
as their implementation data. This is more than the old CCL coding systems
had, but not anything in the same ballpark as the 10MB of the Mule-UCS
package, though it doesn’t make sense to directly compare them.
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
XEmacs-Patches mailing list