>>>> "SJT" == Stephen J Turnbull <stephen(a)xemacs.org> writes:
 SJT> Well, as I suppose you know, what Ben has in mind is a statistical
 SJT> detector that (eg) can distinguish EUC-JP from EUC-TW or EUC-KR
 SJT> (although the really important case is the ISO-8859-X mess and the
 SJT> various non-conforming sets like KOI8 and the Windows 12xx sets).
I can say that there is an extremely good scheme for statistical detection
of various Russian (really Russian, not Cyrillic) encodings, done by
S. V. Znamensky.  I tried it, and it works really well, even handling
"twice-encoded" text, which turns up occasionally.
I thought of adding something like this to XEmacs.  Now if there is a
common infrastructure for this, I'd be glad to help in that area.
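To make "statistical detection" concrete, here is a rough Python sketch of
the simplest variant I can think of: decode the bytes under each candidate
encoding and see which result looks most like Russian.  This is NOT
Znamensky's scheme, and it does not handle the "twice-encoded" case.  The
candidate list, the set of "common" letters, the penalty weight and the
names detect/score are all my own assumptions, for illustration only.

```python
# Rough sketch of frequency-based detection of Russian encodings.
# Assumptions (not from any existing detector): the candidate list,
# the COMMON letter set, and the penalty weight of 2.

CANDIDATES = ["koi8_r", "cp1251", "cp866", "iso8859_5"]

# Lowercase letters that are frequent in ordinary Russian text.
COMMON = set("оеаинтсрвлкмдпу")

# Every character we accept as a Russian letter, either case.
LOWER = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RUSSIAN = set(LOWER + LOWER.upper())


def score(text):
    """Reward common lowercase letters, punish non-Russian high characters."""
    good = sum(1 for ch in text if ch in COMMON)
    bad = sum(1 for ch in text if ord(ch) > 127 and ch not in RUSSIAN)
    return good - 2 * bad


def detect(data):
    """Return the candidate encoding whose decoding scores highest."""
    # errors="replace" keeps going on bytes the codec leaves undefined;
    # the replacement character then counts against that candidate.
    return max(CANDIDATES,
               key=lambda enc: score(data.decode(enc, errors="replace")))


if __name__ == "__main__":
    sample = "Привет, это проверка определения кодировки русского текста"
    for enc in CANDIDATES:
        print(enc, "->", detect(sample.encode(enc)))
```

It relies on the fact that a wrong decoding of ordinary lowercase text
mostly produces capitals, box-drawing characters or rare letters.  On very
short or mostly-ASCII input the scores are too close to trust, which is
where real per-language tuning would come in.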
 SJT> But the design is completely new, so we need to retune it.  Also
 SJT> there seem to be some bugs in coding priorities.
Uhm,
I'm now playing with current XEmacs-beta.  It recognizes my
~/.xemacs/init.el as UTF-16, and does not let me change the encoding
with "C-x RET f koi8-r RET" (but "C-x RET c koi8-r C-x C-f" works).
The file itself is mostly ASCII, with two strings in Russian inside (near
the end of the file).
Are you interested in such bug reports, and if so, should I send the file?
Other files it at least detects as "Raw".
set-language-environment Cyrillic-KOI8 does not help at all.
--alexm