Hello Ben,
I thought you might find the information below helpful.
Ben Wing <ben(a)666.com> writes:
> [3] run the trainer -- once -- on all the data. it may take awhile,
> but this only happens once, and afterwards, we should have a damn good
> encoding autodetection process.
,----
| Internet Explorer actually does something quite interesting: it tries to
| guess, based on the frequency in which various bytes appear in typical text in
| typical encodings of various languages, what language and encoding was
| used. Because the various old 8 bit code pages tended to put their national
| letters in different ranges between 128 and 255, and because every human
| language has a different characteristic histogram of letter usage, this
| actually has a chance of working.
|
`----
Quoted from
http://www.joelonsoftware.com/articles/Unicode.html
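
In case it helps to make the idea concrete, here is a rough sketch in
Python of the "train once, then guess" scheme. It is only my own
illustration, not how Internet Explorer works and not a proposal for
the actual trainer: the function names (high_byte_profile, cosine,
train, guess) are made up, the choice of cosine similarity as the score
is my assumption, and the one-sentence training text stands in for the
large per-language corpus a real detector would need.

```python
import math
from collections import Counter

def high_byte_profile(data: bytes) -> Counter:
    """Histogram of the bytes in the range 128..255 (ASCII is ignored)."""
    return Counter(b for b in data if b >= 128)

def cosine(p: Counter, q: Counter) -> float:
    """Cosine similarity of two histograms; 0.0 if either one is empty."""
    dot = sum(n * q[b] for b, n in p.items())
    norm = (math.sqrt(sum(n * n for n in p.values()))
            * math.sqrt(sum(n * n for n in q.values())))
    return dot / norm if norm else 0.0

def train(text, encodings):
    """The 'run the trainer once' step: one histogram per candidate encoding."""
    return {enc: high_byte_profile(text.encode(enc)) for enc in encodings}

def guess(data: bytes, profiles) -> str:
    """Pick the encoding whose trained histogram looks most like the data."""
    seen = high_byte_profile(data)
    return max(profiles, key=lambda enc: cosine(seen, profiles[enc]))

# Toy training data (an assumption for this sketch): one Russian sentence.
TRAINING_TEXT = "все животные равны, но некоторые животные равнее других"
PROFILES = train(TRAINING_TEXT, ["cp1251", "koi8_r", "iso8859_5"])

# A different sentence, in an encoding the guesser is not told about.
mystery = "кодировка текста определяется по частоте букв".encode("koi8_r")
print(guess(mystery, PROFILES))  # prints koi8_r
```

The three Cyrillic code pages place the same letters at different byte
values, so the histogram of the mystery bytes lines up well with only
one of the trained profiles. With short inputs or closely related code
pages the scores would be much less clear-cut.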
Regards.
--
Surendra Singhi
http://www.public.asu.edu/~sksinghi/index.html
"All animals are equal, but some animals are more equal than others."
- Orwell, Animal Farm, 1945