Hello Ben,
  I thought you might find the information below helpful.
Ben Wing <ben(a)666.com> writes:
 [3] run the trainer -- once -- on all the data.  it may take awhile,
 but this only happens once, and afterwards, we should have a damn good
 encoding autodetection process.  
,----
| Internet Explorer actually does something quite interesting: it tries to
| guess, based on the frequency in which various bytes appear in typical text in
| typical encodings of various languages, what language and encoding was
| used. Because the various old 8-bit code pages tended to put their national
| letters in different ranges between 128 and 255, and because every human
| language has a different characteristic histogram of letter usage, this
| actually has a chance of working. 
| 
`----
Quoted from 
http://www.joelonsoftware.com/articles/Unicode.html
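
To make the idea concrete, here is a minimal sketch of histogram-based
guessing in Python. It is only an illustration of the technique described in
the quote, not Internet Explorer's actual algorithm and not a proposal for the
trainer's real design. The names train, guess and SAMPLE_RU are hypothetical,
and the one-sentence training sample is far too small for real use; a real
trainer would run over a large corpus per language and encoding.

```python
import math
from collections import Counter

# Hypothetical toy training text (a Russian pangram); assumption for the demo.
SAMPLE_RU = "съешь ещё этих мягких французских булок, да выпей чаю"

def train(samples):
    """Build one model per encoding from {encoding_name: training_text}.

    Each model is a list of 128 log-probabilities, one for every byte
    value in the range 128..255, with add-one smoothing so that bytes
    never seen in training still get a small non-zero probability.
    """
    models = {}
    for enc, text in samples.items():
        counts = Counter(b for b in text.encode(enc) if b >= 128)
        total = sum(counts.values()) + 128
        models[enc] = [math.log((counts.get(b, 0) + 1) / total)
                       for b in range(128, 256)]
    return models

def guess(data, models):
    """Return the encoding whose high-byte histogram best explains data.

    Returns None when data contains no bytes >= 128, since pure ASCII
    carries no evidence for one code page over another.
    """
    high = [b for b in data if b >= 128]
    if not high:
        return None
    return max(models,
               key=lambda enc: sum(models[enc][b - 128] for b in high))

if __name__ == "__main__":
    models = train({"koi8_r": SAMPLE_RU, "cp1251": SAMPLE_RU})
    print(guess("привет, как дела".encode("cp1251"), models))
```

Training happens once and only the small per-encoding tables need to be kept
afterwards, which matches the "run the trainer once" step above. The sketch
scores single bytes only; multi-byte encodings would need byte-pair statistics
or a separate validity check.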
Regards.
-- 
Surendra Singhi
http://www.public.asu.edu/~sksinghi/index.html
"All animals are equal, but some animals are more equal than others."
    - Orwell, Animal Farm, 1945