[ Lacking context for this, I'm having a hard time snipping ]
[ irrelevant quoted material. Sorry about the excess quoting. ]
On Wed, 2003-01-29 at 03:25, Stephen J. Turnbull wrote:
>>>>> "SY" == Steve Youngs <youngs(a)xemacs.org> writes:
SY> This also fixes some horrible bogosity introduced because this
SY> file's coding system was set to "MSW UTF8". OMG, M$ Win is a
SY> truly evil abomination.
All that means is that the UTF-8 signature is prepended to the file.
This is _not Microsoft_, it is the international standard
(incorrectly, IMO, but better a bad standard than no standard)
recommended by both Unicode (ever since UTF-8 was introduced) and
ISO-10646 (ditto). (Of course Microsoft is responsible for
introducing and supporting "little-endian Unicode" and other
abominations as well, but that doesn't change the fact that the
committees are not controlled by Microsoft, and they did approve
them.)
This needs to be fixed (XML prohibits the signature since UTF-8 is the
default, IIRC), but that's what you get for using 21.5 for daily work.
If the signature you are referring to is the Unicode Byte-Order Mark
(BOM), then an erratum to XML 1.0 Second Edition clarifies that the BOM
may also appear as an encoding signature in UTF-8 encoded XML instances.
See <http://www.w3.org/XML/xml-V10-2e-errata#E22>.
For UTF-16 encoded XML the BOM is required.
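
For anyone who wants to check whether a given file carries the signature, here is a rough Python sketch. It only looks at the UTF-8 and UTF-16 marks discussed above; the function names detect_bom and strip_utf8_signature are made up for illustration and do not come from XEmacs or any XML library. The codecs constants are from the Python standard library.

```python
import codecs


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length) for a leading BOM, or (None, 0).

    Hypothetical helper: only the UTF-8 and UTF-16 signatures are checked.
    """
    candidates = [
        (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
        (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
        (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def strip_utf8_signature(data: bytes) -> bytes:
    """Drop a leading UTF-8 signature if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + b"<?xml version='1.0'?><a/>"
    print(detect_bom(sample))
    print(strip_utf8_signature(sample))
```

If you only need the decoded text, Python's built-in "utf-8-sig" codec skips the signature on decoding when it is present, so the helper above is mostly useful when you have to keep working with raw bytes.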