Stephen J. Turnbull wrote:
> Aidan Kehoe writes:
>
> > Reliably differentiating those [bit-for-bit identical] pairs
> > without metadata is not possible.
>
> And unnecessary, precisely because it's not possible. It's only when
> you add disambiguating text that it matters, and then the Sufficiently
> Smart Program argument applies.
I hate to disturb your conversation -- but I want to add the voice
of one European (German) user: Aidan expresses the problem, and you
seem to reject work on it to get a Really Good Design. But it ain't
so easy, from this user's perspective -- sometimes Worse Is Better,
as we all have learned from Dick Gabriel.
It is highly annoying that I need to explicitly, manually, specify
the buffer-coding-system for a file when I want to insert an EUR
sign and when I want it to end up encoded in ISO-8859-15. And
that's the current state of affairs, AFAICS.
Since many of us use templates for new files anyhow, it's a small
matter to add an appropriate cookie to demand a specific encoding
for a file. That this approach is effectively rejected by you, is
really disturbing for us European users. Maybe one doesn't need it
for Japanese scripts and for UTF-8, but for us poor souls who still
want to use ISO-8859-* for other reasons, it would be a real advantage.
Well, speaking of UTF-8: Since XEmacs is very happy to destroy lots
of my files with its supposedly smart encoding detection -- and
does so WITHOUT any warning -- your demand for more thought about
error recovery is spot on. But there seem to be other areas that
are in more severe need of that error handling than recovery from
wrong coding cookies (namely, automatic encoding sniffing). I have
had literally dozens of UTF-8 files with one single Latin1 char in
them that got reencoded by XEmacs when I opened and saved them.
(I.e., when I didn't pay enough attention to the modeline in the
process of quickly modifying one or two words.) In all these cases,
reliance on a user-supplied coding cookie would have saved me
untold hours and hours of work to redo the result of XEmacs'
automatic encoding detection, which Really Really Really Sucks.
Throwing an error if the coding system is not sufficient is much
preferred to the current state of affairs (arbitrarily choosing a
coding system that XEmacs thinks is right).
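A minimal Python sketch of the "throw an error" behaviour asked for above, not XEmacs code: the file is decoded strictly in the coding system the user declared, and a byte that does not fit is reported instead of triggering a silent fallback. The helper name `decode_declared` and the sample bytes are made up for illustration.

```python
def decode_declared(data: bytes, declared: str) -> str:
    """Decode strictly in the declared coding system; never guess another."""
    try:
        return data.decode(declared)  # errors="strict" is the default
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"byte 0x{data[exc.start]:02x} at offset {exc.start} "
            f"is not valid {declared}; not switching coding systems"
        ) from exc


# A UTF-8 file containing one stray Latin-1 byte (0xE9, "é" in ISO-8859-1).
data = "Grüße".encode("utf-8") + b" caf\xe9"

try:
    decode_declared(data, "utf-8")
    outcome = "decoded"
except ValueError as err:
    outcome = f"refused: {err}"
print(outcome)

# The silent alternative: ISO-8859-1 accepts any byte sequence, so the
# fallback "succeeds" and the existing UTF-8 text turns into mojibake.
fallback = data.decode("iso8859-1")
print(fallback == "Grüße café")  # False
```

The point of the design is only that the failure is visible at load time, while the file on disk is still untouched.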
Sorry about sounding frustrated -- but I am, and your insistence on
Sufficiently Smart Programs(tm) being the Right Way(tm) does not
match my experience with XEmacs, or with programs at all. Please
let us have our way to specify "yes, I want the buffer of this file
in THIS coding system and be DAMNED if you try to switch to another
one behind my back. I know what I'm doing because I'M THE USER and
I'M ALWAYS RIGHT BY DEFINITION because I know more about MY OWN
FILES than you bloody stupid detection algorithm."
Please.
Joachim
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Joachim Schrod Email: jschrod(a)acm.org
Roedermark, Germany
_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
On the seventeenth day of May, Stephen J. Turnbull wrote:
> Aidan Kehoe writes:
>
> > Reliably differentiating those [bit-for-bit identical] pairs
> > without metadata is not possible.
>
> And unnecessary, precisely because it's not possible.
With metadata, it is. Which is why I want useful metadata.
> It's only when you add disambiguating text that it matters, and then the
> Sufficiently Smart Program argument applies.
If you assume that no other program is processing the file, then yes. Which
is not a very useful starting point to people who do other things than hack
XEmacs.
If you’re working on a website, and you’ve configured Apache to send
ISO-8859-15 headers for every file served, and you’re editing a PHP file
with character literals, it is not appropriate for XEmacs to encode the Euro
sign as a tilde because it thinks the file is in ISO-8859-1.
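The mismatch in the PHP example comes down to a single byte. A short Python illustration follows; the codec names are Python's, and nothing here is XEmacs or Apache code.

```python
EURO_BYTE = b"\xa4"

as_latin9 = EURO_BYTE.decode("iso8859-15")  # what the Apache header promises
as_latin1 = EURO_BYTE.decode("iso8859-1")   # what an ISO-8859-1 reader sees
print(as_latin9, as_latin1)  # € ¤

# Going the other way, ISO-8859-1 has no euro sign at all. A strict
# encoder refuses; a lossy one writes a placeholder ("?" in Python, where
# the message above describes XEmacs writing a tilde).
try:
    "€".encode("iso8859-1")
    strict_result = "encoded"
except UnicodeEncodeError:
    strict_result = "refused"
lossy = "€".encode("iso8859-1", errors="replace")
print(strict_result, lossy)  # refused b'?'
```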
> > Gzip and bzip2 legitimately don’t try to undo damage to files that
> > have been corrupted by FTP, and no-one expects them to.
> > Encouraging data corruption is not a particularly worthy goal,
>
> *sigh* Translating a file from ISO-8859-5 to KOI8-R is not data
> corruption any more than compressing a file with gzip is.
ISO-8859-5 and KOI8-R have differing repertoires. Unconditionally
translating from the former to the latter loses data. Unconditionally
translating from the latter to the former loses data. That is corruption.
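The repertoire claim can be checked mechanically. The sketch below uses Python's bundled codec tables (an assumption about tooling, not anything from XEmacs) to list the characters each encoding has that the other lacks; `repertoire` is a hypothetical helper name.

```python
def repertoire(codec: str) -> set:
    """Characters reachable from the bytes 0x80-0xFF of an 8-bit codec."""
    chars = set()
    for byte in range(0x80, 0x100):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            pass  # byte is undefined in this codec
    return chars


iso5 = repertoire("iso8859-5")
koi8r = repertoire("koi8-r")

only_iso5 = iso5 - koi8r    # lost when converting ISO-8859-5 -> KOI8-R
only_koi8r = koi8r - iso5   # lost when converting KOI8-R -> ISO-8859-5
print("".join(sorted(only_iso5)))
print("".join(sorted(only_koi8r)))
```

Text that stays inside the intersection (plain Russian, for instance) survives a round trip; anything in either difference set does not.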
> Unless there's a protocol that labels the file with its encoding, and
> some programs respect it while others ignore it. The latter may
> "corrupt" the file by translating, or by adding new data to a file that
> originally conformed to both encodings.
>
> Which is precisely the possibility that you propose to introduce. It
> is therefore your responsibility to at least think about ways to deal
> with the inevitable corruption, ie, incorrect cookies.
Warnings, warnings, warnings. GNU has them.
> I know why you want cookies; I don't deny that they are a useful
> private protocol. I just don't want you imposing the *known* problems
> with them on others without thinking about how to alleviate those
> problems.
Imposing? You’re aware they’re already there, right?
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
On the fifteenth day of May, Stephen J. Turnbull wrote:
> Aidan Kehoe writes:
>
> > I want to enforce something I already know on XEmacs,
>
> I know what you want. I'm not telling you you can't have your
> Ferrari, I'm telling you that if you want to put it in the mainline,
> it needs to have a child seat.
>
> > > No, from the point of view of our workflow, it's not the same.
> > > Autodetection failures are in principle our responsibility, and in
> > > theory can be reduced to near zero by Sufficiently Smart Programming.
> >
> > Spoken like an East Asian. No they can’t. Trivial examples;
>
> Aidan, if I bet you €100 you couldn't do it, I'm sure I'd get an
> algorithm by return mail. No?
No. German in iso-8859-1, iso-8859-2, iso-8859-15 and iso-8859-16 without
using the Euro sign is bit-for-bit identical. Russian using KOI8-U (which
Ukrainians have occasion to do, the entire country is bilingual) is
bit-for-bit identical to the same Russian using KOI8-R. Reliably
differentiating those pairs without metadata is not possible.
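For anyone who wants to see this concretely, here is a small Python check using Python's own codec tables; the sample sentences are arbitrary.

```python
german = "Grüße aus Köln: Ärger, Öl, Übermut"
latin_family = ["iso8859-1", "iso8859-2", "iso8859-15", "iso8859-16"]
german_bytes = {german.encode(name) for name in latin_family}

russian = "Привет, мир"
russian_bytes = {russian.encode(name) for name in ("koi8-r", "koi8-u")}

# Each set collapses to one element: every encoding in the group
# produces exactly the same byte string for this text.
print(len(german_bytes), len(russian_bytes))  # 1 1
```

With identical bytes on disk, no detector has anything to work with; only a label outside the text can say which encoding was intended.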
> So let's calm down and put together a design that works for pretty
> much everybody, including folks with colleagues using bogus local
> versions of dos2unix.
Note that every solution will have some user behaviour where the correct
reaction for us is the frequent one of the MD confronted with a patient
whose complaint is ‘Doctor, it hurts when I do this’; ‘Then don’t do that.’
Gzip and bzip2 legitimately don’t try to undo damage to files that have been
corrupted by FTP, and no-one expects them to. Encouraging data corruption is
not a particularly worthy goal, especially in this day and age when Unix
programs can read Shift JIS.
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
Hi,
I've just released GNU Hyperbole 5.0. It is available for download at
"ftp://ftp.gnu.org/gnu/hyperbole"
I would therefore like to make the switch and replace the current
version in the package tree, 4.18, with version 5.0, so that only
one version of the package needs to be maintained.
Why would you object? Well, since 5.0 is based on version 4.01,
a few things may be broken or missing after the change. The export
function in kotl-mode is one example, but there might be others.
Still, I think 5.0 is the way to go. Bug reports about missing
functions will of course get high priority to ease the transition.
Not everything is synced because I have only supported the things
I use, and because I thought that now, ten years later, things
might be solved differently.
Yours
--
%% Mats
On the fifteenth day of May, Stephen J. Turnbull wrote:
> Aidan Kehoe writes:
> >
> > On the fifteenth day of May, Stephen J. Turnbull wrote:
> >
> > > In XML, it's strictly enforced by validating parsers;
> >
> > Right, but there's no guarantee a validating parser has been run on a given
> > XML file that XEmacs sees.
>
> Sure. My point is that an XML charset declaration that is wrong is an
> error, and by definition other software can treat that case as
> deprecated. Users who don't know the rules, well, too bad. But when
> there is no rule outside of XEmacs, it's not reasonable to expect the
> users to enforce it on their environment.
I want to enforce something I already know on XEmacs, in a way that will
work on other machines that don’t have my value of file-coding-system-alist.
In a way that won’t interfere with command line parsing, as happens on
Cygwin currently:
/usr/bin/env: perl # -*- coding: windows-1251 -*-: No such file or directory
Having it work on GNU Emacs too would be gravy.
> > > We don't *need* to do anything; the user can always go back and
> > > re-read the file with C-u C-x C-f.
> >
> > Oh, don't be obtuse. On that reasoning, we can abandon development
> > right now since anyone who would like a feature we don't have can
> > implement it themselves.
>
> I'm not being obtuse. Lack of cookie support is not a security or
> data loss issue (by itself), nor is it a bug at all; as you point out,
> it's a missing feature. We don't *need* to do anything.
Only in the sense I gave above.
> So let's take the time to not introduce new ways to fail.
Everything we do introduces new ways to fail. The interesting thing is the
tradeoff between those and the positive aspects of what we do.
> If the work to prevent cookie lossage helps deal with our existing
> sterile equine behavior, so much the better.
>
> > > No, that is an error mode of sniffing from cookies that we *create*
> > > for our users by parsing cookies that are not enforced by the
> > > protocol we're editing,
> >
> > The same error mode arises when the automatic coding detection
> > (independent of coding cookies) gets something wrong. It's distinct
> > from sniffing cookies.
>
> No, from the point of view of our workflow, it's not the same.
> Autodetection failures are in principle our responsibility, and in
> theory can be reduced to near zero by Sufficiently Smart Programming.
Spoken like an East Asian. No they can’t. Trivial examples; deciding between
iso-8859-1 vs. iso-8859-15. Between KOI8-U used for Russian vs. KOI8-R.
iso-8859-2 used for German, vs iso-8859-1 used for German (bit-for-bit
identical, until someone types the word Škoda, which Germans have occasion
to do now and then).
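The Škoda case can be sketched in Python (codec tables are Python's; this is an illustration, not a detection algorithm). Once the "Š" is typed, the four encodings stop agreeing, yet every resulting byte is still valid in all of them.

```python
word = "Škoda"
first_bytes = {}
for name in ("iso8859-1", "iso8859-2", "iso8859-15", "iso8859-16"):
    try:
        first_bytes[name] = word.encode(name)[0]
    except UnicodeEncodeError:
        first_bytes[name] = None  # "Š" is not in this character set

for name, byte in first_bytes.items():
    print(name, "not representable" if byte is None else hex(byte))

# The ISO-8859-2 byte for "Š" is a perfectly legal ISO-8859-1 character,
# so a reader assuming ISO-8859-1 sees no error, just different text.
misread = word.encode("iso8859-2").decode("iso8859-1")
print(misread)  # ©koda
```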
> OTOH, the rate of broken cookies was quite high in the early days of
> Nihongo Emacs; I didn't stop seeing them until cookies themselves
> disappeared (in modern MULE, distinguishing among the 4 common
> Japanese encodings is essentially perfect). The problem was that
> people would do things like modify dos2unix to also recode Shit JIS as
> EUC-JP, but not to recognize Emacs coding cookies (a reasonable
> behavior for vi users, you will admit, I hope). So you'd get whole
> trees full of EUC-JP marked as Shit JIS.
Japan; the land of transcoding HTTP proxies. The land where backslash is
written as Y with two lines through it. The land of government standards for
character sets that left encodings underspecified.
Okay, so we don’t need coding cookies in Japan today. That doesn’t help
Europe a whole lot.
> > If you can find it without too much searching, I'd appreciate a
> > link to what Ilya posted. Otherwise I suppose I can ask him.
>
> Will do.
Thanks.
> I gather you're not in a "by 10pm tonight" kind of hurry?
Nope.
> (Thing is, I suspect this is in archives that aren't online, may as
> well take the time to put them online.)
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
I built xemacs 21.5 beta27 20070502 without trouble.
I followed http://www.xemacs.org/Documentation/packageGuide.html#Installing_automati...
and was stumped because mailcrypt additionally has mail-lib as a
dependency. The error message was
Cannot open load file: "rfc822"
when trying to load mailcrypt directly, and had an %s thrown in when
the menubar was used to call the package "List and install" or "Update
package list" functions.
After I installed that, further package management worked like a
charm (for the first time after building an XEmacs release; I had
blamed the ftp command-line client for my earlier problems with
package management).
The additional dependency should be documented.
Peter
"Stephen J. Turnbull" <stephen(a)xemacs.org> writes:
> Aidan Kehoe writes:
>
> > GNU Emacs’ coding sniffing is better than ours; for example, it responds
> > sensibly to this at the start of a file:
> >
> > #!/usr/bin/env perl
> > # -*- coding: windows-1251 -*-
>
> Please, no. IMHO, if this must be done, do a design from scratch, and
> think about how to recover from an incorrect cookie (eg, because
> somebody copied boilerplate from one mostly-ASCII file to another,
> perhaps in an editor that doesn't respect cookies). *Then* look at
> GNU's implementation and see if it makes sense in the light of a
> sensible design.
Whether or not it makes sense in the light of a sensible design: if it
is reasonably reliable, there is a chance that people will adapt the
few cases where it misses to Emacs' heuristics, and if XEmacs had
different heuristics, this would be sort of a pain in the neck.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
On the fifteenth day of May, Stephen J. Turnbull wrote:
> In XML, it's strictly enforced by validating parsers;
Right, but there’s no guarantee a validating parser has been run on a given
XML file that XEmacs sees.
> > We need to respond sensibly to that at the start of a file. We
> > don’t currently.
>
> We don't *need* to do anything; the user can always go back and
> re-read the file with C-u C-x C-f.
Oh, don’t be obtuse. On that reasoning, we can abandon development right now
since anyone who would like a feature we don’t have can implement it
themselves.
> The question is "how can we help the judicious coder introduce a
> convenient automatic protocol without exposing naive users to abuse from
> the kind of guy who would smoke in a maternity ward?"
Warnings, I think.
> > > IMHO, if this must be done, do a design from scratch, and think about how
> > > to recover from an incorrect cookie (eg, because somebody copied
> > > boilerplate from one mostly-ASCII file to another, perhaps in an editor
> > > that doesn't respect cookies).
> >
> > That’s distinct from sniffing using coding cookies,
>
> No, that is an error mode of sniffing from cookies that we *create*
> for our users by parsing cookies that are not enforced by the protocol
> we're editing,
The same error mode arises when the automatic coding detection (independent
of coding cookies) gets something wrong. It’s distinct from sniffing
cookies.
> > > Anyway, IIRC, you already committed a half-baked patch from Ivan
> > > Golubev to do this.
> >
> > No, that wasn’t me.
>
> Well, if you decide to do this yourself, Ivan did exactly what you
> propose (port the GNU code, from 21.3 IIRC). I thought somebody
> committed it as a package, but you're right, I don't see it anywhere.
If you can find it without too much searching, I’d appreciate a link to what
Ilya posted. Otherwise I suppose I can ask him.
> I'll help with documenting existing workarounds.
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)
On the fifteenth day of May, Stephen J. Turnbull wrote:
> Aidan Kehoe writes:
>
> > GNU Emacs’ coding sniffing is better than ours; for example, it
> > responds sensibly to this at the start of a file:
> >
> > #!/usr/bin/env perl
> > # -*- coding: windows-1251 -*-
>
> Please, no.
Hmm? That’s how Python does it, and AIUI you had some part in their choice
of that design. We need to respond sensibly to that at the start of a
file. We don’t currently.
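As a rough illustration of the Python-style rule, here is a sketch that looks for a cookie in a comment on the first or second line only. The regular expression is the one given in PEP 263; the function name `find_coding_cookie` is hypothetical, and this is not a port of the GNU Emacs code.

```python
import re

# Pattern from PEP 263: a comment line containing "coding:" or "coding=".
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_coding_cookie(data: bytes):
    """Return the declared coding name from line 1 or 2, or None."""
    for line in data.splitlines()[:2]:
        match = COOKIE_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


sample = b"#!/usr/bin/env perl\n# -*- coding: windows-1251 -*-\nprint 1;\n"
print(find_coding_cookie(sample))  # windows-1251
```

Accepting the cookie on the second line is what lets the `#!` line stay free of anything the kernel or `env` would try to interpret.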
> IMHO, if this must be done, do a design from scratch, and think about how
> to recover from an incorrect cookie (eg, because somebody copied
> boilerplate from one mostly-ASCII file to another, perhaps in an editor
> that doesn't respect cookies).
That’s distinct from sniffing using coding cookies, and there’s no general
solution to it; windows-1251 cannot be reliably distinguished from
ISO-8859-1, or even from UTF-8. (The opposite direction _is_ possible for
UTF-8, though.)
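The asymmetry can be shown with a few lines of Python (a sketch using Python's codecs; `is_valid_utf8` is a hypothetical helper). Bytes that are really windows-1251 usually fail UTF-8 validation, while UTF-8 or ISO-8859-1 bytes decode as windows-1251 without complaint.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


word = "Привет"
cp1251_bytes = word.encode("cp1251")
utf8_bytes = word.encode("utf-8")

# A wrong "utf-8" label on windows-1251 data is detectable...
print(is_valid_utf8(cp1251_bytes))  # False

# ...but a wrong "windows-1251" label on UTF-8 data is not: the decode
# succeeds and simply yields different (garbled) text.
garbled = utf8_bytes.decode("cp1251")
print(garbled == word)  # False

# ISO-8859-1 assigns a character to every byte, so it never objects either.
as_latin1 = cp1251_bytes.decode("iso8859-1")
print(as_latin1 == word)  # False
```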
> *Then* look at GNU's implementation and see if it makes sense in the
> light of a sensible design.
>
> Anyway, IIRC, you already committed a half-baked patch from Ivan
> Golubev to do this.
No, that wasn’t me.
> > and if I understand things correctly it also sniffs XML coding
> > correctly.
>
> This is very useful, and I too would like to see it; it's been on my
> list for a while.
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)