"Stephen J. Turnbull" <turnbull(a)sk.tsukuba.ac.jp> writes:
"No harm" is assuming that the nroff convention is always one ^H per
hankaku character. You're probably right, but are you _sure_?
No. But the regexp below doesn't match those characters.
(while (re-search-forward "\\(\\cj\\)\\(\b\b\\1\\)" nil t)
"Stephen J. Turnbull" <turnbull(a)sk.tsukuba.ac.jp> writes:
Hrvoje> Ideally, XEmacs should just do the right thing, either by
Hrvoje> hardcoding the stuff or by guessing it. We should
Hrvoje> generally be careful about introducing new user-options.
We don't have the choice of "just working" here. We don't have
Chinese or Koreans to ask at the moment, and we don't know whether
there might be other languages that make the bogus width distinction
that Japanese does.
The defcustom (1) gives Japanese users who know that they'll be
looking mostly at non-Japanese pages the chance to defeat what you say
is an expensive operation, (2) allows non-Japanese whose situations we
don't know yet to make the appropriate choice for themselves
(including the fact that custom is self-documenting, and so they might
actually find it while looking in custom for some option to fix bad
behavior), and (3) documents what we did for the future day when the
*roff folks remove the bogus width distinction.
So you are proposing to change the regexp to
\\(.\\)\\(\b\b\\1\\) and add an option to disable it? I'd
prefer to use \\(\\cj\\|\\ck\\|\\cc\\)\\(\b\b\\1\\) if
Chinese and Korean man pages are also broken. It minimizes
the possibility of a wrong guess. At least my version *just
works* for Japanese. Searching for any character followed by
two ^H is too error-prone, IMO. I don't know the best
compromise for the speed problem. The patch surely slows
things down, but for large man pages it is already too slow...
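
For concreteness, here is a rough sketch of how the option and the
category-restricted regexp might fit together. The names
my-man-cleanup-wide-overstrike and my-man-collapse-wide-overstrike are
hypothetical, invented for illustration. The sketch also assumes the
cleanup simply deletes the overstrike tail, while the real patch may do
something else with the match, such as applying a bold face.

```elisp
;; Hypothetical option: the thread proposes a defcustom but does not name it.
(defcustom my-man-cleanup-wide-overstrike t
  "Non-nil means collapse double-^H overstrikes after wide characters."
  :type 'boolean
  :group 'man)

;; Hypothetical helper, not the actual patch.
(defun my-man-collapse-wide-overstrike ()
  "Delete the \"^H^H + same character\" tail after each category-j character."
  (when my-man-cleanup-wide-overstrike
    (save-excursion
      (goto-char (point-min))
      ;; Group 1: one character in category j.
      ;; Group 2: two backspaces followed by that same character again.
      ;; Assumption: dropping group 2 is the desired cleanup.
      (while (re-search-forward "\\(\\cj\\)\\(\b\b\\1\\)" nil t)
        (replace-match "" t t nil 2)))))
```

Someone reading mostly non-Japanese pages could then turn the pass off
with (setq my-man-cleanup-wide-overstrike nil). Because group 1 is
restricted to a category, a sequence such as a^H^Ha is left alone, which
is the "wrong guess" the wider \\(.\\) pattern would risk.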
--
Yoshiki Hayashi