Re: [edit-utils] Handling Japanese man page

Wednesday, 5 April 2000

        "Stephen J. Turnbull" <turnbull(a)sk.tsukuba.ac.jp> writes:

...
 "No harm" is assuming that the nroff convention is always one ^H per
 hankaku character.  You're probably right, but are you _sure_?
No.  But the regexp below doesn't match those characters.
(while (re-search-forward "\\(\\cj\\)\\(\b\b\\1\\)" nil t)

"Stephen J. Turnbull" <turnbull(a)sk.tsukuba.ac.jp> writes:

...
     Hrvoje> Ideally, XEmacs should just do the right thing, either by
     Hrvoje> hardcoding the stuff or by guessing it.  We should
     Hrvoje> generally be careful about introducing new user-options.

 We don't have the choice of "just working" here.  We don't have
 Chinese or Koreans to ask at the moment, and we don't know whether
 there might be other languages that make the bogus width distinction
 that Japanese does.

 The defcustom (1) gives Japanese users who know that they'll be
 looking mostly at non-Japanese pages the chance to defeat what you say
 is an expensive operation, (2) allows non-Japanese whose situations we
 don't know yet to make the appropriate choice for themselves
 (including the fact that custom is self-documenting, and so they might
 actually find it while looking in custom for some option to fix bad
 behavior), and (3) documents what we did for the future day when the
 *roff folks remove the bogus width distinction. 
So you are proposing to change the regexp to
\\(.\\)\\(\b\b\\1\\) and add an option to disable it?  I'd
prefer to use \\(\\cj\\|\\ck\\|\\cc\\)\\(\b\b\\1\\) if
Chinese and Korean man pages are also broken.  That minimizes
the possibility of a wrong guess.  At least my version *just
works* for Japanese.  Searching for any character followed by
two ^H is too error-prone, IMO.  I don't know the best
compromise for the speed problem.  The patch surely slows things
down, but for large man pages it is already too slow...
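
To make the "wrong guess" concern concrete, here is a rough Python
sketch (not the Emacs Lisp from the patch) that compares a regexp
restricted to wide characters with one that accepts any character
before the two ^H.  The character class WIDE_CLASS is only an assumed
stand-in for the Mule category \cj, the helper name collapse is
hypothetical, and the ASCII input is contrived rather than taken from
real nroff output.

```python
import re

BS = "\x08"  # the ^H (backspace) that nroff emits for overstriking

# Assumption: hiragana, katakana and common CJK ideographs stand in
# for the Mule category \cj; the real category covers more than this.
WIDE_CLASS = "[\u3040-\u30ff\u4e00-\u9fff]"

# Wide character, two backspaces, then the same character again.
RESTRICTED = re.compile("(" + WIDE_CLASS + ")" + BS + BS + r"\1")
# Same shape, but any character is accepted in the first position.
ANY_CHAR = re.compile("(.)" + BS + BS + r"\1")


def collapse(pattern, text):
    """Replace each 'X ^H ^H X' sequence matched by pattern with one X."""
    return pattern.sub(r"\1", text)


if __name__ == "__main__":
    japanese = "日" + BS + BS + "日"
    ascii_text = "oo" + BS + BS + "oo"
    print(repr(collapse(RESTRICTED, japanese)))    # wide overstrike collapsed
    print(repr(collapse(RESTRICTED, ascii_text)))  # ASCII left untouched
    print(repr(collapse(ANY_CHAR, ascii_text)))    # ASCII rewritten as well
```

Both patterns collapse the Japanese overstrike, but only the
unrestricted one also rewrites the ASCII sequence, which is the kind
of false match the restricted version avoids.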

-- 
Yoshiki Hayashi
