Re: crash when loading file which contains EUC-JP and ISO-2022-JP

Saturday, 2 December 2000

        Stephen,
  Do you have a patch for this?

  - vin

"Stephen J. Turnbull" <turnbull(a)sk.tsukuba.ac.jp> writes:
...
 >>>>> "Mike" == Mike Fabian <mfabian(a)suse.de> writes:

     Mike> LANG=ja_JP xemacs -q -vanilla kanji.euc-iso

 >>>>> "Stephen" == Stephen J Turnbull <turnbull(a)sk.tsukuba.ac.jp> writes:

     Stephen> I've reproduced this up to the Lisp backtrace in 21.1
     Stephen> (patch 12) "Channel Islands" plus some CVS updates.  I
     Stephen> should be able to take a close look at it later today.
     Stephen> It's not obvious to me what's happening here though, so I
     Stephen> can't promise a quick fix.

 Oh, boy, are things fxxked up here.

 What is happening is that the presence of undesignated characters from
 GR followed by the ISO-2022 escape sequences causes Mule to
 auto-detect the coding category as 'iso-lock-shift, and then the
 coding system itself is set to 'iso-2022-lock-unix.  There are some
 bugs in the implementation of this coding system, such that
 decode-coding-region generates characters which MAKE_CHAR represents
 as negative Emchars.  This should not happen; I believe the range of
 Emchars is still only 19 bits, so with a 30-bit character
 representation there's no excuse for wrap-around.  :-(

 I don't understand why this happens yet.  I will have some kind of
 patch in about two days (heavy class load next two days); either I'll
 make the "safe" coding-priority-list the default and document the
 problem, or I'll have a real fix.  If somebody else can do something
 useful in the interim, I'd be much obliged!

 A work-around is to put

 (defun make-coding-priority-safe ()
   "Give `no-conversion' higher priority than some buggy coding categories.

 `iso-lock-shift' is known to cause crashes in a Japanese environment in
 certain situations, and `iso-8-designate' is rare and perhaps also not
 to be trusted."
   (interactive)
   (set-coding-priority-list
     (let* ((buggy '(iso-lock-shift iso-8-designate))
            (cpl (delq 'no-conversion (coding-priority-list)))
            (ret cpl))
       (while (and (cdr cpl)
                   (not (memq (car (cdr cpl)) buggy)))
         (setq cpl (cdr cpl)))
       (setcdr cpl (cons 'no-conversion (cdr cpl)))
       ret)))

 ;; Not very safe; should check for current coding system and buffer
 ;; change status etc.
 (defun convert-buffer-using-coding-system (coding-system)
   "Convert the whole buffer according to CODING-SYSTEM.

 Should only be called on an unnarrowed binary buffer with a known
 external encoding.  Any other use will have undefined results."
   (interactive "SCoding system: ")
   (if (coding-system-p coding-system)
       (decode-coding-region (point-min) (point-max) coding-system)
     (error "You bozo!  Try again, with a REAL coding system this time.")))

 (make-coding-priority-safe)

 in ~/.emacs and to call `make-coding-priority-safe' again any time you
 change your language environment or call the function
 `set-coding-priority-list' by hand.  Those are the only ways I know of
 that coding-priority-list changes.

 The file in question will be left in binary form.
 `convert-buffer-using-coding-system' is a convenience function to do
 the conversion at user request.  In Mike's case, use of 'euc-jp gives
 appropriate results.  You can do this at any time, even after editing
 the buffer, as long as you have not screwed up the external encoding
 (e.g., by altering an escape sequence or changing or deleting one of the
 bytes in a multibyte character).
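
 One caveat about `make-coding-priority-safe' as written above: its
 loop only tests the element after the current cons, so it never
 inspects the first entry of the priority list.  If a buggy category
 happens to be first, `no-conversion' ends up after it.  The following
 is a minimal sketch of a non-destructive variant that handles that
 case.  The name `priority-list-with-safe-fallback' and its arguments
 are hypothetical, introduced only for illustration; the function
 works on plain lists and does not touch the coding priority itself.

```elisp
;; Hypothetical helper (not part of XEmacs): pure list manipulation only.
(defun priority-list-with-safe-fallback (priorities buggy)
  "Return a copy of PRIORITIES with `no-conversion' moved before BUGGY.
`no-conversion' is placed immediately before the first element of
PRIORITIES that is a member of BUGGY, or at the end if there is none."
  (let ((rest (delq 'no-conversion (copy-sequence priorities)))
        (head nil))
    ;; Collect the leading categories that are not known to be buggy.
    (while (and rest (not (memq (car rest) buggy)))
      (setq head (cons (car rest) head)
            rest (cdr rest)))
    ;; Splice `no-conversion' in front of the first buggy category.
    (append (nreverse head) (cons 'no-conversion rest))))
```

 Assuming the XEmacs functions used in the work-around above, it could
 be applied with
 (set-coding-priority-list
   (priority-list-with-safe-fallback (coding-priority-list)
                                     '(iso-lock-shift iso-8-designate)))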

 -- 
 University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
 Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
 _________________  _________________  _________________  _________________
 What are those straight lines for?  "XEmacs rules." 
