[COMMIT] Fix Unicode conversion for control-1, prevent a segfault

Thursday, 31 March 2005

APPROVE COMMIT 

NOTE: This patch has been committed.

src/ChangeLog addition:

2005-03-31  Aidan Kehoe  <kehoea(a)parhasard.net>

	* unicode.c (set_unicode_conversion): Don't try to write to the
	non-existent conversion tables for ASCII and control-1, in the
	interest of not segfaulting. 
	* unicode.c (unicode_convert): The "position code" for a control-1
	character has #xA0 added to it when encoded in Mule, unlike #x80
	for all the other non-ASCII character sets; take this into
	account.

XEmacs Trunk source patch:
Diff command:   cvs -q diff -u
Files affected: src/unicode.c

Index: src/unicode.c
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/src/unicode.c,v
retrieving revision 1.26
diff -u -u -r1.26 unicode.c
--- src/unicode.c	2005/02/28 23:36:32	1.26
+++ src/unicode.c	2005/03/31 14:46:45
@@ -857,6 +857,9 @@
   sledgehammer_check_unicode_tables (charset);
 #endif

+  if (EQ(charset, Vcharset_ascii) || EQ(charset, Vcharset_control_1))
+    return;
+
   /* First, the char -> unicode translation */

   if (XCHARSET_DIMENSION (charset) == 1)
@@ -1921,7 +1924,13 @@
 	    {			/* Processing Non-ASCII character */
 	      char_boundary = 1;
 	      if (EQ (charset, Vcharset_control_1))
-		encode_unicode_char (Vcharset_control_1, c, 0, dst,
+		/* See:
+
+		   (Info-goto-node "(internals)Internal String Encoding")
+
+		   for the rationale behind subtracting #xa0 from the
+		   character's code. */
+		encode_unicode_char (Vcharset_control_1, c - 0xa0, 0, dst,
 				     type, little_endian);
 	      else
 		{
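
For readers without the internals manual to hand, the offset arithmetic
described in the ChangeLog entry can be sketched in isolation. This is only
an illustration under the assumption that the offsets are exactly as stated
above (#xA0 for control-1, #x80 for the other non-ASCII character sets); the
enum and function names are hypothetical and are not XEmacs identifiers.

```c
/* Hypothetical stand-ins for the charset of the byte being decoded;
   these are illustration-only names, not XEmacs identifiers. */
enum sketch_charset
{
  SKETCH_CONTROL_1,
  SKETCH_OTHER_NON_ASCII
};

/* Recover the position code from a byte of the internal encoding.
   Assumption taken from the ChangeLog entry: control-1 has 0xA0 added
   to its position code, every other non-ASCII charset has 0x80 added. */
static int
sketch_position_code (enum sketch_charset cs, unsigned char byte)
{
  int offset = (cs == SKETCH_CONTROL_1) ? 0xA0 : 0x80;
  return byte - offset;
}
```

Under those assumptions, sketch_position_code (SKETCH_CONTROL_1, 0xA5)
yields 0x05, whereas subtracting the usual #x80 would give 0x25, which is
the kind of mismatch the second hunk of the patch corrects.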

-- 
“I, for instance, am gung-ho about open source because my family is being
held hostage in Rob Malda’s basement. But who fact-checks me, or Enderle,
when we say something in public? No-one!” -- Danny O’Brien
