unicode-internal-commit: Don't compare the same octet with itself if checking for boyer_moore_ok

Aidan Kehoe kehoea at parhasard.net
Thu Feb 4 04:29:34 EST 2010


changeset:   5031:e91e3e353805
user:        Aidan Kehoe <kehoea at parhasard.net>
date:        Sun Jan 31 18:09:57 2010 +0000
files:       src/ChangeLog src/search.c tests/ChangeLog tests/automated/case-tests.el tests/automated/search-tests.el
description:
Don't compare the same octet with itself if checking for boyer_moore_ok

src/ChangeLog addition:

2010-01-31  Aidan Kehoe  <kehoea at parhasard.net>

	* search.c (search_buffer):
	When checking the octets of c for identity, don't compare the
	same octet with itself. Thank you Ben Wing!

tests/ChangeLog addition:

2010-01-31  Aidan Kehoe  <kehoea at parhasard.net>

	* automated/search-tests.el:
	Check that multidimensional characters with repeated octets and
	case information force simple_search(), since boyer_moore()
	doesn't understand them when constructing its stride table.
	In passing; correct my spelling.
	* automated/case-tests.el (uni-mappings):
	In passing; delete a couple of redundant tests, correct the logic
	of another.


diff -r 70089046adef -r e91e3e353805 src/ChangeLog
--- a/src/ChangeLog	Sat Jan 30 20:34:23 2010 -0600
+++ b/src/ChangeLog	Sun Jan 31 18:09:57 2010 +0000
@@ -1,3 +1,9 @@
+2010-01-31  Aidan Kehoe  <kehoea at parhasard.net>
+
+	* search.c (search_buffer): 
+	When checking the octets of c for identity, don't compare the
+	same octet with itself. Thank you Ben Wing!
+
 2010-01-30  Ben Wing  <ben at xemacs.org>
 
 	* intl-auto-encap-win32.c:
diff -r 70089046adef -r e91e3e353805 src/search.c
--- a/src/search.c	Sat Jan 30 20:34:23 2010 -0600
+++ b/src/search.c	Sun Jan 31 18:09:57 2010 +0000
@@ -1441,7 +1441,7 @@
 		      int i, j;
 		      for (i = 0; i < len && boyer_moore_ok; ++i)
 			{
-			  for (j = 0; i < len && boyer_moore_ok; ++j)
+			  for (j = i + 1; j < len && boyer_moore_ok; ++j)
 			    {
 			      if (encoded[i] == encoded[j])
 				{
diff -r 70089046adef -r e91e3e353805 tests/ChangeLog
--- a/tests/ChangeLog	Sat Jan 30 20:34:23 2010 -0600
+++ b/tests/ChangeLog	Sun Jan 31 18:09:57 2010 +0000
@@ -1,3 +1,14 @@
+2010-01-31  Aidan Kehoe  <kehoea at parhasard.net>
+
+	* automated/search-tests.el:
+	Check that multidimensional characters with repeated octets and
+	case information force simple_search(), since boyer_moore()
+	doesn't understand them when constructing its stride table.
+	In passing; correct my spelling.
+	* automated/case-tests.el (uni-mappings):
+	In passing; delete a couple of redundant tests, correct the logic
+	of another.
+
 2010-01-30  Ben Wing  <ben at xemacs.org>
 
 	* automated/search-tests.el:
diff -r 70089046adef -r e91e3e353805 tests/automated/case-tests.el
--- a/tests/automated/case-tests.el	Sat Jan 30 20:34:23 2010 -0600
+++ b/tests/automated/case-tests.el	Sun Jan 31 18:09:57 2010 +0000
@@ -1466,9 +1466,7 @@
     (Assert-equalp lower upper)
     (Assert-equalp lowerupper upperlower)
     (Assert-equal lower (downcase upper))
-    (Assert-equal upper (downcase lower))
-    (Assert-equal lower (downcase upper))
-    (Assert-equal upper (downcase lower))
+    (Assert-equal upper (upcase lower))
     (Assert-equal (downcase lower) (downcase (downcase lower)))
     (Assert-equal (upcase lowerupper) (upcase upperlower))
     (Assert-equal (downcase lowerupper) (downcase upperlower))
diff -r 70089046adef -r e91e3e353805 tests/automated/search-tests.el
--- a/tests/automated/search-tests.el	Sat Jan 30 20:34:23 2010 -0600
+++ b/tests/automated/search-tests.el	Sun Jan 31 18:09:57 2010 +0000
@@ -192,22 +192,23 @@
  (boundp 'debug-xemacs-searches) ; normal when we have DEBUG_XEMACS
  "not a DEBUG_XEMACS build"
  "checks that the algorithm chosen by #'search-forward is relatively sane"
- (let ((debug-xemacs-searches 1))
+ (let ((debug-xemacs-searches 1)
+       newcase)
    (with-temp-buffer
      ;;#### Ben thinks this is unnecessary.  with-temp-buffer creates
      ;;a new buffer, which automatically inherits the standard case table.
      ;;(set-case-table pristine-case-table)
-     (insert "\n\nDer beruhmte deutsche Fleiss\n\n")
+     (insert "\n\nDer beruehmte deutsche Fleiss\n\n")
      (goto-char (point-min))
      (Assert (search-forward "Fleiss"))
      (delete-region (point-min) (point-max))
-     (insert "\n\nDer beruhmte deutsche Flei\xdf\n\n")
+     (insert "\n\nDer ber\xfchmte deutsche Flei\xdf\n\n")
      (goto-char (point-min))
      (Assert (search-forward "Flei\xdf"))
      (Assert-eq 'boyer-moore search-algorithm-used)
      (delete-region (point-min) (point-max))
      (when (featurep 'mule)
-       (insert "\n\nDer beruhmte deutsche Flei\xdf\n\n")
+       (insert "\n\nDer ber\xfchmte deutsche Flei\xdf\n\n")
        (goto-char (point-min))
        (Assert 
         (search-forward (format "Fle%c\xdf"
@@ -220,8 +221,33 @@
        (goto-char (point-min))
        (Assert (search-forward (format "Fle%c\xdf"
                                        (make-char 'latin-iso8859-9 #xfd))))
-       (Assert-eq 'simple-search search-algorithm-used)))))
-
+       (Assert-eq 'simple-search search-algorithm-used)
+       (setq newcase (copy-case-table (standard-case-table)))
+       (put-case-table-pair (make-char 'ethiopic #x23 #x23)
+			    (make-char 'ethiopic #x23 #x25)
+			    newcase)
+       (with-case-table
+	 ;; Check that when a multidimensional character has case and two
+	 ;; repeating octets, searches involving it in the search pattern
+	 ;; use simple-search; otherwise boyer_moore() gets confused in the
+	 ;; construction of the stride table.
+	 newcase
+	 (delete-region (point-min) (point-max))
+	 (insert ?0)
+	 (insert (make-char 'ethiopic #x23 #x23))
+	 (insert ?1)
+	 (goto-char (point-min))
+	 (Assert-eql (search-forward
+		      (string (make-char 'ethiopic #x23 #x25))
+		      nil t)
+		     3)
+	 (Assert-eq 'simple-search search-algorithm-used)
+	 (goto-char (point-min))
+	 (Assert-eql (search-forward
+		      (string (make-char 'ethiopic #x23 #x27))
+		      nil t)
+		     nil)
+	 (Assert-eq 'boyer-moore search-algorithm-used))))))
 
 ;; XEmacs bug of long standing.
 



More information about the XEmacs-Patches mailing list