On Thu, Nov 7, 2013 at 3:36 AM, Uwe Brauer <oub(a)mat.ucm.es> wrote:
Hello
I would like to have a lisp pkg which would replace certain words in a text.[1]
For this I need two things,
- a function which does the replacement and
- a list containing words with and without niqqud.
Concerning the first, I already have a function, loosely based on
iso-accentuate, or alternatively a function (TeX-to-char),
provided by Aidan some years ago, which replaces LaTeX symbols
with their UTF-8 equivalents. I am not sure which code is more
efficient; I will post the central part of the code later.
However, what bothers me more is the second part. I obtained the
Hebrew Bible in UTF-8 format and could then generate the desired
list. However, it seems to me that this list would be huge, at
least 2000 to 3000 words if not more.
What is a reasonable size limit for such a list?
Is 2000 words too big? Or must I divide the list into several parts
(and files) and write corresponding functions?
Footnotes:
[1] To be precise, to substitute Hebrew words with Hebrew words with
vowels, so-called niqqud.
My first thought was to stuff your list into a hash table, something like this:
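(defun replace-words-in-text (words replacements &optional buffer)
  "Replace each word of WORDS found in BUFFER with its mate in REPLACEMENTS.
WORDS and REPLACEMENTS are parallel sequences of strings; BUFFER defaults
to the current buffer."
  (let ((table (make-hash-table :test #'equal)))
    ;; Map each word to its replacement.
    (map nil #'(lambda (word replacement)
                 (puthash word replacement table))
         words replacements)
    (with-current-buffer (or buffer (current-buffer))
      (save-excursion
        (goto-char (point-min))
        ;; Skip everything that is not a word or symbol constituent.
        ;; Skipping only "-." would loop forever on characters such as
        ;; parentheses, which belong to neither class.
        (skip-syntax-forward "^w_")
        (while (< (point) (point-max))
          (let* ((oldpoint (point))
                 (word
                  (progn
                    (skip-syntax-forward "w_")
                    (buffer-substring-no-properties oldpoint (point))))
                 (replacement (gethash word table)))
            (when replacement
              (delete-region oldpoint (point))
              (insert replacement))
            (skip-syntax-forward "^w_")))))))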
But that probably doesn't perform very well. Anyway, the point is
that doing hash table lookups should be much more efficient than
iterating over a list. And the nice thing about a hash table is that
the size of your list doesn't matter (much).
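
To make the list-versus-hash-table comparison concrete, here is a
minimal sketch. It assumes the word pairs live in an alist; the
variable and function names (my-niqqud-alist and friends) and the two
sample entries are made up for illustration and are not part of any
existing package.

```elisp
;; Hypothetical sample data: (PLAIN . POINTED) pairs.  A real list would
;; be generated from the UTF-8 Bible text and hold thousands of entries.
(defvar my-niqqud-alist
  '(("שלום" . "שָׁלוֹם")
    ("תורה" . "תּוֹרָה"))
  "Alist mapping unpointed Hebrew words to their forms with niqqud.")

(defun my-niqqud-alist-to-table (alist)
  "Return a hash table holding the (KEY . VALUE) pairs of ALIST."
  (let ((table (make-hash-table :test 'equal)))
    (dolist (pair alist)
      (puthash (car pair) (cdr pair) table))
    table))

(defvar my-niqqud-table (my-niqqud-alist-to-table my-niqqud-alist)
  "Hash table built once from `my-niqqud-alist'.")

(defun my-niqqud-lookup-alist (word)
  "Look up WORD with `assoc', which walks the list entry by entry."
  (cdr (assoc word my-niqqud-alist)))

(defun my-niqqud-lookup-table (word)
  "Look up WORD with `gethash', whose cost barely depends on table size."
  (gethash word my-niqqud-table))
```

Both lookups return the same replacement, or nil for an unknown word.
The difference only shows up at scale: the alist version is called
once per word in the buffer and scans up to the whole list each time,
while the table is built once and then queried directly. A few
thousand entries should fit comfortably in a single table, so
splitting the list across files ought not to be needed for
performance reasons.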
--
Jerry James
http://www.jamezone.org/
_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://lists.xemacs.org/mailman/listinfo/xemacs-beta