Re: large el files and performance: replace words from lists

Monday, 11 November 2013

 On the eleventh of November, Uwe Brauer wrote: 

...
    > The interesting question about this approach is how big this regexp
    > will be and how slow searching with it will be. It will be cached, on
    > the upside. It’s something you’d need to experiment with; maybe creating
    > several regexps with regexp-opt is the way to go.

 Oops, I have not finished the list, but it might be larger than I
 thought, maybe 4 MB. Can regexp-opt deal with that?

There’s nothing to prevent it. Building the regexp will be slow, but that only
has to happen the first time.
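
A rough sketch of the "several regexps" idea, in case it helps with the
experimenting. All the my-* names are made up for illustration, the chunk size
is something you would have to tune, and no word-boundary handling is
attempted; TABLE is assumed to be a hash table (:test 'equal) mapping each
word without niqqud to its replacement.

```elisp
;; Hypothetical helpers; only regexp-opt, re-search-forward, replace-match
;; and the hash table functions are real.

(defun my-chunk-list (list size)
  "Split LIST into sublists of at most SIZE elements, preserving order."
  (let ((chunks nil) (chunk nil) (n 0))
    (while list
      (setq chunk (cons (car list) chunk)
            n (1+ n)
            list (cdr list))
      (if (= n size)
          (setq chunks (cons (nreverse chunk) chunks)
                chunk nil
                n 0)))
    ;; Keep the final, possibly shorter, chunk.
    (if chunk
        (setq chunks (cons (nreverse chunk) chunks)))
    (nreverse chunks)))

(defun my-build-regexps (words chunk-size)
  "Return a list of regexps, one for each CHUNK-SIZE words of WORDS."
  (mapcar (lambda (chunk) (regexp-opt chunk t))
          (my-chunk-list words chunk-size)))

(defun my-replace-from-table (regexps table)
  "Replace every match of each regexp in REGEXPS using TABLE.
Works on the whole current buffer, one pass per regexp."
  (let ((case-fold-search nil))
    (while regexps
      (goto-char (point-min))
      (while (re-search-forward (car regexps) nil t)
        (let ((new (gethash (match-string 0) table)))
          ;; Leave the text alone if the table has no entry for it.
          (if new (replace-match new t t))))
      (setq regexps (cdr regexps)))))
```

You would call my-build-regexps once, keep the result in a variable, and pass
it to my-replace-from-table each time. One caveat: since each regexp gets its
own pass, a replacement inserted by an earlier pass can be matched again by a
later one, so check that against your word list.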

...
    > -- Save the map from the words without niqqud to the words with niqqud
    > in a Berkeley or DBM database; see #'open-database, #'put-database, and
    > the code that uses them in descr-text.el in XEmacs 21.5. Be careful
    > about the CODESYS argument to #'open-database.

 And that should be faster than a hash table?

A little slower, though not by much: the main overhead is from the Emacs Lisp
funcall. The relevant databases are actively developed, and I’m sure their
performance is acceptable.
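
For comparison, the in-memory hash table side of that trade-off could look
roughly like the following. This is only a sketch: the function names and the
file format (one pair per line, the word without niqqud, a tab, then the word
with niqqud) are assumptions of mine, not anything from descr-text.el.

```elisp
;; Hypothetical helpers; the file layout is assumed, not prescribed.

(defun my-parse-word-table ()
  "Build a hash table from the current buffer.
Each line is expected to hold PLAIN<TAB>POINTED; other lines are skipped."
  (let ((table (make-hash-table :test 'equal)))
    (goto-char (point-min))
    (while (re-search-forward "^\\([^\t\n]+\\)\t\\([^\t\n]+\\)$" nil t)
      (puthash (match-string 1) (match-string 2) table))
    table))

(defun my-load-word-table (file)
  "Read FILE into a temporary buffer and return its word table."
  (with-temp-buffer
    (insert-file-contents file)
    (my-parse-word-table)))
```

A lookup is then (gethash WORD TABLE). The whole table has to be rebuilt in
every session, which is the cost the on-disk database avoids.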

...
 Since modern Hebrew is written without niqqud, there are differences in
 spelling in order to avoid such uncertainties.

 So, for example, if David Kastrup is mentioned in Haaretz, his name
 would be written (using logical Hebrew here)

 דויד.

 Since I want such a function for modern Hebrew[1], I am not much worried
 about this problem.

Good!

-- 
‘Liston operated so fast that he once accidentally amputated an assistant’s
fingers along with a patient’s leg, […] The patient and the assistant both
died of sepsis, and a spectator reportedly died of shock, resulting in the
only known procedure with a 300% mortality.’ (Atul Gawande, NEJM, 2012)

_______________________________________________
XEmacs-Beta mailing list
XEmacs-Beta(a)xemacs.org
http://lists.xemacs.org/mailman/listinfo/xemacs-beta
