Delphi Developers Archive

- September 14, 2014

Did you know that the Jedi Code Library has a Unicode-aware search engine based on Boyer-Moore? It's called TUTBMSearch and is located in JclUnicode.pas.

Comments

Asbjørn Heid, September 14, 2014 at 12:25 PM
Given that modern CPU architectures have really high memory bandwidth and really dislike unpredictable branching, in which cases do BM and similar "clever" search algorithms still pay off? Has anyone got any experiences to share?

edit: nice reminder BTW, I really should check out the entire JCL, so I know what I don't have to reinvent ;)
Daniela Osterhagen, September 14, 2014 at 9:51 PM
Hm, good question. There are two ways to find out: theoretical or practical, the latter meaning writing a test and running it on many modern computers.
Daniela Osterhagen, September 14, 2014 at 9:54 PM
By the way, the JCL engine does not work very well for word search: single letters are found even when they are inside a word, and punctuation characters are treated as part of words rather than as separators.
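A minimal sketch of the whole-word behaviour this comment seems to expect, written in Python rather than Delphi. It is not the JCL code: the names `find_whole_word` and `_is_word_char` are hypothetical, and the definition of a word character (letters, digits, underscore) is an assumption.

```python
def _is_word_char(ch: str) -> bool:
    # Assumption: letters and digits (Unicode-aware) plus underscore form words.
    return ch.isalnum() or ch == "_"


def find_whole_word(text: str, word: str, start: int = 0) -> int:
    """Return the index of the first whole-word occurrence of word, or -1."""
    if not word:
        return -1
    pos = text.find(word, start)
    while pos != -1:
        end = pos + len(word)
        # A hit only counts if it is not glued to a word character on either side.
        left_ok = pos == 0 or not _is_word_char(text[pos - 1])
        right_ok = end == len(text) or not _is_word_char(text[end])
        if left_ok and right_ok:
            return pos
        pos = text.find(word, pos + 1)
    return -1
```

With this check, searching for "a" in "cat a" skips the letter inside "cat" and reports the standalone word, and the comma in "Hello, world!" acts as a separator.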
Asbjørn Heid, September 14, 2014 at 11:57 PM
The reason I ask is that when I wrote my own BM implementation some 10 years ago, it struggled to outpace Pos() in several cases, and CPUs haven't grown any fonder of branching since then.
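For readers who have not seen where the branching comes from, here is a rough sketch of the Horspool simplification of Boyer-Moore in Python. It is neither the implementation mentioned in this comment nor the JCL's TUTBMSearch; the function name `horspool_find` is made up for illustration, and only the bad-character shift is used.

```python
def horspool_find(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Bad-character table: for each character except the pattern's last one,
    # the distance from its last occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        # Compare right to left. How soon this loop exits depends on the data,
        # which is the hard-to-predict branch discussed in the thread.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Shift by the table entry for the text character aligned with the
        # pattern's last position; unknown characters allow a full jump of m.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The skips get larger as the pattern gets longer and the alphabet gets bigger, so short patterns leave little room to beat a simple linear scan.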
Asbjørn Heid, September 16, 2014 at 2:14 PM
So it's been ages since I looked into this, and I decided to freshen up a little. If you use the modified version of BM (using Galil's rule), its worst case is O(n) and not O(n*m), where n is the text length and m the pattern length. That's certainly good compared to the regular Pos(). I haven't had time to check out the implementation in JCL yet, though, to see whether it uses this version or not.
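To make the O(n*m) figure concrete, here is a small Python sketch that counts character comparisons for a plain left-to-right scan. It assumes Pos() behaves like such a scan, as the comment implies; the function `naive_find_count` is hypothetical and not taken from the RTL or the JCL.

```python
def naive_find_count(text: str, pattern: str) -> tuple[int, int]:
    """Return (index or -1, number of character comparisons performed)."""
    comparisons = 0
    n, m = len(text), len(pattern)
    for pos in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j += 1
        else:
            # The inner loop finished without a mismatch: full match at pos.
            return pos, comparisons
    return -1, comparisons


# Adversarial input: every alignment matches m - 1 characters, then fails.
n, m = 1000, 10
index, count = naive_find_count("a" * n, "a" * (m - 1) + "b")
print(index, count)  # -1 9910, i.e. (n - m + 1) * m comparisons
```

A linear-time variant would stay within a constant multiple of n comparisons on the same input.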