Hashing n-grams by cyclic polynomials - java implementation
I'm solving some problem that involves Rabin–Karp string search algorithm. This algorithm requires rolling hash to be faster then naive search. This article describes how to implement rolling hash. I implemented "Rabin-Karp rolling hash" without problems and found few implementations implementations, but article also mentions computational complexity and that hashing n-grams by cyclic polynomials is prefered. It links to BuzHash imp开发者_如何学Clementation of such technique but I wonder how it can be used to build n-gram hash on top of it. I want to have something like this or
CPHash cp = new CPHash("efghijk");
cp.shiftRight('l') // now we got hash of "fghijki"
cp.shiftLeft('d') // "defghi"
for java.
For people who will encounter problems related with string search (like me) there are some articles that I found usefull: 1, 2, 3
I recently published an Apache licensed Java library which implements several rolling hash functions including Cyclic and Rabin-Karp:
http://code.google.com/p/rollinghashjava/
https://github.com/lemire/rollinghashjava
精彩评论