Ruby - Performance of regex
I am trying to find a faster way to match an exact word in a string. I am searching for the word in the 'title' field of a database table. The number of records varies widely, and the performance I am seeing is pretty scary.
Here are the 3 approaches I benchmarked:
title.split.include?(search_string)
/\b#{search_string}\b/ =~ title
title.include?(search_string)
The fastest is title.include?(search_string), but it matches substrings rather than whole words, and I need an exact word match.
require 'benchmark'

def do_benchmark(search_results, search_string)
  n = 1000

  # Regex with word boundaries; the interpolated pattern is built again on every iteration.
  Benchmark.bm do |x|
    x.report("word search:") {
      n.times {
        search_results.each { |search_result|
          title = search_result.title
          /\b#{search_string}\b/ =~ title
        }
      }
    }
  end

  # Split the title on whitespace and look for the term among the words.
  Benchmark.bm do |x|
    x.report("split.include? search:") {
      n.times {
        search_results.each { |search_result|
          title = search_result.title
          title.split.include?(search_string)
        }
      }
    }
  end

  # Plain substring check; fastest, but it also matches inside longer words.
  Benchmark.bm do |x|
    x.report("string include? search:") {
      n.times {
        search_results.each { |search_result|
          title = search_result.title
          title.include?(search_string)
        }
      }
    }
  end
end
"processing: 6234 records"
"Looking for term: red ferrari"
                              user     system      total        real
word search:             50.380000   2.600000  52.980000 ( 57.019927)
                              user     system      total        real
split.include? search:   54.600000   0.260000  54.860000 ( 57.854837)
                              user     system      total        real
string include? search:  21.600000   0.060000  21.660000 ( 21.949715)
Is there any way I can get better performance AND exact string match results?
You want full text search of a model field. This is best accomplished not by regex scans, but by a specialized index for full text retrieval. Rather than roll your own, I'd recommend using one of the following:
- acts_as_indexed
- Sphinx
- Ferret
- Xapian
- Lucene/Solr
Here are some links with more detail on the options:
- http://locomotivation.squeejee.com/post/109284085/mulling-over-our-ruby-on-rails-full-text-search-options
- Full Text Searching with Rails
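To give a rough idea of why an index beats scanning every title, here is a minimal sketch of an in-memory inverted index. It is only an illustration, not how any of the libraries above are implemented: the TitleIndex class and its add/search methods are made-up names, tokenizing on \w+ is an assumption, and a multi-word query is treated as "all words present" rather than as an adjacent phrase.

```ruby
# Hypothetical toy index: maps each lowercased word to the ids of the titles containing it.
class TitleIndex
  def initialize
    @postings = Hash.new { |hash, word| hash[word] = [] }
  end

  # Index one title; tokenizing on \w+ means "Ferrari," and "ferrari" are stored alike.
  def add(id, title)
    tokenize(title).uniq.each { |word| @postings[word] << id }
  end

  # Return the ids of titles that contain every word of the query (AND search).
  def search(query)
    words = tokenize(query)
    return [] if words.empty?
    words.map { |word| @postings.fetch(word, []) }.reduce(:&)
  end

  private

  def tokenize(text)
    text.downcase.scan(/\w+/)
  end
end

index = TitleIndex.new
index.add(1, "Red Ferrari for sale")
index.add(2, "Blue Ferrari, barely used")
index.add(3, "Redwood furniture")

p index.search("red ferrari")  # => [1]
p index.search("ferrari")      # => [1, 2]
p index.search("red")          # => [1]
```

The lookup cost depends on the number of words in the query, not on the number of titles, and "red" does not match "Redwood" because whole words are the index keys. A real full text engine adds persistence, stemming, ranking and phrase queries on top of this idea.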
Split your string on whitespace, go through each word in the resulting array, and compare it against the search term with the == operator.
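A minimal sketch of that suggestion follows. The helper names word_in_title? and phrase_matcher are made up for illustration. The second helper is an addition beyond the suggestion above: a term such as "red ferrari" contains a space, so it can never equal a single split word, and one way around that is a word-boundary regex that is built once outside the loop and escaped with Regexp.escape. It assumes Ruby 2.4 or later for Regexp#match?.

```ruby
# Exact single-word match: split on whitespace and compare each word with ==.
def word_in_title?(title, word)
  title.split.any? { |candidate| candidate == word }
end

# Phrase variant (an assumption, not part of the suggestion above): build the
# regex once, outside any loop, and escape the term so it is matched literally.
def phrase_matcher(phrase)
  /\b#{Regexp.escape(phrase)}\b/
end

titles = ["red ferrari for sale", "redwood ferrari poster", "a red ferrari"]

p titles.select { |t| word_in_title?(t, "red") }
# => ["red ferrari for sale", "a red ferrari"]

matcher = phrase_matcher("red ferrari")
p titles.select { |t| matcher.match?(t) }
# => ["red ferrari for sale", "a red ferrari"]

# A phrase never equals a single whitespace-separated word.
p word_in_title?("red ferrari for sale", "red ferrari")
# => false
```

Whether either variant is fast enough for thousands of records would need to be measured with the same benchmark as in the question.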