PHP: find 3-char words in query string to augment MySQL full-text search
I'm working on a simple MySQL full-text search feature on a CakePHP site, and noticed that MySQL strips short words (3 characters or fewer) out of the query. Some of the items on the site have 3-character titles, however, and I'd like to include them in the results. (I've ruled out more robust search tools like Solr due to budget constraints.)
So I want to find any 3-character words in the query string and do a quick lookup just on the title field. The easiest way I can think of to do this is to explode() the string and iterate over the resulting array with strlen() to find words of 3 characters. Then I'll take those words and do a LIKE search on the title field, just to make sure nothing that should obviously be in the results was missed.
Is there a better / easier way to approach this?
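For what it's worth, the explode()/strlen() loop described above can also be written as a split plus a filter. This is only a minimal sketch: findShortWords is a hypothetical name, and it assumes single-byte (ASCII) titles because strlen() counts bytes.

```php
<?php
// Hypothetical helper: collect the distinct words of exactly $length bytes.
function findShortWords(string $query, int $length = 3): array
{
    // Split on whitespace, commas and periods; drop empty pieces.
    $words = preg_split('/[\s,.]+/', $query, -1, PREG_SPLIT_NO_EMPTY);

    // strlen() counts bytes, so this assumes single-byte (e.g. ASCII) words.
    $short = array_filter($words, function ($word) use ($length) {
        return strlen($word) === $length;
    });

    // Drop duplicates and re-index so each term is looked up only once.
    return array_values(array_unique($short));
}

$terms = findShortWords('Is the new CMS, or PHP 5.3, any good?');
```

Each returned term could then feed the LIKE (or exact-match) lookup on the title field.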
UPDATE: Yes, I know about the ft_min_word_len setting in MySQL. I don't think I want to change it.
There is a system option named “ft_min_word_len” that defines the minimum length of words to be indexed. You can set this configuration directive to a lower value (e.g. 2); it goes under the [mysqld] section of your MySQL configuration file. This file is typically found under “/etc/mysql” or “/etc”. On Windows, look in the Windows directory or the MySQL home folder.
[mysqld]
ft_min_word_len=2
I'm going with my original idea for now, unless someone has a better approach not involving ft_min_word_len. (If I could set it at a per-database level, I might consider it, but otherwise it is too far-reaching.)
I have a function like this:
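// Strip commas and periods so that "abc," still counts as a 3-character word.
$query = str_replace(array(',', '.'), '', $query);
$terms = explode(' ', $query);
$short = array();
foreach ($terms as $term) {
    if (strlen($term) == 3) {
        // addslashes() is only a stopgap against quotes in the term;
        // the database driver's own escaping or bound parameters are safer.
        $short[] = '"' . addslashes($term) . '"';
    }
}
// Yields e.g. "abc", "xyz", or an empty string if no 3-character words were found.
return implode(', ', $short);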
And then I use the returned string to search the title column: WHERE title IN ($short), to supplement a full-text search. I arbitrarily assign a score of 3.5, so that the returned records can be sorted along with the other full-text search hits (I chose a relatively high score, since it is an exact match for the title of the record).
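To make the scoring step concrete, here is a rough sketch of how the two result sets might be combined in PHP. The row shape (id, title, score) and the name mergeSearchResults are assumptions for illustration, not the actual site code; only the fixed score of 3.5 comes from the description above.

```php
<?php
// Hypothetical row shape: ['id' => int, 'title' => string, 'score' => float].
// Title matches arrive without a score and are given the fixed $titleScore.
function mergeSearchResults(array $fullTextRows, array $titleRows, float $titleScore = 3.5): array
{
    $byId = [];
    foreach ($fullTextRows as $row) {
        $byId[$row['id']] = $row;
    }
    foreach ($titleRows as $row) {
        $row['score'] = $titleScore;
        // If a record appears in both sets, keep whichever score is higher.
        if (!isset($byId[$row['id']]) || $byId[$row['id']]['score'] < $titleScore) {
            $byId[$row['id']] = $row;
        }
    }
    $merged = array_values($byId);
    // Sort by score, highest first.
    usort($merged, function ($a, $b) {
        return $b['score'] <=> $a['score'];
    });
    return $merged;
}
```

Keying by id avoids listing a record twice when it matches both the full-text query and the exact-title lookup.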
This doesn't feel very elegant to me, but it resolves the problem.
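One way this could be tidied up is to bind the short terms as parameters instead of concatenating a quoted string into the IN (...) list. The sketch below only builds the SQL fragment and the values to bind; buildTitleInClause is a hypothetical name, and the commented PDO usage (with its $pdo connection and items table) is an assumption, since a CakePHP site would normally go through its own model layer.

```php
<?php
// Hypothetical helper: builds "title IN (?, ?, ...)" plus the values to bind.
function buildTitleInClause(array $terms): array
{
    if (count($terms) === 0) {
        // No short terms: the caller should skip the title lookup entirely.
        return ['sql' => '', 'params' => []];
    }
    // One "?" placeholder per term.
    $placeholders = implode(', ', array_fill(0, count($terms), '?'));
    return [
        'sql'    => "title IN ($placeholders)",
        'params' => array_values($terms),
    ];
}

$clause = buildTitleInClause(['CMS', 'PHP']);
// Assumed usage with PDO ($pdo and the items table are hypothetical):
// $stmt = $pdo->prepare("SELECT id, title FROM items WHERE {$clause['sql']}");
// $stmt->execute($clause['params']);
```

Because the terms never become part of the SQL text, quotes in the search string cannot break the query.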