Auto-suggest of airfields - how to order "best match first" across several columns?
Given the following table structure:
id iata icao name
1 ANC PANC Anchorage - Ted Stevens International
2 LHR EGLL London - Heathrow
3 LGW EGKK London - Gatwick
4 MDZ SAME Mendoza - Governor Francisco Gabrielli International
what's the best way to extract the records in "best match first" order, as you would expect in an auto-suggest field at a travel agent site? A match on ="string"
in any one column should take precedence over a match on LIKE "%string%"
in any other. After that, alphabetical order on "name" is probably what I want.
开发者_Python百科For example, searching "anc" would return row 1 then row 4. "lon" would get 3 then 2.
IATA codes are always 3 characters, ICAO always 4. However, either or both may be NULL.
While I would expect my AJAX to submit three characters and filter in Javascript from there, I can't assume that the input will be three characters - non-Javascript users will be submitting a string of arbitrary length.
I'm doing this in PHP with MySQL.
This is a tough nut to crack. Here are some factors I've used in the past, followed by their relative weight.
- Exact match: 50 (basically: 100% success)
- Levenshtein distance < 2: 25 (weeds out spelling mistakes - fairly intensive algo, though)
- Partial match at beginning of name: 15
- Partial match somewhere in the name: 10
A full match would get 100 points based on this (it matches all). I then simply sort by sum-of-weight.
For my dataset (song names) this yielded pretty good results. Not sure how it holds up with yours.
In your case, I'd add a "if input is 3 characters, then 50 points if it matches IATA, if 4 characters, 50 points if it matches ICAO" to the list.
How to do this in SQL:
SELECT SUM(
IF(`name` = :Input, 50, 0),
IF(`iata` = :Input, 50, 0),
IF(`icao` = :Input, 50, 0),
IF(`name` LIKE CONCAT('%', :Input, '%'), 25, 0),
...
) FROM Table;
Sorting was done in PHP.
Not beautiful or elegant, but it worked.
Here's a fairly simple approach:
SELECT ID, 1 As Certainty FROM airfields where iata = @searchValue OR icao = @searchValue OR name = @searchValue
UNION
SELECT ID, 2 FROM airfields where iata <> @searchValue AND icao <> @searchValue AND NAME <> @searchValue
AND (iata LIKE '%' + @searchValue + '%" OR icao LIKE '%' + @searchValue + '%" OR name LIKE '%' + @searchValue + '%")
ORDER BY Certainty
So, perfect matches always take precedence. You'd probably want to tighten up the ordering after that.
Use fulltext indexes if you are running MyISAM type tables. Match-against will give you relevancy using which you could order your results. http://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html#function_match
Use Apache Solr or Sphinx instead.
精彩评论