开发者

Auto-suggest of airfields - how to order "best match first" across several columns?

Given the following table structure:

id  iata  icao  name
 1  ANC   PANC  Anchorage - Ted Stevens International
 2  LHR   EGLL  London - Heathrow
 3  LGW   EGKK  London - Gatwick
 4  MDZ   SAME  Mendoza - Governor Francisco Gabrielli International

what's the best way to extract the records in "best match first" order, as you would expect in an auto-suggest field at a travel agent site? A match on ="string" in any one column should take precedence over a match on LIKE "%string%" in any other. After that, alphabetical order on "name" is probably what I want.

开发者_Python百科For example, searching "anc" would return row 1 then row 4. "lon" would get 3 then 2.

IATA codes are always 3 characters, ICAO always 4. However, either or both may be NULL.

While I would expect my AJAX to submit three characters and filter in Javascript from there, I can't assume that the input will be three characters - non-Javascript users will be submitting a string of arbitrary length.

I'm doing this in PHP with MySQL.


This is a tough nut to crack. Here are some factors I've used in the past, followed by their relative weight.

  1. Exact match: 50 (basically: 100% success)
  2. Levenshtein distance < 2: 25 (weeds out spelling mistakes - fairly intensive algo, though)
  3. Partial match at beginning of name: 15
  4. Partial match somewhere in the name: 10

A full match would get 100 points based on this (it matches all). I then simply sort by sum-of-weight.

For my dataset (song names) this yielded pretty good results. Not sure how it holds up with yours.

In your case, I'd add a "if input is 3 characters, then 50 points if it matches IATA, if 4 characters, 50 points if it matches ICAO" to the list.

How to do this in SQL:

SELECT SUM(
    IF(`name` = :Input, 50, 0),
    IF(`iata` = :Input, 50, 0),
    IF(`icao` = :Input, 50, 0),
    IF(`name` LIKE CONCAT('%', :Input, '%'), 25, 0),
    ...
) FROM Table;

Sorting was done in PHP.

Not beautiful or elegant, but it worked.


Here's a fairly simple approach:

SELECT ID, 1 As Certainty FROM airfields where iata = @searchValue OR icao = @searchValue OR name = @searchValue
UNION
SELECT ID, 2 FROM airfields where iata <> @searchValue AND icao <> @searchValue AND NAME <> @searchValue
AND (iata LIKE '%' + @searchValue + '%" OR icao LIKE '%' + @searchValue + '%" OR name LIKE '%' + @searchValue + '%")
ORDER BY Certainty

So, perfect matches always take precedence. You'd probably want to tighten up the ordering after that.


Use fulltext indexes if you are running MyISAM type tables. Match-against will give you relevancy using which you could order your results. http://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html#function_match

Use Apache Solr or Sphinx instead.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜