开发者

Ranking search keywords

Question is: How to rank keywords that have been used in search queries in my web application based on time and number of search?

A user types his search query in the text box. Via AJAX I need to return some suggestions to the user. These suggestions are based on number of search done for that keyword, and should be sorted by most recently searched.

For example if a user enters the search term as "hang" the suggestions should be in this order: "hangover part 2", "hangover".

How should I design the database to store the search queries?How should I write the 开发者_JAVA百科sql query to get the suggestions?


For query suggestion a good way is to count the number of occurrences of each search query (it is probably better to not count repeated queries made by the same user). You'll have a file/table/something (query, count) like this:

"britney spears" 12
"kelly clarkson" 5
"billy joel" 27
"query abcdef" 2
"lady gaga" 39
...

Then you can sort by descending order of occurrence:

"lady gaga" 39
"billy joel" 27    
"britney spears" 12
"lady xyz" 5
"query abcdef" 2
...

Then when someone is searching "lady", for example, do a prefix search on all strings from the top of the file/table/something to the bottom. If you only want K suggestions you'll go only until you find the Top-K suggestions.

You could implement this using a simple file, or you can also have a counting query table and do a query similar to:

SELECT q.query from (SELECT * from search_queries order by query_count DESC) as q where q.query LIKE "prefix%" LIMIT 0,K

Two notes:

  1. There are better (and more difficult) ways of doing this. Amazon, for example, has a pretty nice query suggestion.
  2. The provided solution will only suggest queries that starts with the user query. Like:

    "lady" => ["lady gaga", "lady xyz"]

Query "lady" won't match "gaga lady". For them to match you will need query indexing, through the Full-Text Search support of your database or an external library such as Lucene.


Ideally, you'd sort on something like the following:

order by sum(# of searches / (how long ago that search was performed + 1))

This would have to be modified so that how long ago would be base on an appropriate base time. For example, if you want searches to count as half after a week, you'd make a week = 1.

This will clearly be inefficient, because calculating how long ago each search was performed for all search results will be time consuming. Thus, you might want to keep a running total for each search and multiply the totals by a certain value each time period. For example, if you want searches to count as half after a week, you would add one to that column for every search. Then, you would have a process that multiplies the search column by .5 every week. Then you just sort on that column.


Do you need something like autosuggestion? There is an JQuery plugin called autocomplete which only looks for similar words as soon as the user types in the letters. However, if you want to get the suggestions based on the number of times that keyword is searched by user, then you need to store the keywords in a separate table and then fetch it later for other user?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜