Querying in 2 million + rows: response times and optimization? (PHP Codeigniter)
EDIT: When I remove the index on the column I'm searching, the response time goes from around 2.5 to around 2.6 sec. Shouldn't the difference be a hell of a lot more?
I'm writing a small application that uses a jquery autosuggest plugin. The user starts typing the name of his home town and my code queries a table of 2.7 million rows containing city names (maxmind city database) with AJAX. I set an index on the db column being searched (city_name).
When I run the following query (using CodeIgniter Active Record) CI's benchmark class reports 2.1922 sec (for that query only, so this does not include html page load times).
SELECT * FROM cities WHERE city_name LIKE "%bang%"
When I run the same query in phpmyadmin I get: Query took 0.0068 sec
When I run
SELECT * FROM cities WHERE city_name LIKE "%bangkok%"
CI's benchmark class reports 2.1951 sec. When I run this same query in phpmyadmin I get: Query took 2.1811 sec.
So while CI's response times are almost identical for both queries, the ones run in phpmyadmin are vastly different.
Questions
Obviously querying 2.7 million rows will be relatively slow, but is there any way I can get decent response times for such autosuggest functionality? 2.5 sec is way too slow. AFAIK there is no fast way to do a LIKE-style search in a PHP array containing all 2.7 million rows. Would memcache be an option?
Why are the response time differences negligible when using CI Active Record, but so vastly different in phpmyadmin?
You can get a significant benefit (at least 100 times faster) by indexing the column you are searching on.
Take a look at 7.3.1. How MySQL Uses Indexes.
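A caveat on that advice: a B-tree index only helps a LIKE query when the pattern has no leading wildcard. Here is a minimal sketch of the difference, using SQLite in Python rather than MySQL (the table and index names are made up for the demo, but the optimizer behavior is the same idea):

```python
import sqlite3

# Tiny stand-in for the cities table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city_name TEXT)")
conn.executemany("INSERT INTO cities VALUES (?)",
                 [("bangkok",), ("bangalore",), ("berlin",)])
conn.execute("CREATE INDEX idx_city_name ON cities (city_name)")
# SQLite only applies the LIKE-prefix optimization when LIKE is
# case-sensitive; MySQL applies it by default.
conn.execute("PRAGMA case_sensitive_like = ON")

# Prefix pattern: the planner can seek into the index.
prefix_plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM cities WHERE city_name LIKE 'bang%'"
).fetchall()
# Leading wildcard: the planner must scan every row.
wildcard_plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM cities WHERE city_name LIKE '%bang%'"
).fetchall()

print(prefix_plan)    # index search on idx_city_name
print(wildcard_plan)  # full table scan
```

This is why the question's `LIKE "%bang%"` queries cannot benefit from the index: for an autosuggest box, matching on the typed prefix (`LIKE 'bang%'`) is usually both good enough and index-friendly.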
EDIT
Are phpMyAdmin and your benchmark tool running on the same machine? phpMyAdmin is typically installed on the same machine as the database, but the benchmark tool may not be. The benchmark's timing probably includes the network overhead (not the HTML page rendering).
Most likely what you want is a trie-like data structure, such as a kart-trie. It is a dictionary structure, and you can load all the cities into memory with it. You can also build a database from it by reducing the trie to a nested set. A kart-trie differs from a radix trie or Patricia trie in that each node has at most 2 leaves, which makes it very easy to reduce to a nested set.
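To make the idea concrete, here is a sketch of the in-memory approach using a plain prefix trie (not a kart-trie; this is a simplified illustration, and the class and method names are invented for the example):

```python
# A minimal prefix trie for autosuggest: insert all city names once,
# then answer prefix queries without scanning the whole list.
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, limit=10):
        # Walk down to the node for the prefix, then collect the
        # words stored beneath it in lexicographic order.
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            cur, word = stack.pop()
            if cur.is_word:
                results.append(word)
            # Push children in reverse order so the stack pops them
            # alphabetically.
            for ch, child in sorted(cur.children.items(), reverse=True):
                stack.append((child, word + ch))
        return results

trie = Trie()
for city in ["bangkok", "bangalore", "berlin", "boston"]:
    trie.insert(city)
print(trie.suggest("bang"))  # ['bangalore', 'bangkok']
```

Lookups cost time proportional to the prefix length plus the number of suggestions returned, independent of the 2.7 million rows, which is why this family of structures suits autosuggest. The memory cost of holding every city name in the PHP/Python process is the trade-off.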
The query cache is enabled by default; you need to add SQL_NO_CACHE to your query to bypass it:
SELECT SQL_NO_CACHE * FROM TABLE...
More here: MySQL - force not to use cache for testing speed of query
You can create an index on the field in your WHERE condition, but the index will not be used for a wildcard search such as '%abc%', because in that case MySQL has to traverse every record to match the criteria. Only a prefix pattern like 'abc%' can use the index.
You can also run EXPLAIN on the query to analyze it and see whether it is using the index. For example:
explain SELECT * FROM cities WHERE city_name LIKE "%bangkok%"