Optimizing a query
I have a table foo that records sightings of bird species. foo_id is its primary key; the other relevant columns are s_date, latitude, and longitude, and species_id is a foreign key. I have indexes on s_date, latitude and longitude, and species_id. The table has 20 million rows and is growing. The following query returns the 10 most recently sighted species within a given latitude/longitude range. It takes too long (sometimes more than 10 minutes). How can I optimize it? I am using MySQL.
SELECT species_id, max(s_date)
FROM foo
WHERE latitude >= minlat
AND latitude <= maxlat
AND longitude >= minlon
AND longitude <= maxlon
GROUP BY species_id
ORDER BY MAX(s_date) DESC LIMIT 0, 10;
I understand that you have separate indexes on the fields you mention. You may want to try adding a composite index (also known as a multiple-column index) on (latitude, longitude):
CREATE INDEX ix_foo_lat_lng ON foo (latitude, longitude);
You may also want to run EXPLAIN on your query to see which index(es) MySQL is using. Quoting from the MySQL Manual :: How MySQL Uses Indexes:
Suppose that you issue the following SELECT statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
If a multiple-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer will attempt to use the Index Merge optimization, or attempt to find the most restrictive index by deciding which index finds fewer rows and using that index to fetch the rows.
You may also be interested in checking out the following presentation:
- Geo/Spatial Search with MySQL [1] by Alexander Rubin
The author describes how you can use the haversine formula in MySQL to order by proximity and limit your searches to a defined range. He also describes how to avoid a full table scan for such queries, using traditional indexes on the latitude and longitude columns.
[1] PDF version
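To make the haversine idea concrete, here is a minimal Python sketch (not taken from the presentation) of the two pieces involved: the great-circle distance itself, and a bounding box around a search point that yields range values like the minlat, maxlat, minlon, and maxlon used in the question's query. The function names, the constant, and the choice of kilometres are assumptions made for illustration, and the box calculation deliberately ignores the poles and the 180-degree meridian.

```python
import math

# Mean Earth radius in kilometres; an approximation (hypothetical constant name).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_km):
    """Return (minlat, maxlat, minlon, maxlon) enclosing a circle of radius_km.

    Simplified sketch: only valid away from the poles and the
    180-degree meridian, where the box would need special handling.
    """
    delta = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(delta)
    # Longitude half-width grows with latitude because meridians converge.
    dlon = math.degrees(math.asin(math.sin(delta) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


if __name__ == "__main__":
    # Example: a 50 km search radius around an arbitrary point.
    minlat, maxlat, minlon, maxlon = bounding_box(45.0, 10.0, 50.0)
    print(minlat, maxlat, minlon, maxlon)
```

The four returned values could be bound to the range conditions in the WHERE clause so that the indexes narrow the candidate rows first, with the haversine distance applied only to those rows if an exact radius or proximity ordering is needed.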