Non-trivial geolocation query db caching
I first have to say that I really am a rookie in caching, so please do elaborate on any explanation and bear with me if my question is stupid.
I have a server with pretty limited resources, so I'm really interested in caching db-queries as effectively as I can. My issue is this:
I have a MySQL DB with a table for geolocations with two columns, lat and lng. I only indexed lat, since a query will always have both lat and lng, and to my understanding only one index can be used effectively per query (?).
The queries vary a lot in their coordinates, e.g.

select lat, lng
from mytable
where lat BETWEEN 123123123 AND 312412312 AND lng BETWEEN 34123124 AND 235124231

where the long numbers bounding the BETWEEN clauses change constantly. So IS there a smart way to cache this, so that a cache hit doesn't require an exact query match, but the ranges of previous BETWEEN queries can be checked against a new one to save some db resources?
I hope you get my question - if not please ask.
Thank you so much
Update 24/01/2011
Now that I've gotten some responses, I want to know what the most efficient way of querying would be:
- Would the BETWEEN query with int values execute faster, or
- would the radius calculation with Point values execute faster?
If the first, what would the optimal index look like?
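On the radius option: a common pattern is to run the cheap integer BETWEEN bounding box in SQL first, then apply an exact haversine radius check to the few surviving rows in application code. A sketch of that second pass (the fixed-point scale of 1e7, i.e. degrees stored as integers times ten million, is an assumption about the encoding):

```python
import math

SCALE = 10_000_000  # assumed fixed-point scale: degrees * 1e7 stored as ints


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two coordinates in degrees."""
    r = 6_371_000  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lng2 - lng1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def radius_filter(rows, center_lat, center_lng, radius_m):
    """Exact-radius pass over (lat_int, lng_int) rows that already
    survived the cheap BETWEEN bounding-box query."""
    return [(a, b) for a, b in rows
            if haversine_m(a / SCALE, b / SCALE,
                           center_lat, center_lng) <= radius_m]
```

The bounding box keeps the index usable; the trig only runs on the handful of rows the box lets through.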
If your table is MyISAM, you can use the Point datatype (see this answer for more details).
If you are not willing or not able to use spatial indexes, you should create two separate indexes:
CREATE INDEX ix_mytable_lat_lon ON mytable (lat, lon)
CREATE INDEX ix_mytable_lon_lat ON mytable (lon, lat)
In this case, MySQL can use an index_intersect over these indexes, which is sometimes faster than mere filtering with a single index. Even if it does not, it can pick the more selective of the two indexes.
As for caching: all pages read from the indexes are cached and stay in memory until they are overwritten with hotter data (if the whole database does not fit in the cache). This saves MySQL from having to read the data from disk.
MySQL is also able to cache whole resultsets in memory; however, this requires the query to be repeated verbatim, with all parameters exactly the same.
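That verbatim requirement can actually be exploited: if the application snaps every bounding box outward to a fixed grid before building the SQL, nearby requests produce byte-identical query strings and can hit the result cache. A sketch of that normalization (the snap step and the exact query shape are illustrative assumptions):

```python
STEP = 100_000  # assumed snap granularity, in the stored integer units


def snapped_query(lat_min, lat_max, lng_min, lng_max):
    """Round the box outward to multiples of STEP so that nearby
    bounding boxes yield byte-identical SQL strings."""
    lo = lambda v: (v // STEP) * STEP          # floor to a multiple of STEP
    hi = lambda v: -((-v) // STEP) * STEP      # ceiling to a multiple of STEP
    return ("SELECT lat, lng FROM mytable WHERE lat BETWEEN %d AND %d "
            "AND lng BETWEEN %d AND %d"
            % (lo(lat_min), hi(lat_max), lo(lng_min), hi(lng_max)))
```

The cached result is a superset of the requested box, so the application must still filter the returned rows down to the exact bounds.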
I think to do significantly better you'll need to characterize your data better. If you've got data that's uniformly distributed across longitude and latitude, with no correlation, and if your queries are similarly distributed and independent - you're stuck. But if your data or your queries cluster in interesting ways, you may find that you can introduce new columns that make at least some queries quicker. If most queries happen within some hard range, maybe you can set that data aside - add a flag, link it to some other table, even put the frequently-requested data into its own table. Can you tell us any more about the data?