MySQL: SELECT with BETWEEN on two columns runs too slowly
I have this query:
SELECT `country`
FROM `geoip_base`
WHERE 1840344811 BETWEEN `start` AND `stop`
It uses the index poorly (an index is used, but a large part of the table is still scanned) and runs too slowly. I tried adding ORDER BY and LIMIT, but that did not help.
"start <= 1840344811 AND 1840344811 <= stop" performs similarly.
CREATE TABLE IF NOT EXISTS `geoip_base` (
`start` decimal(10,0) NOT NULL,
`stop` decimal(10,0) NOT NULL,
`inetnum` char(33) collate utf8_bin NOT NULL,
`country` char(2) collate utf8_bin NOT NULL,
`city_id` int(11) NOT NULL,
PRIMARY KEY (`start`,`stop`),
UNIQUE KEY `start` (`start`),
UNIQUE KEY `stop` (`stop`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
The table has 57,424 rows.
EXPLAIN for the query "... BETWEEN START AND STOP ORDER BY START LIMIT 1" shows that it uses the key stop and examines 24099 rows.
Without ORDER BY and LIMIT, MySQL does not use any key and reads all rows.
If your table is MyISAM, you can improve this query using SPATIAL indexes:
ALTER TABLE
geoip_base
ADD ip_range LineString;
UPDATE geoip_base
SET ip_range =
LineString
(
Point(-1, `start`),
Point(1, `stop`)
);
ALTER TABLE
geoip_base
MODIFY ip_range LineString NOT NULL;
CREATE SPATIAL INDEX
sx_geoip_range ON geoip_base (ip_range);
SELECT country
FROM geoip_base
WHERE MBRContains(ip_range, Point(0, 1840344811));
This article may be of interest to you:
- Banning IP's
Alternatively, if your ranges do not intersect (and from the nature of the database I expect they don't), you can create a UNIQUE index on geoip_base.start and use this query:
SELECT *
FROM geoip_base
WHERE 1840344811 BETWEEN `start` AND `stop`
ORDER BY
`start` DESC
LIMIT 1;
Note the ORDER BY and LIMIT clauses; they are important.
This query is similar to this:
SELECT *
FROM geoip_base
WHERE `start` <= 1840344811
AND `stop` >= 1840344811
ORDER BY
`start` DESC
LIMIT 1;
Using ORDER BY / LIMIT makes the query choose a descending index scan on start, which stops at the first match (i.e. at the range whose start is closest to the IP you enter). The additional filter on stop just checks whether that range contains the IP.
Since your ranges do not intersect, either this range or no range at all will contain the IP you're after.
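The reasoning above can be sketched outside the database. The following is only an illustration in Python, not what MySQL executes internally: the sample ranges and the `lookup` helper are hypothetical, and a sorted list with binary search stands in for the index on start.

```python
from bisect import bisect_right

# Hypothetical, non-overlapping ranges sorted by start: (start, stop, country).
RANGES = [
    (100, 199, "AA"),
    (200, 299, "BB"),
    (400, 499, "CC"),
]
STARTS = [r[0] for r in RANGES]


def lookup(ip):
    """Return the country of the range containing ip, or None."""
    # Index of the last range whose start is <= ip. This plays the role of
    # ORDER BY start DESC LIMIT 1 over an index on start.
    i = bisect_right(STARTS, ip) - 1
    if i < 0:
        return None  # ip lies below every range
    start, stop, country = RANGES[i]
    # The filter on stop: because ranges do not overlap, only this
    # candidate can contain ip.
    return country if ip <= stop else None
```

Here `lookup(250)` finds the range starting at 200, while `lookup(350)` picks the same candidate, fails the stop check, and returns None, which mirrors the "this range or no range at all" argument.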
While Quassnoi's answer (https://stackoverflow.com/a/5744860/1095353) is perfectly fine, the MySQL (5.7) function MBRContains(g1,g2) does not cover the full range of IPs when used in the SELECT: it matches [g1,g2[, not including g2.
Using MBRTouches(g1,g2) allows both endpoints, [g1,g2], to be matched. Having the IP blocks stored in the database as start and stop columns makes this function more viable.
On a database table with ~6M rows (AWS db.m4.xlarge):
SELECT *, AsWKT(`ip_range`) AS `ip_range`
FROM `geoip_base` where `start` <= 1046519788 AND `stop` >= 1046519788;
~ 2-5 seconds
SELECT *, AsWKT(`ip_range`) AS `ip_range`
FROM `geoip_base` where MBRTouches(`ip_range`, Point(0, INET_ATON('XX.XX.XX.XX')));
~ < 0.030 seconds
Source: MBRTouches(g1,g2) - https://dev.mysql.com/doc/refman/5.7/en/spatial-relation-functions-mbr.html#function_mbrtouches
Your table design is off.
You're using DECIMAL with no fractional digits. Each such value takes 5 bytes of storage, whereas a simple INT (4 bytes; UNSIGNED if you need the full IPv4 range) would suffice.
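As a quick sanity check on the INT suggestion, here is a small Python sketch (standard library only; the variable names are illustrative) showing that the largest IPv4 address does not fit in a signed 4-byte integer but does fit in an unsigned one:

```python
import ipaddress

# Largest IPv4 address as an integer: 2**32 - 1.
max_ip = int(ipaddress.IPv4Address("255.255.255.255"))

SIGNED_INT_MAX = 2**31 - 1    # upper bound of a signed 4-byte INT
UNSIGNED_INT_MAX = 2**32 - 1  # upper bound of a 4-byte INT UNSIGNED

fits_signed = max_ip <= SIGNED_INT_MAX      # False
fits_unsigned = max_ip <= UNSIGNED_INT_MAX  # True

# The value from the question, converted back to a dotted address.
sample = ipaddress.IPv4Address(1840344811)
```

The value used in the question happens to be below 2**31, but addresses in the upper half of the IPv4 space would overflow a signed column.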
After that, you create a compound primary key (5 + 5 bytes) followed by two unique constraints (again 5 bytes each), effectively making your index file almost the same size as the data file. That way, whatever you index is extremely inefficient.
Using LIMIT doesn't force MySQL to use indexes, at least not the way you constructed your query. MySQL will obtain the dataset satisfying the condition and then discard the rows that fall outside the offset and limit.
Also, using MySQL keywords (such as START and STOP) as column names is a bad idea; avoid naming your columns after keywords.
What would be useful is to keep your primary key as it is and not index the columns separately. Also, configuring MySQL to use more memory would speed up execution.
For testing purposes I created a table similar to yours, defined a compound key on start and stop, and used the following query:
SELECT `country` FROM table WHERE 1500 BETWEEN `start` AND `stop` AND start >= 1500
My table is InnoDB with 100k rows inserted. This way the query examines 87 rows and executes in a few milliseconds. My buffer pool size is 90% of the memory on my test machine. That might give some insight into optimizing your query / DB instance.
SELECT id
FROM GEODATA
WHERE start_ip <= (SELECT INET_ATON('113.0.1.63'))
AND end_ip >= (SELECT INET_ATON('113.0.1.63'))
ORDER BY start_ip DESC
LIMIT 1;
The above example from Michael J.V. will not work:
SELECT country
FROM table
WHERE 1500 BETWEEN start AND stop
AND start >= 1500
1500 BETWEEN start AND stop is the same as start <= 1500 AND stop >= 1500.
Thus you have start <= 1500 AND start >= 1500 in the same clause. The only way a row can match is if start = 1500, which is why the optimizer knows to use the start index.
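To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable; the sample rows and the helper names `plain` and `with_extra` are hypothetical). It compares the plain BETWEEN query with the variant that adds start >= value:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geoip_base (start INTEGER, stop INTEGER, country TEXT)"
)
conn.executemany(
    "INSERT INTO geoip_base VALUES (?, ?, ?)",
    [(1000, 1499, "AA"), (1500, 1999, "BB")],
)


def plain(ip):
    """Countries of ranges containing ip (plain BETWEEN)."""
    rows = conn.execute(
        "SELECT country FROM geoip_base WHERE ? BETWEEN start AND stop",
        (ip,),
    )
    return [r[0] for r in rows]


def with_extra(ip):
    """Same query with the extra 'start >= ip' condition added."""
    rows = conn.execute(
        "SELECT country FROM geoip_base "
        "WHERE ? BETWEEN start AND stop AND start >= ?",
        (ip, ip),
    )
    return [r[0] for r in rows]
```

With these rows, `plain(1700)` returns the range starting at 1500, while `with_extra(1700)` returns nothing. The extra condition only matches when the value equals a range's start, as in `with_extra(1500)`.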