开发者

MYSQL select distinct(indexed_column) from table with 1 million rows

Edit: actually aside from select distinct(which I haven't verified yet), the main performance bottle neck might be the network speed, when server and client are both on localhost, select all 2 milion records took 36 seconds, however, on a (supposedly high speed) network with client sitting on another box, the query is not yet done after 10 minutes.

This is supposedly 100mbps network but when I checked the client(java jdbc), it's receiving data at a rate of 3kb/second. The mysql server, however, is sending at a speed o开发者_JS百科f 100kb/sec(including other client connections though). Why is the java jdbc client receiving data at such a low rate?

select distinct(indexed_column) from mytable

is very slow on mytable with only 1 million rows, the indexed_column is a non-unique index. is there a way to optimize it? an explain gives this:

id | select_type | table        | type  | possible_keys | key               | key_len | ref  | rows    | Extra     
|  1 | SIMPLE      | mytable | range | NULL          | my_index | 50      | NULL | 1759002 | Using index for group-by | 

does type=range means it's not using index? is this why it's slow?


I would build a unique index on the table, on the column you want "DISTINCT" of...

Hence your looking for DISTINCT on a given column. If you build a UNIQUE INDEX on a column(or columns) you are looking for distinct combinations, the indexed pages will only hold a pointer to the first record that qualifies for such combination.

Ex: if you have

Category   Count
1          587
2          321
3          172
4          229
5          837

Your UNIQUE INDEX on category will only have 5 records... in this case, even though over 2,000 entries across 5 categories, the DISTINCT CATEGORY count is 5, the index has 5, you're done. Apply this concept to your table of 1 MILLION + records.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜