update too slow from table with one record
Please consider the table bd.nn that contains roughly 19 million records:
> SELECT n, name, cdf from bd.nn limit 4;
+-------+-------------------------------+--------+
| n | name | cdf |
+-------+-------------------------------+--------+
| 10002 | JOJE | NULL |
| 10010 | AUGUSTINA 开发者_Python百科 | NULL |
| 10029 | CARDOS | NULL |
| 10037 | DE FRITOS | NULL |
+-------+-------------------------------+--------+
Primary key in n
with BTREE type.
The task is to UPDATE field cdf from the following table;
> SELECT * from n_cdf_temp;
+-------+--------+
| n | cdf |
+-------+--------+
| 10002 | 16 |
+-------+--------+
Primary key in n
with BTREE type.
Both n
fields still have char(9)
type although it's planned to move to int
type.
This table only has one record as an example but should have some million lines as well.
The commands I tried were, by this order:
UPDATE bd.nn y SET cdf = (SELECT cdf from temp.n_cdf_temp t WHERE t.n = y.n);
UPDATE bd.nn y SET cdf = (SELECT cdf from temp.n_cdf_temp t WHERE t.n = y.n) WHERE y.n in (SELECT n from temp.n_cdf_temp WHERE cdf IS NOT NULL);
UPDATE bd.nn y INNER JOIN temp.n_cdf_temp t ON y.n=t.n SET y.cdf = t.cdf;
UPDATE bd.nn y FORCE KEY (PRIMARY) INNER JOIN temp.n_cdf_temp t ON y.n=t.n SET y.cdf = t.cdf;
The problem is that this update takes a long time. In my development laptop (Core 2 Duo) with MyISAM tables, the update for the fourth command took 6,5 seconds and for the third about 30 seconds. If n_cdf_temp has 1000 records, the MyISAM engine takes 45 seconds.
But in the production server, with InnoDB tables, the update took 14 minutes when n_cdf_temp had only one record. When n_cdf_temp had 1000 records, the query took ca. 18 minutes. MySQL version 5.0.67 in old Linux box P4, 1GB RAM.
What else should I do to greatly improve the UPDATE performance for acceptable time?
ETA EXPLAIN for SELECT version of queries 3:
EXPLAIN SELECT y.n, t.cdf from bd.nn y INNER JOIN temp.n_cdf_temp t ON y.n=t.n\G
** 1. row id: 1 select_type: SIMPLE table: t type: index possible_keys: PRIMARY key: cdf key_len: 2 ref: NULL rows: 1 Extra: Using index
** 2. row id: 1 select_type: SIMPLE table: y type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 18744700 Extra: Range checked for each record (index map: 0x1) –
From the EXPLAIN you posted, it appears that it is refusing to use the index on your 'y' table. In row 2 of the EXPLAIN, note 'key: NULL key_len: NULL'. Is it possible that the datatypes for the primary keys of the 2 tables are different?
One way to solve this problem: make 2 requests.
One is: SELECT cdf from n_cdf_temp WHERE n = 'X';
Second: UPDATE bd.nn y SET cdf = 'result of first request' WHERE n = 'X';
Sure, the method is not ideal, but it very simple and it works.
Or you want update table for all values of 'X'?
精彩评论