mysql query optimization problem
I am trying to import data from a large database. I have two tables with couple of hundred thousands records. I have to search data in two tables then insert that record in a new table (3rd table), if the record already exists in 3rd table then I have to updat开发者_Python百科e one column record in 3rd table.
This sounds easy, but it is taking so long to process.
Below are sample queries and pseudo code:
select * from table1 INNER JOIN table2 USING(id)
search in table 3 ->
if record exist{
update record in table 3 (update counter in a column)
}else{
Insert new record in table 3
}
First and second table has more than two hundred thousands records. As I start inserting record in 3rd table it kills the whole speed because then it also have to search in 3rd table to update or insert a record.
Database Name = MySql
Language = Php
What is the problem? How can I improve this? I can not wait hours to process it :(
Thanks
EDIT:
In table 3, id has primary key and all other columns are normal. Database schema is too big plus complex. Do you guys want , I paste hundred of lines here?
Can you please guys point out mistake in my pseudo code and query? What index or structure I can use to improve performance??
Structure
Table 1 - usr_id, first name, last name (usr_id is primary key)
Table 2 - id, usr_id, amount (id is primary key and usr_id is foreign key)
Table 3 - new_id , first name, last name, usr_id, total_amount (new_id is primary key and usr_id is foreign key
)
I check if table 3 has same first name and last name then update total amount, if they are different then insert a new record
From your pseudo code I see that you check record existence in Table 3 for each record in select * from table1 INNER JOIN table2 USING(id)
. You eventually end up with thousands of select
commands that degrade performance. You can improve it by
select t1.*, t2.*, t3.pk_field from table1 t1 INNER JOIN table2 t2 USING(id)
LEFT JOIN table3 t3 ON (join codition)
Now you can just test if t3.pk_field
is null (insert if it's null, update otherwise). The next step to improve it is to write one query that takes care of everything , using INSERT ON DUPLICATE KEY UPDATE
as proposed by jasonbar:
INSERT INTO table3(col1, col2,...)
select t1.col1, t1.col2, t2.col3, ....
from table1 t1 INNER JOIN table2 t2 USING(id)
ON DUPLICATE KEY UPDATE counter = new_value
You should look into the INSERT ON DUPLICATE KEY UPDATE syntax.
Something like:
INSERT INTO `table3` (`col1`, `col2`) VALUES('val1', 'val2')
ON DUPLICATE KEY UPDATE `counter` = `counter` + 1;
If a simple select query to see if the row already exists in table3 is taking too long, you probably aren't using indexes (or at least not using them correctly). We would need a fair bit more information to try and troubleshoot that.
精彩评论