Slow mysql query / Possible query problem

2023-03-05 01:10 问答作者：

I have 2 (similar) queries:

-- Query #1 - get all new products not in currently in the Product table
-- Should match any products in the temp table that do not exist in the Product table
INSERT
  INTO `tmpProductState` (`ProductId`, `ChangedOn`, `State`)
SELECT t.`ProductId`, t.`ProcessedOn`, \'Activated\'
  FROM `tmpImport` t
  LEFT JOIN `Product` p USING (`ProductId`)
 WHERE p.`ProductId` IS NULL
    ON DUPLICATE KEY UPDATE
       `State` = VALUES(`State`)

-- Query #2 - get all Products that are removed from the Product table
-- Should match any products in the Product table that do not exist in the temp table
INSERT
  INTO `tmpProductState` (`ProductId`, `ChangedOn`, `State`)
SELECT p.`ProductId`, p.`LastSeenDate`, \'Deactivated\'
  FROM `Product` p
  LEFT JOIN `tmpImport` t USING (`ProductId`)
 WHERE t.`ProductId` IS NULL
    ON DUPLICATE KEY UPDATE
       `State` = VALUES(`State`)

On first run, (day 1), when the Product table is empty, both queries run in under 1 second, however, on second run, (day 2), when the Product table has 14000 records, the first query runs in under 2 seconds, and the second query runs in 244 seconds. Each consecutive data import is the same (240-250 seconds for query # 2. On checking the database, all data seems correct, I just can't figure out why the second query is taking so long.

---> Edit: Slow Query log:

# Query_time: 245.328784  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 187711973
SET timestamp=1305151558;

INSERT
  INTO `tmpProductState` (`ProductId`, `ChangedOn`, `State`)
SELECT p.`ProductId`, p.`LastSeenDate`, 'Deactivated'
  FROM `Product` p
  LEFT JOIN `tmpImport` t USING (`ProductId`)
 WHERE t.`ProductId` IS NULL
    ON DUPLICATE KEY UPDATE
       `State` = VALUES(`State`);

What concerns me most at this point: Rows_examined: 187711973 (how exactly is it examining that many rows?) The Product table size is ~14000 records, and the import table size is ~28000 records, and the tmpProductState is at most 60 records.

---> Another Edit:

EXPLAIN SELECT p.`ProductId` , p.`LastSeenDate` , 'Deactivated'
FROM `Product` p
LEFT JOIN `tmpImport` t
USING ( `ProductId` )
WHERE t.`ProductId` IS NULL 

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE          p       ALL     NULL            NULL    NULL            NULL    14151
1   SIMPLE          t       index   NULL            PRIMARY 100             NULL    28166   Using where; Using index; Not exists

Tables Involved:

CREATE TABLE IF NOT EXISTS `tmpImport` (
  `CategoryId`             smallint(5) unsigned NOT NULL,
  `ProcessedOn`            date DEFAULT NULL,
  `ProductId`              varchar(32) NOT NULL,
  `Title`        开发者_如何学运维          varchar(255) DEFAULT NULL,
  `Description`            text,
  `ActivateDate`           date DEFAULT NULL,
  PRIMARY KEY (`CategoryId`,`ProductId`)
) ENGINE=MyISAM DEFAULT CHARSET = UTF8

CREATE TABLE IF NOT EXISTS `tmpProductState` (
  `ProductId` VARCHAR(32) NOT NULL,
  `ChangedOn` DATE NOT NULL,
  `State` ENUM(\'Activated\',\'Deactivated\'),
  PRIMARY KEY(`ProductId`,`ChangedOn`)
) ENGINE = Memory

CREATE TABLE `Product` (
  `ProductId` varchar(32) NOT NULL,
  `Title` varchar(255) DEFAULT NULL,
  `Description` text,
  `ActivateDate` date DEFAULT NULL,
  `LastSeenDate` date DEFAULT NULL,
  PRIMARY KEY (`ProductId`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Your tables are not normalized, you don't have effective indexes, and your joins are... unusual.

I assume you can't do anything about the gross duplication of data across tables, so lets ignore that.

Looks like you are duplicating data across columns in the tables you are joining, you should use all of those columns in your join. So it should probably be:

LEFT JOIN `tmpImport` t USING (`ProductId`, `Title`, `Description`, `ActivateDate`)

Add indexes to your tables that correspond to the fields you are joining or filtering on. DO NOT have a compound key as your primary key. Instead add an auto incrementing field to be the PK, and have a unique key if you need to enforce uniqueness. Both the product table and tmpImport should have keys for each column being joined.

Hope some of these ideas help you out.

Very late reply to this, but your first query is getting all the records from tmpImport and they getting all the records from Product using the primary key on the Product table. This is pretty efficient. The second query is getting all the records from Product and then getting the matching records from tmpImport but without the benefit of any index on ProductId on tmpImport. Hence runs badly.

Add an index on ProductId to the tmpImport table (the ProductId in the primary key is ignored for this join as it is not the first column in the key and you are not using CategoryId which is the first ).

继续阅读：performance

Slow mysql query / Possible query problem

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？