SELECT clause using IN ... very slow?
Could you guys please review the following query to an Oracle DB and point out what's wrong:
SELECT t1.name FROM t1, t2 WHERE t1.id = t2.id AND t2.empno IN (1, 2, 3, …, 200)
Query statistics:
- Time taken: 10.53 seconds.
Indices:
t2.empno is indexed.
t1.id is indexed.
t2.id is indexed.
Update
The query above was only a simplified sample of the query I actually use. Here it is in a form closer to the real one:
Explain Plan
Query:
SELECT
PRODUCT_REPRESENTATION_SK
FROM
Product_Representation pr
, Design_Object do
, Files files
,EPS_STATUS epsStatus
,EPS_ERROR_CODES epsError
,VIEW_TYPE viewTable
WHERE
pr.DESIGN_OBJECT_SK = do.DESIGN_OBJECT_SK
AND pr.LAYER_NAME !='Layer 0'
AND epsStatus.EPS_STATUS_SK = pr.EPS_STATUS
AND epsError.EPS_ERROR_CODE = pr.EPS_ERROR_CODE
AND viewTable.VIEW_TYPE_ID = pr.VIEW_TYPE_ID
AND files.pim_id = do.PIM_ID
AND do.DESIGN_OBJECT_ID IN
(
147086,149924,140458,135068,145197,134774,141837,138568,141731,138772,143769,141739,149113,148809,141072,141732,143974,147076,143972,141078,141925,134643,139701,141729,147078,139120,137097,147072,138261,149700,149701,139127,147070,149702,136766,146829,135762,140155,148459,138061,138762............................................. 200 such numbers
)
Indexed columns:
pr.DESIGN_OBJECT_SK
do.DESIGN_OBJECT_SK
do.DESIGN_OBJECT_ID
files.pim_id
Table
CREATE TABLE "PIM"."DESIGN_OBJECT"
(
"DESIGN_OBJECT_SK" NUMBER(*,0) NOT NULL ENABLE,
"PIM_ID" NUMBER(*,0) NOT NULL ENABLE,
"DESIGN_OBJECT_TYPE_SK" NUMBER(*,0) NOT NULL ENABLE,
"DESIGN_OBJECT_ID" VARCHAR2(40 BYTE) NOT NULL ENABLE,
"DIVISION_CD" NUMBER(*,0),
"STAT_IND" NUMBER(*,0) NOT NULL ENABLE,
"STAT_CHNG_TMST" TIMESTAMP (6),
"CRTD_BY" VARCHAR2(45 BYTE),
"CRT_TMST" TIMESTAMP (6),
"MDFD_BY" VARCHAR2(45 BYTE),
"CHNG_TMST" TIMESTAMP (6),
"UPDATE_CNT" NUMBER(*,0),
"GENDER" VARCHAR2(1 BYTE),
PRIMARY KEY ("DESIGN_OBJECT_SK")
)
TABLESPACE "PIM" ENABLE,
FOREIGN KEY ("DESIGN_OBJECT_TYPE_SK")
REFERENCES "PIM"."DESIGN_OBJECT_TYPE" ("DESIGN_OBJECT_TYPE_SK")
ON DELETE CASCADE ENABLE,
FOREIGN KEY ("PIM_ID")
REFERENCES "PIM"."FILES" ("PIM_ID")
ON DELETE CASCADE ENABLE
)
Table 2
CREATE TABLE "PIM"."PRODUCT_REPRESENTATION"
(
"PRODUCT_REPRESENTATION_SK" NUMBER(*,0) NOT NULL ENABLE,
"DESIGN_OBJECT_SK" NUMBER(*,0) NOT NULL ENABLE,
"VIEW_TYPE_ID" NUMBER(*,0) NOT NULL ENABLE,
"LAYER_NAME" VARCHAR2(255 BYTE),
"STAT_IND" NUMBER(*,0) NOT NULL ENABLE,
"STAT_CHNG_TMST" TIMESTAMP (6),
"CRTD_BY" VARCHAR2(45 BYTE),
"CRT_TMST" TIMESTAMP (6),
"MDFD_BY" VARCHAR2(45 BYTE),
"CHNG_TMST" TIMESTAMP (6),
"UPDATE_CNT" NUMBER(*,0),
"EPS_STATUS" VARCHAR2(30 BYTE) NOT NULL ENABLE,
"EPS_GENERATED_TIME" TIMESTAMP (6),
"EPS_ERROR_CODE" NUMBER,
"EPS_ERROR_DETAILS" VARCHAR2(500 BYTE),
"DEEPSERVER_ASSET_LAYER_ID" VARCHAR2(255 BYTE),
"PRODUCT_REPRESENTATION_LOC" VARCHAR2(255 BYTE),
PRIMARY KEY ("PRODUCT_REPRESENTATION_SK")
)
TABLESPACE "PIM" ENABLE,
FOREIGN KEY ("DESIGN_OBJECT_SK")
REFERENCES "PIM"."DESIGN_OBJECT" ("DESIGN_OBJECT_SK")
ON DELETE CASCADE ENABLE,
FOREIGN KEY ("VIEW_TYPE_ID")
REFERENCES "PIM"."VIEW_TYPE" ("VIEW_TYPE_ID")
ON DELETE CASCADE ENABLE,
CONSTRAINT "EPS_ERROR_CODE_FK"
FOREIGN KEY ("EPS_ERROR_CODE")
REFERENCES "PIM"."EPS_ERROR_CODES" ("EPS_ERROR_CODE")
ON DELETE CASCADE ENABLE,
CONSTRAINT "EPS_STATUS_FK"
FOREIGN KEY ("EPS_STATUS")
REFERENCES "PIM"."EPS_STATUS" ("EPS_STATUS_SK")
ON DELETE CASCADE ENABLE
)
Let's forget for a moment the empno BETWEEN 1 AND 200 suggestion and assume that you have t2.empno IN (3,7,...,5209) (200 entries).
You could also rewrite your query (which is an implicit JOIN) as an EXISTS query. The two are not strictly equivalent: the EXISTS version returns each t1 row at most once, so it can return fewer rows than the JOIN when several t2 rows match the same t1 row. Otherwise it shows the same results, and it may be faster than the JOIN:
SELECT
t1.name
FROM
t1
WHERE EXISTS
( SELECT *
FROM t2
WHERE t2.id = t1.id
AND t2.empno IN (3,7,...,5209)
)
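To make the "possibly fewer rows" remark concrete, here is a minimal sketch with made-up sample data (the column types and the rows are assumptions, not taken from the question). One t1 row matches two t2 rows, so the two forms return different row counts:

```sql
-- Hypothetical sample data, only to show the row-count difference.
CREATE TABLE t1 (id NUMBER PRIMARY KEY, name VARCHAR2(20));
CREATE TABLE t2 (id NUMBER, empno NUMBER);

INSERT INTO t1 VALUES (1, 'Ann');
INSERT INTO t2 VALUES (1, 3);
INSERT INTO t2 VALUES (1, 7);

-- JOIN form: one output row per matching (t1, t2) pair,
-- so 'Ann' appears twice.
SELECT t1.name
FROM   t1
JOIN   t2 ON t2.id = t1.id
WHERE  t2.empno IN (3, 7);

-- EXISTS form: one output row per t1 row with at least one match,
-- so 'Ann' appears once.
SELECT t1.name
FROM   t1
WHERE  EXISTS ( SELECT 1
                FROM   t2
                WHERE  t2.id = t1.id
                AND    t2.empno IN (3, 7) );
```

If t2.id is unique for the rows you select, both forms return the same rows.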
(Wild speculation)
If, on the other hand, it's not even t2.empno IN (3,7,...,5209) but t2.empno IN (SELECT tx.empno FROM tx WHERE someConditions) and you are using MySQL, then this is the root of your problem (MySQL is known not to handle field IN (SELECT f FROM x) in the best possible way). So, you could change the query into:
SELECT
t1.name
FROM
t1
JOIN t2
ON t2.id = t1.id
JOIN tx
ON tx.empno = t2.empno
WHERE
someConditions
or even to:
SELECT
t1.name
FROM
t1
WHERE EXISTS
( SELECT *
FROM t2
JOIN tx
ON tx.empno = t2.empno
WHERE t2.id = t1.id
AND someConditions
)
The first thing that is wrong is using implicit join syntax. That is a SQL antipattern.
If you have a large list in the IN clause, have you tried putting them in a table instead and using a join?
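A rough sketch of that idea for Oracle, using a global temporary table. The table name empno_filter and its column type are assumptions for illustration, not part of the original schema:

```sql
-- empno_filter is a hypothetical name; adjust the type to match t2.empno.
CREATE GLOBAL TEMPORARY TABLE empno_filter (
    empno NUMBER PRIMARY KEY
) ON COMMIT DELETE ROWS;

-- Load the values that used to be in the IN list, one row per value.
INSERT INTO empno_filter (empno) VALUES (1);
INSERT INTO empno_filter (empno) VALUES (2);
-- ... remaining values

-- Join against the list instead of using IN (...).
SELECT t1.name
FROM   t1
JOIN   t2           ON t2.id   = t1.id
JOIN   empno_filter f ON f.empno = t2.empno;
```

Because of ON COMMIT DELETE ROWS, the inserts and the query have to run in the same transaction. The primary key keeps the list free of duplicates, so the extra join does not multiply rows.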
What database? Have you looked at your explain plan or execution plan to see where the slowdown is?
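Since the question mentions Oracle, one way to get the plan is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. A minimal sketch against the sample query (the IN list is shortened here purely for illustration):

```sql
-- Store the optimizer's plan for the statement in the plan table.
EXPLAIN PLAN FOR
SELECT t1.name
FROM   t1, t2
WHERE  t1.id = t2.id
AND    t2.empno IN (1, 2, 3);  -- shortened list, illustration only

-- Display the plan that was just stored.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Full table scans on the large tables, or a join step with an unexpectedly high row estimate, are the usual places to start looking.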
Don't use the cross join.
Try this:
SELECT
t1.name
FROM
t1
JOIN t2
ON t2.id = t1.id
WHERE
t2.empno IN (1,...,200)
EDIT: After your edit, seeing the multiple tables in the Cartesian product, it is probably very important that you use proper JOIN syntax.
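As a sketch, the real query from the question could be written with explicit JOINs like this. It assumes PRODUCT_REPRESENTATION_SK comes from the pr table, as the posted DDL suggests, and the IN list stays abbreviated exactly as in the question:

```sql
SELECT pr.PRODUCT_REPRESENTATION_SK
FROM   Product_Representation pr
JOIN   Design_Object   do        ON do.DESIGN_OBJECT_SK     = pr.DESIGN_OBJECT_SK
JOIN   Files           files     ON files.pim_id            = do.PIM_ID
JOIN   EPS_STATUS      epsStatus ON epsStatus.EPS_STATUS_SK = pr.EPS_STATUS
JOIN   EPS_ERROR_CODES epsError  ON epsError.EPS_ERROR_CODE = pr.EPS_ERROR_CODE
JOIN   VIEW_TYPE       viewTable ON viewTable.VIEW_TYPE_ID  = pr.VIEW_TYPE_ID
WHERE  pr.LAYER_NAME != 'Layer 0'
AND    do.DESIGN_OBJECT_ID IN (147086, 149924, 140458 /* ... rest of the list as in the question */);
```

This should return the same rows as the original. It mainly makes each join condition visible next to its table, so a missing condition (an accidental Cartesian product) is much easier to spot.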