Problem with predicte pushing in a Max and Min statistic about each user in a group
I have two table开发者_JS百科s in an ORACLE 10g DB:
The table USERS with columns: UserId | GroupId | Other details where there's 1 record for each User, total 400, and there c.a. 1-10 users in each Group.
and the table USAGE with columns: UserId | Date | Amount where there's 10000 rows for each user. (so 400*10000 rows)
I have the Users table indexed on UserId, GroupId and (UserId,GroupId) and the table Usage is indexed on (UserId,Date)
I want to see the Max and Total amount of usage for each user in a given group. Something like this:
select User.UserId, maxAmount, Total from
Users
JOIN
( select UserId,max(Amount) as maxAmount from USAGE group by UserId ) A
ON User.UserId = A.UserId
JOIN
( select UserId,min(Amount) as minAmount from USAGE group by UserId ) B
ON User.UserId = B.UserId
WHERE User.GroupId = 'some_group_id'
But it is very slow (approx 20 sec), but when I do them separately (just Max and just Min, but not both at once), like this:
select User.UserId, maxAmount, minAmount from
Users
JOIN
( select UserId,max(Amount) as maxAmount from USAGE group by UserId ) A
ON User.UserId = A.UserId
WHERE User.GroupId = 'some_group_id'
it runs in a flash!
It just makes no sense that I can join them separately very fast, but when I join them all three it's very slow.
Here are the plans, where I created the inline views in the preceeding statements as actual views:
create view usage_min as select user_id, min(amount) from usage group by user_id;
create view usage_max as select user_id, max(amount) from usage group by user_id;
explain plan for select * from usage_min join users using(user_id) where group_id='1212882339';
select * from table(dbms_xplan.display);
Plan hash value: 3874246446
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 139 | 1162 (1)| 00:00:14 |
| 1 | HASH GROUP BY | | 1 | 139 | 1162 (1)| 00:00:14 |
| 2 | TABLE ACCESS BY INDEX ROWID | USAGE | 7977 | 116K| 1157 (1)| 00:00:14 |
| 3 | NESTED LOOPS | | 11085 | 1504K| 1160 (1)| 00:00:14 |
| 4 | MAT_VIEW ACCESS BY INDEX ROWID| USERS | 1 | 124 | 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | USERS_KT | 1 | | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | SYS_C0099818 | 7977 | | 79 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("USERS"."group_id"='1212882339')
6 - access("user_id"="USERS"."user_id")
explain plan for select * from users join usage_max using(user_id) where group_id='1212882339';
select * from table(dbms_xplan.display);
Plan hash value: 2384977958
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 145 | 519 (2)| 00:00:07 |
| 1 | HASH GROUP BY | | 1 | 145 | 519 (2)| 00:00:07 |
| 2 | TABLE ACCESS BY INDEX ROWID | USAGE2 | 18251 | 374K| 512 (1)| 00:00:07 |
| 3 | NESTED LOOPS | | 25362 | 3591K| 515 (1)| 00:00:07 |
| 4 | MAT_VIEW ACCESS BY INDEX ROWID| USERS | 1 | 124 | 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | USERS_KT | 1 | | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | T2_user_id | 18251 | | 25 (4)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("USERS"."group_id"='1212882339')
6 - access("USERS"."user_id"="user_id")
explain plan for select * from users join usage_max using(user_id) join usage_min using(user_id) where group_id='1212882339';
select * from table(dbms_xplan.display);
Plan hash value: 3190011991
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 16600 | 6782 (19)| 00:01:22 |
|* 1 | HASH JOIN | | 100 | 16600 | 6782 (19)| 00:01:22 |
| 2 | MERGE JOIN | | 100 | 14500 | 3176 (19)| 00:00:39 |
| 3 | SORT JOIN | | 2920K| 58M| 3172 (19)| 00:00:39 |
| 4 | VIEW | USAGE_MAX | 2920K| 58M| 3172 (19)| 00:00:39 |
| 5 | HASH GROUP BY | | 2920K| 58M| 3172 (19)| 00:00:39 |
| 6 | TABLE ACCESS FULL | USAGE2 | 2920K| 58M| 2686 (5)| 00:00:33 |
|* 7 | SORT JOIN | | 1 | 124 | 4 (25)| 00:00:01 |
| 8 | MAT_VIEW ACCESS BY INDEX ROWID| USERS | 1 | 124 | 3 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | USERS_KT | 1 | | 1 (0)| 00:00:01 |
| 10 | VIEW | USAGE_MIN | 398 | 8358 | 3605 (19)| 00:00:44 |
| 11 | HASH GROUP BY | | 398 | 5970 | 3605 (19)| 00:00:44 |
| 12 | TABLE ACCESS FULL | USAGE | 3174K| 45M| 3073 (4)| 00:00:37 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("USAGE_MAX"."user_id"="USAGE_MIN"."user_id")
7 - access("USERS"."user_id"="USAGE_MAX"."user_id")
filter("USERS"."user_id"="USAGE_MAX"."user_id")
9 - access("USERS"."group_id"='1212882339')
- What are the query plans?
- Are you measuring the time to fetch the first row? Or are you measuring the time to fetch the last row?
At a minimum, I'd want to hit the USAGE
table just once using something like
select User.UserId, maxAmount, Total from
Users
JOIN
( select UserId,
max(Amount) as maxAmount,
sum(Amount) as Total
from USAGE group by UserId ) A
ON User.UserId = A.UserId
WHERE User.GroupId = 'some_group_id'
or even more simply
SELECT user.userId,
max(usage.Amount) maxAmount,
sum(usage.Amount) total
FROM user
join usage on (user.userId = usage.userId)
WHERE user.GroupId = 'some_group_id'
GROUP BY user.userId
精彩评论