What is a fast way of joining two tables and using the first table column to "filter" the second table?
I am trying to develop a SQL Server 2005 query but I'm being unsuccessful at the moment. I trying every different approach that I know, like derived tables, sub-queries, CTE's, etc, but I couldn't solve the problem. I won't post the queries I tried here because they involve many other columns and tables, but I will try to explain the problem with a simpler example:
There are two tables:
PARTS_SOLD
andPARTS_PURCHASED
. The first contains products that were sold to customers, and the second contains products that were purchased from suppliers. Both tables contains a foreign key associated with the movement itself, that contains the dates, etc.Here is the simplified schema:
Table PARTS_SOLD
:
part_id
date
- other columns
Table PARTS_PURCHASED
part_id
date
other columns
What I need is to join every row in
PARTS_SOLD
with a unique row fromPARTS_PURCHASED
, chose bypart_id
and the maximum "date
", where the "date
" is equal of before the "date
" column fromPARTS_PURCHASED
. In other words, I need to collect some information from the last purchase event for the item for every event of selling this item.
The problem itself is that I didn't find a way of joining the PARTS_PURCHASED
table with PARTS_SOLD
table using the column "date
" from PARTS_SOLD
to limit the MAX(date)
of the PART开发者_Python百科S_PURCHASED
table.
I could have done this with a cursor to solve the problem with the tools I know, but every table has millions of rows, and perhaps using cursors or sub-queries that evaluate a query for every row would make the process very slow.
You aren't going to like my answer. Your database is designed incorrectly which is why you can't get the data back out the way you want. Even using a cursor, you would not get good data from this. Assume that you purchased 5 of part 1 on May 31, 2010. Assume on June 1, you sold ten of part 1. Matching just on date, you would match all ten to the May 31 purchase even though that is clearly not correct, some parts might have been purchased on May 23 and some may have been purchased on July 19, 2008.
If you want to know which purchased part relates to which sold part, your database design should include the PartPurchasedID as part of the PartsSold record and this should be populated at the time of the purchase, not later for reporting when you have 1,000,000 records to sort through.
Perhaps the following would help:
SELECT S.*
FROM PARTS_SOLD S
INNER JOIN (SELECT PART_ID, MAX(DATE)
FROM PARTS_PURCHASED
GROUP BY PART_ID) D
ON (D.PART_ID = S.PART_ID)
WHERE D.DATE <= S.DATE
Share and enjoy.
I'll toss this out there, but it's likely to contain all kinds of mistakes... both because I'm not sure I understand your question and because my SQL is... weak at best. That being said, my thought would be to try something like:
SELECT * FROM PARTS_SOLD
INNER JOIN (SELECT part_id, max(date) AS max_date
FROM PARTS_PURCHASED
GROUP BY part_id) AS subtable
ON PARTS_SOLD.part_id = subtable.part_id
AND PARTS_SOLD.date < subtable.max_date
精彩评论