SQL joins - referencing a 3rd table without getting duplicates
I'm really not sure how to even phrase the question. So I'll go by example. I currently use this SQL command to get a total of all sales:
SELECT sum(ot.value*o.currency_value)
FROM orders_total ot
LEFT JOIN orders o on o.orders_id = ot.orders_id
WHERE ot.class = 'ot_total'
AND o.date_purchased between '2010-01-01' and '2010-12-31'
And that works fine, but now I want to do a report that cross-references the above with a 3rd table (orders_products) to sum up only the orders containing a specific product id. So I tried this:
SELECT sum(ot.value*o.currency_value)
FROM orders_total ot
LEFT JOIN orders o on o.orders_id = ot.orders_id
LEFT JOIN orders_products op on ot.orders_id = op.orders_id
WHERE ot.class = 'ot_total'
AND o.date_purchased between '2010-01-01' and '2010-12-31'
AND op.products_id = 321
But that gives me a higher-than-expected total. Investigating manually, I discovered the obvious reason is that any given order can (of course) have more than one product.
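A minimal sketch that reproduces the inflated total, using Python's sqlite3 module and made-up rows. It assumes the extra rows come from an order holding more than one orders_products row matching the filter (here product 321 entered twice on order 1); that duplicate row is hypothetical, not something taken from the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orders_id INTEGER, currency_value REAL, date_purchased TEXT);
    CREATE TABLE orders_total (orders_id INTEGER, class TEXT, value REAL);
    CREATE TABLE orders_products (orders_id INTEGER, products_id INTEGER);

    INSERT INTO orders VALUES (1, 1.0, '2010-04-20');
    INSERT INTO orders_total VALUES (1, 'ot_total', 110);
    -- Hypothetical: product 321 appears on two rows of the same order.
    INSERT INTO orders_products VALUES (1, 321), (1, 321);
""")

# Same shape as the three-table query above: one output row per matching
# orders_products row, so the order total is summed once per match.
inflated_total = conn.execute("""
    SELECT SUM(ot.value * o.currency_value)
    FROM orders_total ot
    LEFT JOIN orders o ON o.orders_id = ot.orders_id
    LEFT JOIN orders_products op ON ot.orders_id = op.orders_id
    WHERE ot.class = 'ot_total'
      AND o.date_purchased BETWEEN '2010-01-01' AND '2010-12-31'
      AND op.products_id = 321
""").fetchone()[0]

print(inflated_total)  # 220.0, twice the real order total of 110
```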
I'd like to show an example but I can't do tables here it seems.
Q: How do I sum up a total value without getting duplicate records from orders matching multiple entries in the op table?
Does that make any sense at all?
Edit:
I feel like I'm sort of onto something with this:
SELECT distinct o.orders_id
FROM orders o
JOIN orders_products op on o.orders_id = op.orders_id
WHERE o.date_purchased between '2010-01-01' and '2010-12-31'
AND op.products_id = 321
That results in a listing of all orders_id represented by the product in question. So now I need to sort of inject that result into another statement summing up the values? But how??
Edit:
Here's an attempt to show my tables:
orders_total
orders_total_id  orders_id  class        value
---------------  ---------  -----------  -----
1                1          ot_sub       100
2                1          ot_shipping  10
3                1          ot_total     110
4                2          ot_sub       200
5                2          ot_shipping  10
6                2          ot_total     210
7                3          ot_sub       50
8                3          ot_shipping  5
9                3          ot_total     55
orders
orders_id  currency_value  date_purchased
---------  --------------  --------------
1          1.0000          2010-04-20
2          1.0000          2010-05-05
3          1.0000          2010-06-01
orders_products
orders_products_id  orders_id  products_id
------------------  ---------  -----------
1                   1          321
2                   2          555
3                   2          132
4                   2          321
5                   3          132
So I want an SQL statement that will give me a result of 320 (total of all orders containing product ID 321, which is orders 1 and 2 but not 3; "value" of "ot_total" for 1 is 110 and for 2 is 210. 110 + 210 = 320).
EDIT/SOLUTION:
Thanks to JNK for turning me on to EXISTS. As it turns out, this did the job nicely:
SELECT sum(ot.value*o.currency_value)
FROM orders_total ot
LEFT JOIN orders o ON o.orders_id = ot.orders_id
WHERE EXISTS (SELECT NULL
FROM orders_products op
WHERE op.products_id = 321
AND op.orders_id = o.orders_id)
AND o.date_purchased between '2010-01-01' and '2010-12-31'
AND ot.class = 'ot_total'
Use EXISTS - this is a perfect use case.
SELECT <all your fields>
FROM orders_total ot
LEFT JOIN orders o
ON o.orders_id = ot.orders_id
WHERE EXISTS (SELECT NULL
FROM orders_products op
WHERE op.products_id = xxx
AND op.orders_id = o.orders_id)
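This will do a short-circuit comparison on the subquery: it only checks whether at least one matching row exists. If the row in the outer query has a match, it gets included once, no matter how many rows in orders_products match. If not, it's not in the final result set.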
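To check this against the sample tables from the question, here is a rough sketch using Python's sqlite3 module. Only the ot_total rows are loaded (55 for order 3 is assumed to be that order's total), and one extra orders_products row is added as a hypothetical duplicate so the EXISTS behaviour is actually exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orders_id INTEGER, currency_value REAL, date_purchased TEXT);
    CREATE TABLE orders_total (orders_id INTEGER, class TEXT, value REAL);
    CREATE TABLE orders_products (orders_id INTEGER, products_id INTEGER);

    INSERT INTO orders VALUES
        (1, 1.0, '2010-04-20'), (2, 1.0, '2010-05-05'), (3, 1.0, '2010-06-01');
    -- Only the ot_total rows matter here; 55 for order 3 is assumed to be its total.
    INSERT INTO orders_total VALUES
        (1, 'ot_total', 110), (2, 'ot_total', 210), (3, 'ot_total', 55);
    INSERT INTO orders_products VALUES
        (1, 321), (2, 555), (2, 132), (2, 321), (3, 132),
        (1, 321);  -- hypothetical extra row: product 321 twice on order 1
""")

# EXISTS only asks "is there at least one matching row?", so the duplicate
# row for order 1 does not add a second copy of that order's total.
total = conn.execute("""
    SELECT SUM(ot.value * o.currency_value)
    FROM orders_total ot
    LEFT JOIN orders o ON o.orders_id = ot.orders_id
    WHERE ot.class = 'ot_total'
      AND o.date_purchased BETWEEN '2010-01-01' AND '2010-12-31'
      AND EXISTS (SELECT NULL
                  FROM orders_products op
                  WHERE op.products_id = 321
                    AND op.orders_id = o.orders_id)
""").fetchone()[0]

print(total)  # 320.0 (110 for order 1 + 210 for order 2)
```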
Do it step by step, and don't use left outer join unless you really need to - you don't here.
List the orders which included product 321:
SELECT DISTINCT Orders_ID FROM Orders_Products WHERE Products_ID = 321
List the component data for each relevant order:
SELECT OT.Orders_ID, OT.Value, O.Currency_Value
FROM Orders_Total AS OT
JOIN Orders AS O ON OT.Orders_ID = O.Orders_ID
WHERE O.Orders_ID IN (SELECT DISTINCT Orders_ID
FROM Orders_Products
WHERE Products_ID = 321)
AND OT.Class = 'ot_total'
AND O.Date_Purchased BETWEEN '2010-01-01' AND '2010-12-31'
Do the summation you want.
SELECT SUM(OT.Value * O.Currency_Value) AS TotalOrderValue
FROM Orders_Total AS OT
JOIN Orders AS O ON OT.Orders_ID = O.Orders_ID
WHERE O.Orders_ID IN (SELECT DISTINCT Orders_ID
FROM Orders_Products
WHERE Products_ID = 321)
AND OT.Class = 'ot_total'
AND O.Date_Purchased BETWEEN '2010-01-01' AND '2010-12-31'
There are other ways to write it - there almost always are many ways to write a query in SQL. The DISTINCT is not really necessary in the sub-query.
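A quick way to convince yourself that the DISTINCT makes no difference: IN only tests membership, so repeated values in the sub-query result cannot multiply outer rows. The sketch below uses Python's sqlite3 module with cut-down tables (the date filter and the orders table are left out for brevity) and a hypothetical order that lists product 321 twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_total (orders_id INTEGER, class TEXT, value REAL);
    CREATE TABLE orders_products (orders_id INTEGER, products_id INTEGER);
    INSERT INTO orders_total VALUES (1, 'ot_total', 110), (2, 'ot_total', 210);
    -- Hypothetical: order 1 lists product 321 on two rows.
    INSERT INTO orders_products VALUES (1, 321), (1, 321), (2, 321);
""")

# The same query is run twice, with and without DISTINCT in the sub-query.
query = """
    SELECT SUM(value)
    FROM orders_total
    WHERE class = 'ot_total'
      AND orders_id IN (SELECT {distinct} orders_id
                        FROM orders_products
                        WHERE products_id = 321)
"""
with_distinct = conn.execute(query.format(distinct="DISTINCT")).fetchone()[0]
without_distinct = conn.execute(query.format(distinct="")).fetchone()[0]

print(with_distinct, without_distinct)  # 320.0 320.0
```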
As long as each order only has a single row in Orders_Products for any given Products_ID and you are only interested in the orders for a single part (not a list of parts), then you can modify the SQL into a more direct triple-join instead of using the sub-select:
SELECT SUM(OT.Value * O.Currency_Value) AS TotalOrderValue
FROM Orders_Total AS OT
JOIN Orders AS O ON OT.Orders_ID = O.Orders_ID
JOIN Orders_Products AS OP ON OP.Orders_ID = O.Orders_ID
WHERE OP.Products_ID = 321
AND OT.Class = 'ot_total'
AND O.Date_Purchased BETWEEN '2010-01-01' AND '2010-12-31'
However, if you need to select the values for orders containing any of a list of parts or other changes, then you're likely to find that the sub-query notation is easier to manage.
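To illustrate that caveat, here is a sketch with Python's sqlite3 module on the question's sample rows, asking for orders containing product 321 or 132. Only the ot_total rows are loaded, and the 55 for order 3 is assumed to be that order's total. Order 2 contains both products, so the triple join counts it twice while the sub-query form does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orders_id INTEGER, currency_value REAL, date_purchased TEXT);
    CREATE TABLE orders_total (orders_id INTEGER, class TEXT, value REAL);
    CREATE TABLE orders_products (orders_id INTEGER, products_id INTEGER);

    INSERT INTO orders VALUES
        (1, 1.0, '2010-04-20'), (2, 1.0, '2010-05-05'), (3, 1.0, '2010-06-01');
    -- Only the ot_total rows; 55 for order 3 is assumed to be its total.
    INSERT INTO orders_total VALUES
        (1, 'ot_total', 110), (2, 'ot_total', 210), (3, 'ot_total', 55);
    INSERT INTO orders_products VALUES
        (1, 321), (2, 555), (2, 132), (2, 321), (3, 132);
""")

common = """
    SELECT SUM(ot.value * o.currency_value)
    FROM orders_total ot
    JOIN orders o ON ot.orders_id = o.orders_id
"""
filters = """
      AND ot.class = 'ot_total'
      AND o.date_purchased BETWEEN '2010-01-01' AND '2010-12-31'
"""

# Sub-query form: each order is counted once if it holds any listed product.
subquery_total = conn.execute(
    common
    + " WHERE o.orders_id IN (SELECT orders_id FROM orders_products"
    + "                       WHERE products_id IN (321, 132))"
    + filters
).fetchone()[0]

# Triple join: one row per matching product, so order 2 is summed twice.
join_total = conn.execute(
    common
    + " JOIN orders_products op ON op.orders_id = o.orders_id"
    + " WHERE op.products_id IN (321, 132)"
    + filters
).fetchone()[0]

print(subquery_total, join_total)  # 375.0 585.0
```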
It's not clear to me where the details that make up orders_total.value live in your tables. Usually you have something like this (assuming orders_products carries qty and price columns):
select sum(op.qty*op.price) as Total
from orders o
JOIN orders_products op on o.orders_id = op.orders_id
where o.date_purchased between '2010-01-01' and '2010-12-31 23:59'
and op.products_id = 321
Or a full report like:
select p.ProductName, totals.Total
from (
select op.products_id, sum(op.qty*op.price) as Total
from orders o
join orders_products op on o.orders_id = op.orders_id
where o.date_purchased between '2010-01-01' and '2010-12-31 23:59'
group by op.products_id) totals
join Products p
on totals.products_id = p.products_id