SQL query to show difference from the same table
My application has a table that contains snapshot inventory data for each year. For example, there's a vehicle inventory table with the typical columns vehicle_id, vehicle_plate_num, vehicle_year, vehicle_make, etc., plus a column recording the year in which the vehicle was owned.
Querying the entire table might result in something like this:
Id  Plate Num  Year  Make    Model     Color  Year Owned
---------------------------------------------------------
1   AAA555     2008  Toyota  Camry     blue   2009
2   BBB666     2007  Honda   Accord    black  2009
3   CCC777     1995  Nissan  Altima    white  2009
4   AAA555     2008  Toyota  Camry     blue   2010
5   BBB666     2007  Honda   Accord    black  2010
6   DDD888     2010  Ford    Explorer  white  2010
(For better or worse, this table already exists and redesigning it is not an option; that's a topic for another question.) As you can see, most of the vehicles stay in the inventory year after year, but old ones are always being disposed of and new ones acquired. In the example above, the 1995 Nissan Altima was in the 2009 inventory but is no longer in the 2010 inventory. The 2010 inventory has a new 2010 Ford Explorer.
How can I build an efficient query that takes any two years and shows only the difference? For example, if I pass in 2009, 2010, the query should return
3 CCC777 1995 Nissan Altima white 2009
If I pass in 2010, 2009, the query should return
6 DDD888 2010 Ford Explorer white 2010
Edit: I should have added this as a comment on the answer from Kyle B., but the comment text area is not very user-friendly:
I didn't think it would be this tough, but it seems to be.
Anyway, wouldn't you need a sub-select around the above, like this:
select q.* from (
    select f.*
    from inventory f
    left join inventory s
        on (f.plate_num = s.plate_num
            and f.year_owned = :first_year
            and s.year_owned = :second_year)
    where s.plate_num is null
) q
where q.year_owned = :first_year
You want a self-outer join
It looks like you want the asymmetric difference. If you wanted the symmetric difference, you'd use a full outer join instead of a left (or right) outer join.
With variables :first_year and :second_year
select f.*
from inventory f
left join inventory s
    on (f.plate_num = s.plate_num
        and s.year_owned = :second_year)
where s.plate_num is null
    and f.year_owned = :first_year
Note that the condition on s.year_owned has to be inside the join condition, so that the database returns a null row when there's no match, instead of finding a match that later gets removed by filtering.
Edit: Adjusted query slightly. This doesn't require a sub-select. Tested with PostgreSQL.
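To make the point about the join condition concrete, here is a minimal sketch that runs the left-join query against the sample rows from the question. It assumes an in-memory SQLite database driven from Python, with underscore-style bind names; the helper `difference` and the constant names are made up for illustration and are not part of the answer above.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table inventory (id integer primary key, plate_num text, "
    "year integer, make text, model text, color text, year_owned integer)"
)
conn.executemany(
    "insert into inventory values (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "AAA555", 2008, "Toyota", "Camry", "blue", 2009),
        (2, "BBB666", 2007, "Honda", "Accord", "black", 2009),
        (3, "CCC777", 1995, "Nissan", "Altima", "white", 2009),
        (4, "AAA555", 2008, "Toyota", "Camry", "blue", 2010),
        (5, "BBB666", 2007, "Honda", "Accord", "black", 2010),
        (6, "DDD888", 2010, "Ford", "Explorer", "white", 2010),
    ],
)

# The second-year test lives in the ON clause, so unmatched rows of f
# survive the join with NULLs in every column of s.
ANTI_JOIN = """
select f.*
from inventory f
left join inventory s
    on (f.plate_num = s.plate_num
        and s.year_owned = :second_year)
where s.plate_num is null
    and f.year_owned = :first_year
"""

# Same query with the second-year test moved to WHERE. It can never be
# true together with "s.plate_num is null", so nothing is returned.
FILTER_IN_WHERE = """
select f.*
from inventory f
left join inventory s
    on (f.plate_num = s.plate_num)
where s.plate_num is null
    and s.year_owned = :second_year
    and f.year_owned = :first_year
"""


def difference(first_year, second_year):
    """Rows owned in first_year whose plate is absent in second_year."""
    params = {"first_year": first_year, "second_year": second_year}
    return conn.execute(ANTI_JOIN, params).fetchall()


wrong = conn.execute(
    FILTER_IN_WHERE, {"first_year": 2009, "second_year": 2010}
).fetchall()

print(difference(2009, 2010))  # the Nissan Altima row
print(difference(2010, 2009))  # the Ford Explorer row
print(wrong)                   # empty list
```

Swapping the two arguments gives the other direction of the difference, which matches the two cases asked for in the question.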
select a.id, a.platenum, a.year, a.make, a.model, a.color, b.yearowned
from inventory a
join inventory b on a.platenum=b.platenum
where a.yearowned=___ and b.yearowned=___;
Edit: oops, I misunderstood. How do I delete my answer?
This query will select all cars from 2010 that did not exist in the table in previous years.
select *
from cars
where year_owned = 2010
    and plate not in (
        select plate
        from cars
        where year_owned < 2010);
Using this structure, it should be obvious how to rearrange it to produce the cars that no longer exist in 2010.
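One possible rearrangement, sketched here with both years passed as parameters so it works in either direction. This assumes the `cars` table and `plate` column named in this answer, trimmed to three columns, and runs on an in-memory SQLite database from Python. The `run` helper and the bind names `:in_year` and `:not_in_year` are invented for the example. A `not exists` variant is included because `not in` returns no rows at all if the sub-select ever yields a null plate.

```python
import sqlite3

# Hypothetical, trimmed-down table: only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table cars (id integer primary key, plate text, year_owned integer)"
)
conn.executemany(
    "insert into cars values (?, ?, ?)",
    [
        (1, "AAA555", 2009),
        (2, "BBB666", 2009),
        (3, "CCC777", 2009),
        (4, "AAA555", 2010),
        (5, "BBB666", 2010),
        (6, "DDD888", 2010),
    ],
)

# Cars present in :in_year whose plate does not appear in :not_in_year.
NOT_IN = """
select *
from cars
where year_owned = :in_year
    and plate not in (
        select plate
        from cars
        where year_owned = :not_in_year)
order by id
"""

# Equivalent form that stays correct even if some plate values are null.
NOT_EXISTS = """
select c.*
from cars c
where c.year_owned = :in_year
    and not exists (
        select 1
        from cars o
        where o.plate = c.plate
            and o.year_owned = :not_in_year)
order by c.id
"""


def run(sql, in_year, not_in_year):
    """Execute one of the queries above for a pair of years."""
    params = {"in_year": in_year, "not_in_year": not_in_year}
    return conn.execute(sql, params).fetchall()


dropped = run(NOT_IN, 2009, 2010)    # in 2009 but gone in 2010
added = run(NOT_EXISTS, 2010, 2009)  # in 2010 but not in 2009
print(dropped, added)
```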
I'm not sure how efficient this idea will be, but you can probably use the SQL EXCEPT operator. This is just a sample; it won't return the complete row you want, but you'll get the idea:
select plate, name from inventory where year_owned=2009
except
select plate, name from inventory where year_owned=2010
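One way to get the complete row back, as a rough sketch: apply EXCEPT to the plate column only, then select the full rows whose plate is in that result. The example below assumes the `plate_num` and `year_owned` column names from the question rather than the `plate, name` columns used above, and runs on an in-memory SQLite database from Python. The helper `except_difference` is hypothetical, and whether this performs well on a large table is not something the sketch measures.

```python
import sqlite3

# Hypothetical, trimmed-down version of the inventory table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table inventory (id integer primary key, plate_num text, "
    "make text, model text, year_owned integer)"
)
conn.executemany(
    "insert into inventory values (?, ?, ?, ?, ?)",
    [
        (1, "AAA555", "Toyota", "Camry", 2009),
        (2, "BBB666", "Honda", "Accord", 2009),
        (3, "CCC777", "Nissan", "Altima", 2009),
        (4, "AAA555", "Toyota", "Camry", 2010),
        (5, "BBB666", "Honda", "Accord", 2010),
        (6, "DDD888", "Ford", "Explorer", 2010),
    ],
)

# EXCEPT compares plates only; the outer query fetches the whole row.
FULL_ROWS_VIA_EXCEPT = """
select i.*
from inventory i
where i.year_owned = :first_year
    and i.plate_num in (
        select plate_num from inventory where year_owned = :first_year
        except
        select plate_num from inventory where year_owned = :second_year)
order by i.id
"""


def except_difference(first_year, second_year):
    """Full rows owned in first_year whose plate is missing in second_year."""
    params = {"first_year": first_year, "second_year": second_year}
    return conn.execute(FULL_ROWS_VIA_EXCEPT, params).fetchall()


print(except_difference(2009, 2010))  # the Nissan Altima row
print(except_difference(2010, 2009))  # the Ford Explorer row
```

Comparing on the plate alone matters here: if EXCEPT were applied to whole rows, the differing id and year_owned values would make every row look unique.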
I think Kyle Butt gave an almost perfect answer. He got me 90% of the way.
Here's the answer:
Query all vehicles that are in 2010's but NOT in 2009's inventory:
select q.* from (
    select f.* from inventory f
    left join inventory s
        on (f.plate_num = s.plate_num
            and f.year_owned = 2010
            and s.year_owned = 2009)
    where s.plate_num is null
) q
where q.year_owned = 2010
Query all vehicles that are in 2009's but NOT in 2010's inventory:
select q.* from (
    select f.* from inventory f
    left join inventory s
        on (f.plate_num = s.plate_num
            and f.year_owned = 2009
            and s.year_owned = 2010)
    where s.plate_num is null
) q
where q.year_owned = 2009
- Note the sub-query
- Runs fairly fast for 100,000+ records.