How do I select min/max dates from table 2 based on date in table 1 (without getting too much data from sums)
Related to a question I asked earlier here, I've found a problem which is eluding me (obviously).
The original question was how to select a min and max date from a daily table based on a monthly table where some daily table dates could be missing. Basically what I needed was columns containing the month date (always the first), the earliest date for that month in the daily table and the latest date for that month in the daily table.
So, if the last week of January and first week of February were missing from the daily table (and we otherwise had all the dates for January and February but no more), I needed:
MonthStart DayFirst DayLast
---------- ---------- ----------
2009-01-01 2009-01-01 2009-01-24
开发者_Go百科2009-02-01 2009-02-08 2009-02-28
The answer was:
select
m.date as m1,
min(d.date) as m2,
max(d.date) as m3
from monthly m
join daily d
on month(d.date) = month(m.date)
and year(d.date) = year(m.date)
group by m.date
order by m.date
which worked for the specs I gave.
Unfortunately, reality bites, and there are multiple records in the monthly table (and daily table) with the same date. Specifically:
- the dates are
2007-10-16
thru2007-10-30
(15 days),2007-11-01
thru2007-11-30
(30 days) and2007-12-01
thru2007-12-15
(15 days). - each date has six rows in both tables (because they each have a row for three system names and two periods.
The problem is that I sum()
a field in the monthly table and the new query is getting values that are much too large (compared to the previous query which did not have the join).
The aggregation changes the query to be:
select
m.date as m1,
sum(m.other_field), -- added this
min(d.date) as m2,
max(d.date) as m3
from monthly m
join daily d
on month(d.date) = month(m.date)
and year(d.date) = year(m.date)
group by m.date
order by m.date
I think the values are too high due to cross-joining going on since the figures for each month are out by a constant factor, depending on the number of days in the daily table for that month.
My question is this: how do I aggregate the field in the monthly table without that factor coming into play and still get the min/max dates from the daily table for that month?
If the monthly
table contains a single entry for each month, you can do simply this:
select
m.date as m1,
m.other_field,
min(d.date) as m2,
max(d.date) as m3
from monthly m
join daily d
on month(d.date) = month(m.date)
and year(d.date) = year(m.date)
group by m.date, m.other_field
order by m.date
otherwise:
select m1, sum(other_field), m2, m3
from (
select
m.date as m1,
m.other_field,
min(d.date) as m2,
max(d.date) as m3
from monthly m
join daily d
on month(d.date) = month(m.date)
and year(d.date) = year(m.date)
group by m.date, m.other_field) A
group by A.m1, A.m2, A.m3
order by A.m1
Update from pax: Try as I might, I could not get the join
solutions working properly - they all seemed to return the same wrong data as the original. In the end, I opted for a non-join
solution since it worked and performance wasn't a big issue, since the tables typically have 24 rows (for monthly) and 700 rows (for daily). I'm editing this answer and accepting it since (1) it actually helped a great deal in getting the correct solution for me; and (2) I'm loathe to write my own answer and claim the glory for myself.
Thanks for all your help. The following is what worked for me:
select
m.date as p1,
m.grouping_field as p2,
sum(m.aggregating_field) as p3,
(select min(date) from daily
where month(date) = month(m.date)
and year(date) = year(m.date)) as p4,
(select max(date) from daily
where month(date) = month(m.date)
and year(date) = year(m.date)) as p5
from
monthly m
group by
m.date, m.grouping_field
which gave me what I wanted:
P1 P2 P3 P4 P5
---------- ---- ---- ---------- ----------
2007-10-01 BoxA 12.3 2007-10-16 2007-10-30
2007-10-01 BoxB 13.6 2007-10-16 2007-10-30
2007-10-01 BoxC 7.4 2007-10-16 2007-10-30
2007-11-01 BoxA 20.3 2007-11-01 2007-11-30
2007-11-01 BoxB 24.2 2007-11-01 2007-11-30
2007-11-01 BoxC 21.7 2007-11-01 2007-11-30
2007-12-01 BoxA 6.9 2007-12-01 2007-12-15
2007-12-01 BoxB 6.4 2007-12-01 2007-12-15
2007-12-01 BoxC 6.9 2007-12-01 2007-12-15
You can group the months in a subquery:
select
m.mindate as m1,
m.sum_other_field,
min(d.date) as m2,
max(d.date) as m3
from (
select
month(date) as month,
year(date) as year,
sum(other_field) sum_other_field,
min(date) mindate
from monthly
group by month(date), year(date)
) m
join daily d
on month(d.date) = m.month
and year(d.date) = m.year
group by m.month
order by m.year
精彩评论