How to write a database view that expands data into multiple rows?
I have a database table that contains collection data for product collected from a supplier and I need to produce an esti开发者_JAVA技巧mate of month-to-date production figures for that supplier using an Oracle SQL query. Each day can have multiple collections, and each collection can contain product produced across multiple days.
Here's an example of the raw collection data:
Date Volume ColectionNumber ProductionDays 2011-08-22 500 1 2 2011-08-22 200 2 2 2011-08-20 600 1 2
Creating a month-to-date estimate is tricky because the first day of the month may have a collection for two days worth of production. Only a portion of that collected volume is actually attributable to the current month.
How can I write a query to produce this estimate?
My gut feeling is that I should be able to create a database view that transforms the raw data into estimated daily production figures by summing collections on the same day and distributing collection volumes across the number of days they were produced on. This would allow me to write a simple query to find the month-to-date production figure.
Here's what the above collection data would look like after being transformed into estimated daily production figures:
Date VolumeEstimate 2011-08-22 350 2011-08-21 350 2011-08-20 300 2011-08-19 300
Am I on the right track? If so, how can this be implemented? I have absolutely no idea how to do this type of transformation in SQL. If not, what is a better approach?
Note: I cannot do this calculation in application code since that would require a significant code change which we can't afford.
try
CREATE TABLE TableA (ProdDate DATE, Volume NUMBER, CollectionNumber NUMBER, ProductionDays NUMBER);
INSERT INTO TableA VALUES (TO_DATE ('20110822', 'YYYYMMDD'), 500, 1, 2);
INSERT INTO TableA VALUES (TO_DATE ('20110822', 'YYYYMMDD'), 200, 2, 2);
INSERT INTO TableA VALUES (TO_DATE ('20110820', 'YYYYMMDD'), 600, 1, 2);
COMMIT;
CREATE VIEW DailyProdVolEst AS
SELECT DateList.TheDate, SUM (DateRangeSums.DailySum) VolumeEstimate FROM
(
SELECT ProdStart, ProdEnd, SUM (DailyProduction) DailySum
FROM
(
SELECT (ProdDate - ProductionDays + 1) ProdStart, ProdDate ProdEnd, CollectionNumber, VolumeSum/ProductionDays DailyProduction
FROM
(
Select ProdDate, CollectionNumber, ProductionDays, Sum (Volume) VolumeSum FROM TableA
GROUP BY ProdDate, CollectionNumber, ProductionDays
)
)
GROUP BY ProdStart, ProdEnd
) DateRangeSums,
(
SELECT A.MinD + MyList.L TheDate FROM
(SELECT MIN (ProdDate - ProductionDays + 1) MinD FROM TableA) A,
(SELECT LEVEL - 1 L FROM DUAL CONNECT BY LEVEL <= (SELECT Max (ProdDate) - MIN (ProdDate - ProductionDays + 1) + 1 FROM TableA)) MyList
) DateList
WHERE DateList.TheDate BETWEEN DateRangeSums.ProdStart AND DateRangeSums.ProdEnd
GROUP BY DateList.TheDate;
The view DailyProdVolEst
gives you dynamically the result you described... though some "constraints" apply:
- the combination of
ProdDate
andCollectionNumber
should be unique. - the
ProductionDays
need to be > 0 for all rows
EDIT - as per comment requested:
How this query works:
It finds out what the smallest + biggest date in the table are, then builds rows with each row being a date in that range (DateList
)... this is matched up against a list of rows containing the daily sum for unique combinations of ProdDate Start/End (DateRangeSums
) and sums it up on the date level.
What do SUM (DateRangeSums.DailySum)
and SUM (DailyProduction)
do ?
Both sum things up - the SUM (DateRangeSums.DailySum)
sums up in cases of partialy overlapping date ranges, and the SUM (DailyProduction)
sums up within one date range if there are more than one CollectionNumber
. Without SUM
the GROUP BY
wouldn't be needed.
I think a UNION query will do the trick for you. You aren't using the CollectionNumber field in your example, so I excluded it from the sample below.
Something similar to the below query should work (Disclaimer: My oracle db isn't accessible to me at the moment):
SELECT Date, SUM(Volume) VolumeEstimate
FROM
(SELECT Date, SUM(Volume / ProductionDays) Volume
FROM [Table]
GROUP BY Date
UNION
SELECT (Date - 1) Date, SUM(Volume / 2)
WHERE ProductionDays = 2
GROUP BY Date - 1)
GROUP BY Date
It sounds like what you want to do is sum up by day and then use a tally table to divide out the results.
Here's a runnable example with your data in T-SQL dialect:
DECLARE @tbl AS TABLE (
[Date] DATE
, Volume INT
, ColectionNumber INT
, ProductionDays INT);
INSERT INTO @tbl
VALUES ('2011-08-22', 500, 1, 2)
, ('2011-08-22', 200, 2, 2)
, ('2011-08-20', 600, 1, 2);
WITH Numbers AS (SELECT 1 AS N UNION ALL SELECT 2 AS N)
,AssignedVolumes AS (
SELECT t.*
, t.Volume / t.ProductionDays AS PerDay
, DATEADD(d, 1 - n.N, t.[Date]) AS AssignedDate
FROM @tbl AS t
INNER JOIN Numbers AS n
ON n.N <= t.ProductionDays
)
SELECT AssignedDate
, SUM(PerDay)
FROM AssignedVolumes
GROUP BY AssignedDate;
I dummied up a simple numbers table with only 1 and 2 in it to perform the pivot. Typically you'll have a table with a million numbers in sequence.
For Oracle, the only thing you should need to change would be the DATEADD.
精彩评论