开发者

Database design for summarized data

I have a new table I'm going to add to a bunch of other summarized data, basically to take some of the load off by calculating weekly avgs.

My question is whether I would be better off wit开发者_如何学编程h one model over the other. One model with days of the week as a column with an additional column for price or another model as a series of fields for the DOW each taking a price.

I'd like to know which would save me in speed and/or headaches? Or at least the trade off.

IE.

ID   OBJECT_ID   MON  TUE  WED  THU  FRI  SAT  SUN   SOURCE

OR

ID   OBJECT_ID   DAYOFWEEK   PRICE   SOURCE


I would give the first preference to the following aggreate model:

ID | OBJECT_ID | DATE       | PRICE  | SOURCE 
---+-----------+------------+--------+--------
1  | 100       | 2010/01/01 | 10.00  | 0
2  | 100       | 2010/01/02 | 15.00  | 0
3  | 100       | 2010/01/03 | 20.00  | 0
4  | 100       | 2010/01/04 | 12.00  | 0

You would then be able to aggreate the above data to generate averages for every week/month/year very easily and relatively quickly.

To get the list of weekly averages, you would be able to do the following:

SELECT WEEK(date), AVG(price) FROM table GROUP BY WEEK(date);

For some further examples, the following query would return the average price on Sundays:

SELECT AVG(price) FROM table WHERE DAYOFWEEK(date) = 1;

Or maybe get the average daily price for the 8th week of the year:

SELECT AVG(price) FROM table WHERE WEEK(date) = 8;

It would also be quite easy to get monthly or yearly averages:

SELECT MONTH(date), AVG(price) FROM table GROUP BY MONTH(date);

I would only opt for more de-normalized options like the two you proposed if the above aggregations would still be too expensive to compute.


I would vote for the second. With the first, you would need some contraints to ensure that any row has only one of MON, TUE, WED, THU, FRI, SAT, SUN. Of course, with the second, you might need some additional reference data to define the Days of the Week, to populate DAYOFWEEK.

UPDATE:

Ok it wasn't clear there would always be a price for every day. In that case my point about constraints isn't so valid. I'd still prefer the second model though, it seems better normalized. I don't know enough about this case now to say if this is a good time to cast off some good normalization practices for clarity and performance, but it might be...

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜