Fact Table with Different Update Schedules
I have two sets of data with the same level of grainularity, for example invoice number. Most of the data required is updated daily as we recognize the revenue for previous invoices. However, some of this data is fed through a seperate costing system once a month and is then fed to the data warehouse with additional information. Should I create one fact table that contains both sets of data, and then run an update on the fact table once a month when the other data is imported in, or should I create two fact tables because of the different update schedule. The data is related, and many queries (~35%) will want information from both sets of data (when avaliable). The system imports 30,000 rows a day into the fact table开发者_运维知识库 has about 38,000,000 rows in it, the monthly update would affect 660,000 rows.
Providing that already existing measures are not modified in the second step, you could treat the fact table as an "accumulating snapshot". The table describes processes with a definitive start and the end -- kind of workflows. Look it up in Kimball's Data Warehouse Toolkit or just Google "Kimball accumulating snapshot fact table".
精彩评论