开发者

Database design - duplicating columns for the purpose of reporting

I have the following tables:

Product(ProductID, ProductName ...)
ProductBidHistory(ProductID, UserID, Amount, Status, ...)

The BidHistory table can grow to have many records for every product, I want a report containing approved bid for every products i.e.Amount where status = approved.

  • Is it ok to have an ApprovedAmount column in the Product table itself that gets populated when the bid开发者_如何学C is closed, so that it is easier for reporting.
  • Should the report be written to do a join to history table to look for 'approved' and fetch the amount.

Is it ok to duplicate the column ?


From the pure design perspective, you shouldn't keep two copies of the same data. This could lead to data inconsistency. What if the approved amount and the bid history do not agree?

From the performance perspective, you may duplicate the data to gain the speedup in the generation of the report.

From the application perspective, the developer has to ensure the proper update is done to the approved amount in Product table with every update in the Bid History table.


This realy depends on the size of these tables.

If they are not to large, I would recomend that you not duplicate the columns, as this will cause more overhead in the application layer. A simple view or user defined table function should be good enough. Also ensure that you have indexes on the correct columns.

But if the tables are going to be very large (millions of rows), then you might see performance gains by storing the values for reporting purposes.


In general, you should never duplicate the columns in a relational database, so that it would be possible to get answers that do not jibe with each other depending upon how the (valid) queries were expressed.

Approved is not an attribute of the product but an attribute of the bid, and you should always try to have the database structure mirror the real-world ontology.

Of course, you could not put Approved yes/no (boolean) column in the Product table if the rows in that table do not refer to a specific product in someone's possession (e.g. iPad 16G S/N 123456789) but to a generic "catalog description" product (e.g. iPad 16G) and could appear in multiple auctions. Not clear from your simplified design for the purpose of asking the question what you had in mind here.

In your queries, you're never going to ask simply for "all approved bids" with no other qualifiers. The term "approved=true" will always be in league with some other term(s), e.g. bid-date after ?, or product id = ?, or product-category in (a, b, c) which are likely candidate columns for indices. So use indices judiciously, don't violate the normalization rules, keep your ontology real, and your database will take care of you.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜