开发者

Architecture of SQL tables

I am wondering is it more useful and practical (size of DB) to create multiple tables in sql with two columns (one column containing foreign key and one column containing random data) or merge it and create one table containing multiple columns. I am asking this because in my scenario one product holding primary key could have sufficient/applicable data for only one column while other columns would be empty.

example a. one table
productID productname weight开发者_如何学运维 no_of_pages
1         book        130     500
2         watch       50      null
3         ring        null    null

example b. three tables
productID productname
1         book       
2         watch      
3         ring       

productID weight
1         130   
2         50    

productID no_of_pages
1         500


The multi-table approach is more "normal" (in database terms) because it avoids columns that commonly store NULLs. It's also something of a pain in programming terms because you have to JOIN a bunch of tables to get your original entity back.

I suggest adopting a middle way. Weight seems to be a property of most products, if not all (indeed, a ring has a weight even if small and you'll probably want to know it for shipping purposes), so I'd leave that in the Products table. But number of pages applies only to a book, as do a slew of other unmentioned properties (author, ISBN, etc). In this example, I'd use a Products table and a Books table. The books table would extend the Products table in a fashion similar to class inheritance in object oriented program.

All book-specific properties go into the Books table, and you join only Products and Books to get a complete description of a book.


I think this all depends on how the tables will be used. Maybe your examples are oversimplifying things too much but it seems to me that the first option should be good enough.

You'd really use the second example if you're going to be doing extremely CPU intensive stuff with the first table and will only need the second and third tables when more information about a product is needed.

If you're going to need the information in the second and third tables most times you query the table, then there's no reason to join over every time and you should just keep it in one table.


I would suggest example a, in case there is a defined set of attributes for product, and an example c if you need variable number of attributes (new attributes keep coming every now and then) -

example c

productID productName
1         book
2         watch
3         ring

attrID productID attrType    attrValue
1      1         weight      130
2      1         no_of_pages 500
3      2         weight      50

The table structure you have shown in example b is not normalized - there will be separate id columns required in second and third tables, since productId will be an fk and not a pk.


It depends on how many rows you are expecting on your PRODUCTS table. I would say that it would not make sense to normalize your tables to 3N in this case because product name, weight, and no_of_pages each describe the products. If you had repeating data such as manufacturers, it would make more sense to normalize your tables at that point.


Without knowing the background (data model), there is no way to tell which variant is more "correct". both are fine in certain scenarios.


You want three tables, full stop. That's best because there's no chance of watches winding up with pages (no pun intended) and some books without. If you normalize, the server works for you. If you don't, you do the work instead, just not as well. Up to you.

I am asking this because in my scenario one product holding primary key could have sufficient/applicable data for only one column while other columns would be empty.

That's always true of nullable columns. Here's the rule: a nullable column has an optional relationship to the key. A nullable column can always be, and usually should be, in a separate table where it can be non-null.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜