Database Design: to EAV or not to EAV?
Say I have an entity that will have many attributes, some I know about now and others will be user defined. What's the best way to model this?
1) Do I have a main table and relate it to a secondary name-value pair table? All the attributes go in the secondary EAV table.
- OR -
2) Do I put the most common attributes (not all users will need them, so I expect a lot of NULL entries) in the main table and have the secondary EAV table for the user defined attributes?
- OR -
3) Some other approach I have not thought of?
You may use solution two for efficiency reasons, in particular if you often need to select on these attributes. The columns in the main table can even act as a "cache" of values that also live in the EAV table, if you want. That introduces duplication, but it speeds up lookups.
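To make solution two concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (entity, entity_attribute), the columns (color, weight_kg) and the helper functions are all made up for illustration; they are assumptions, not anything from the question.

```python
import sqlite3

def build_hybrid_schema(conn):
    # Common attributes are nullable columns on the main table;
    # user defined attributes go into a name-value side table.
    # All table and column names here are hypothetical.
    conn.executescript("""
        CREATE TABLE entity (
            id        INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            color     TEXT,
            weight_kg REAL
        );
        CREATE TABLE entity_attribute (
            entity_id  INTEGER NOT NULL REFERENCES entity(id),
            attr_name  TEXT NOT NULL,
            attr_value TEXT,
            PRIMARY KEY (entity_id, attr_name)
        );
    """)

def add_entity(conn, name, color=None, weight_kg=None, **custom):
    # Known attributes fill real columns, everything else becomes EAV rows.
    cur = conn.execute(
        "INSERT INTO entity (name, color, weight_kg) VALUES (?, ?, ?)",
        (name, color, weight_kg),
    )
    entity_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO entity_attribute VALUES (?, ?, ?)",
        [(entity_id, key, str(value)) for key, value in custom.items()],
    )
    return entity_id

def load_entity(conn, entity_id):
    # Merge the fixed columns and the EAV rows into one dict.
    row = conn.execute(
        "SELECT name, color, weight_kg FROM entity WHERE id = ?",
        (entity_id,),
    ).fetchone()
    if row is None:
        return None
    result = {"name": row[0], "color": row[1], "weight_kg": row[2]}
    for key, value in conn.execute(
        "SELECT attr_name, attr_value FROM entity_attribute WHERE entity_id = ?",
        (entity_id,),
    ):
        result[key] = value
    return result

conn = sqlite3.connect(":memory:")
build_hybrid_schema(conn)
widget_id = add_entity(conn, "widget", color="red", warranty_years=2)
```

Selecting on a common attribute is then a plain single-table query (WHERE color = 'red'), while the user defined attributes stay flexible. Note that the EAV values come back as text in this sketch, so any typing has to be handled by the application.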
EAV is a good solution for this problem unless you have to perform joins at the db level. An alternative is to move away from the relational model and move to an RDF-based model.
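To illustrate why joins are the sore spot, here is a rough sketch (again Python with sqlite3, with a hypothetical eav table and sample data) of filtering on several attributes at once. Each extra attribute in the filter costs another self-join on the EAV table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical pure EAV table: one row per (entity, attribute) pair.
    CREATE TABLE eav (
        entity_id  INTEGER NOT NULL,
        attr_name  TEXT NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (entity_id, attr_name)
    );
    INSERT INTO eav VALUES
        (1, 'color', 'red'),  (1, 'size', 'L'),
        (2, 'color', 'red'),  (2, 'size', 'S'),
        (3, 'color', 'blue'), (3, 'size', 'L');
""")

def find_entities(conn, **criteria):
    # Build one alias of the eav table per attribute being filtered on.
    if not criteria:
        raise ValueError("at least one attribute filter is required")
    aliases = [f"a{i}" for i in range(len(criteria))]
    sql = "SELECT a0.entity_id FROM eav AS a0"
    for alias in aliases[1:]:
        sql += f" JOIN eav AS {alias} ON {alias}.entity_id = a0.entity_id"
    where = " AND ".join(
        f"{alias}.attr_name = ? AND {alias}.attr_value = ?" for alias in aliases
    )
    sql += f" WHERE {where} ORDER BY a0.entity_id"
    # Parameters alternate name, value, name, value in alias order.
    params = [part for pair in criteria.items() for part in pair]
    return [row[0] for row in conn.execute(sql, params)]

red_and_large = find_entities(conn, color="red", size="L")
```

With real columns the same filter would be WHERE color = 'red' AND size = 'L' on a single table, which is the trade-off to weigh if you select on these attributes often.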
Typically, lots of empty cells are cheap and not worth normalizing away. #2 only becomes a drawback if you have a very large number of rows (millions - where performance problems could arise), a very large number of columns (more than about 20 - where it's just annoying to look at the data), or a number of unique constraints on the EAV table.
With that said, it is now 2011, and these days it makes sense to use a programming framework with a database abstraction layer so that you're not designing database relationships directly. Something like Django's Object Relational Mapper allows you to focus on the models themselves and lets best practices take care of themselves (95% of the time). This tutorial will help you get started. Django is aimed at web development, though. For non-web environments, other frameworks will be a better fit.
I've done a lot of work with the EAV pattern, and it has served the purpose well enough. I find empty columns, or dynamic columns (like col1, col2, etc.), much harder to manage after the fact, but they can be easier to query since you don't need as many joins.
One thing I would very strongly recommend is taking a look at options like MongoDB. As a schema-less document store, it lets each record carry its own set of fields, so complex, dynamic data structures need no schema changes.