Database Design: Flexible Storage of Q&A
I haven't designed a database in a while and I don't have a lot of confidence in my design right now. I basically have three tables in a database that represent a history of sorts. I have to change this system so that each time a record is added (e.g. something becomes historical), some input from the user may be required. Sometimes they will just have one simple question, other times they will be required to give five different pieces of information, and these need to be flexibly managed by an end user through an administrative front-end. So I plan to have a Question table and Answer table, with composite tables to tie these to the other three tables. I am struggling with the design the Answer table, because each question could require a variety of responses. Some responses will be drop-down selections that tie to another database via foreign key. Others could be text input, dates, or Yes/No answers. Right now, I have an answer table that encompasses all types of answers with a series of nullable fields...
+----------------------------+
| Answer |
+----------------------------+
| Id (int) |
| QuestionId (int) |
| ForeignKeyId1* (int) |
| ForeignKeyId2* (int) |
| ForeignKeyId3* (int) |
| Number* (bigint) |
| DateField* (date) |
| Text* (varchar 500) |
| YesNo* (bit) |
+----------------------------+
*Nullable
Side note: the question table ties to a QuestionType that will dictate (within the application) how to validate the user input. The input gets stored in the appropriate nullable field in the Answ开发者_C百科er record, with the other fields nulled out. I thought this would be better than having a catch all varchar answer field with no data integrity.
Is this a poor design? What could make it better?
From what you've described, your solution looks like it should work fine. I've done similar setups with tables that have columns set up to record specific types of data, allowing the other fields to be nullable. Looks good to me, although I'm not sure what your ForeignKeyID fields are for :\
Having different fields for different data types is OK, as is having multiple foreign keys. This design pushes as much of the data domain integrity as possible into the DBMS. This is philosophically a good way to go, as most people would tell you that letting the DMBS do the work for you will save you writing and maintaining code. I would be one of those people most of the time.
There is another way to look at this from a maintainability aspect. As it stands, you will have to go back and modify the structure of your ANSWER table if you add either a new kind of data type, like a floating point number or a GUID for example. Similarly, you'll have to go back and add another FK field if you make another question type that requires an answer from a new lookup table.
You could change your ANSWER table to look to use the catch-all approach so it would look more like this:
ANSWER
( Id (int)
, QuestionId (int)
, Part (int)
, Value (nvarchar 1000)
)
If I understood your question correctly, a question could have a multi-part answer. Assuming that there are potentially multiple answers for a single question, and that your question control tables know which part is which, you would use the Part field to distinguish the parts. If each question has exactly zero or one answers, then you don't need an answer table, you can just add the answer to the question.
So, why might you want to give up on letting the database enforce data domain integrity? Here are the relative pros and cons:
Advantages of Multiple Separate Answer Fields:
- DBMS won't let you put the wrong kind of data into a field, so you don't need to write or call any data domain integrity functions before storing the data (unless you want to trap these errors at the GUI level)
- There's no confusion about how to interpret the contents of an answer (e.g. which is the month and which is the day in this date answer...)
Advantages of a Catch All Answer Field:
- You don't have to write extra code to look at the question type to find out which column in the ANSWER table to read for any given type of question.
- You don't have to change your database schema/physical database population every time a new lookup table or answer data type is added.
In either case you have to write some code to handle the fact that you are storing semantically different answers to multiple questions in one data store. You have to decide which type of extra code you want to write, the kind that figures out where to put and find each type of answer, or the kind that figures out how to store and interpret different data types using a common representation (i.e. string).
Given that modifying a table which has data in it is a pain, and given that most programming languages have pretty robust .ToString() / .TryParse() type functionality built in, I would lean towards using the catch-all approach if my primary concern were maintainability.
精彩评论