Storing dynamic form data in DBMS, looking for the optimal approach
While working on a project that will store a whole bunch of (completely different) forms I'm facing a design issue on how to store the values while keeping the database usable.
Brief description: each 'document' contains a variable amount of questions (though a consistent amount per type of document) and matching answers.
The most usable approach I've come up with is the following, here I've grouped documents by 'type', which identifies which questions belong to the document, which in return has the answers to the matching questions.
+---------------+ 1 n +-----------+
| DocumentType |----------| Questions |
+---------------+ Has many +-----------+
|1 1|
|n Is of type n| Belongs to
+---------------+ 1 n +-----------+
| DocumentEntry |----------| Answers |
+---------------+ Has many +-----------+
The drawback here is that queries on fetching the documents that have que开发者_Python百科stion A with answer B become rather complex and likely rather slow when the database grows larger, which it rapidly will.
I'm wondering if I've stumbled across the optimal approach to store the data or if there is some neat solution out there that I might've missed.
You've faced a common problem: Trying to use something static (database with predefined structure) for something dynamic (bunch of individual data sets which only have one thing in common: they come from forms). What you want is doable with databases but would be significantly easier to do without, however since I assume you really do want to use a database for this, here's what I'd do:
- You have a
document
(or questionnaire) which contains multiplequestions
. These both are generic enough and require their own tables, just as you've done so far. - Each question has a
type
which defines what kind of question it is (multiple select, freeform, select one...) and of course the question also hasoptions
. So that's two tables more. The reasoning here is that decoupling these from the actual question allows for certain level of abstraction exist and your queries will still be somewhat simple although possibly loooooong.
So, each document has 1..n to questions, each question has 1 type and 1..n options. Skipping a bit, here's what I'm thinking of with link tables etc.
Document
bigint id
DocumentQuestions
bigint document_id
bigint question_id
Question
bigint id
varchar question
QuestionType
bigint question_id
bigint type_id
Type [pre-filled table with id:type pairs, such as 1=freeform, 2=select one etc.]
QuestionOptions
bigint id
bigint question_id
varchar description
varchar value
Answers
bigint id
bigint document_id
[etc. such as user_id]
QuestionAnswers
bigint answer_id
bigint question_id
bigint questionoptions_id
This sort of design permits several things:
- Questions themselves are reusable, very handy if you're making a generic "answer these x random questions from a pool of y questions".
- New types can be added easily without breaking existing ones.
- You can always navigate through the structure quite easily, for example "What was the name of the document for this single question answer I have?" or "how many people have answered incorrectly to this one question?"
- Because types are separated, you can create a (web) UI which reflects the state in the database easily - better yet, if the type changes you may don't even have to touch your UI code at all.
- Since each possibly option for a question is its own row in the
QuestionOptions
table, you can get the actual value very easily.
The obvious problem with this is that it requires quite strict coupling between the tables for integrity and will be a pain to get running properly at start. Also since value
in the QuestionOptions
is varchar, you need to be able to parse stuff a lot and you may even want to introduce another field for conversion hints.
Hope this helps even though you wouldn't agree with my solution at all.
精彩评论