SQL/MySQL structure (Denormalize or keep relational)
I have a question about best practices related to de-normalization or table hierarchy relationships.
For a simple example, let's say I have an app that allows a user to make a payment for an order. I save the order information in the orders table, and I have another table for the payment called payments. Payments has a foreign key to the orders table.
Let's assume that I can pay with a credit card, check, or paypal, and I want to save the information about the payment.
My question is what is the best way to handle this relationship between the different payment data and the payment table. The types of payment all have different data associated with them. So do I denormalize the payments table, putting credit card, check, and paypal information fields in there and then ju开发者_如何学Pythonst use the fields as necessary. Alternately I could specify a payment type, and store the information in their own tables, but then I would have to use logic on an application level to get the data out of the correct credit card, check or paypal information tables...
I would choose to keep the database normalized.
but then I would have to use logic on an application level to get the data out of the correct credit card, check or paypal information tables...
You have to use logic (or at least mapping) in either case. Whether its what table to pull the data from or what fields in the table to access.
What about keeping it denormalized and then making a view to put the data back together again. You get the best of both worlds. IIRC, MySQL introduced views in version 5.
So do I denormalize the payments table, putting credit card, check, and paypal information fields in there and then just use the fields as necessary.
yes. but this is not "denormalizing". if you stored order information in the client table, that would be denormalizing. adding nullable columns to accurately describe a payment in the payments table is not.
You can use the idea of table per subclass as the ORM tools do. This would require a join for each query against the payment table but...
Create tables for each payment type so you will have a creditcardpayment and a checkpayment table. The common fields go in the payment table, the specific fields go in the sub tables. The sub tables primary keys are foreign keys to the payment table's id.
To add a new payment you have to first insert the common fields into the payment table, get the id generated, then insert the specific fields into the specific sub table.
To query you have to join the subtables with the payment table. You could use a view to make that easier.
This way the database is still normalized and you have no null columns.
It partially depends on the framework (if any) that you are using. For instance: the Ruby on Rails way would generally be to store the type of the payment in the payments table and then have different, separate tables for each payment type (PayPal, Credit Card, etc).
Alternatively, if you notice that you are repeating the same data in many of the tables, Rails has a way to store all of the data in the same table, using only the fields you need, but still allowing you to have separate objects. For instance, you would have an AbstractPayment object with an abstract_payments table, but you would also have PayPalPayment and CreditCardPayment objects that both inherit from AbstractPayment and use the abstract_payments table. All you need to determine the payment type is a column in abstract_payments that tells you which type it is (probably a string, but could be an integer if you so choose). This is called STI.
No matter what framework/language you use, the same ideas can definitely apply and I think the right solution will depend on how many different types of payments you have, compared with how simple you want your database to be.
Keep it as normalized as possible. Only de-normalize when the performance of a fully normalized schema requires denormalization to improve response time, and do that only on a case by case basis to deal with specific performance issues associated with individual querys within your application.
These are complex problems. Database Normalization requires intimate domain knowledge, and a skilled analysis of how that domain model will be manipulated and utilized within your application. Denormalizing for performance requires that you understand your application's usage patterns well enough to predict performance issues before they occur (waiting till they actually occur in production is too late - by then making fundemental schema changes in the database is very expensive) and know what denormalization techniques to use to address them.
You need to weight the following factors:
- How much space will you waste if you put all data into a single table
- How complex the SQL queries will become in either case.
If you use different tables, you'll have to use joins. If you put everything into a single table, you'll need to find some magic to "ignore" the rows which don't matter (say when you want to find all credit card payments: Your query must then ignore everything that's something else).
The latter part gets more easy when you move the special data into special tables at the cost of more complex joins.
精彩评论