Is it bad to rely on foreign key cascading?
The lead developer on a project I'm involved in says it's bad practice to rely on cascades to delete related rows.
I don't see how this is bad, but I would like to know your thoughts on if/why it is.
I'll preface this by saying that I rarely delete rows, period. Generally you want to keep most data. You simply mark it as deleted so it won't be shown to users (i.e. to them it appears deleted). Of course it depends on the data, and for some things (e.g. shopping cart contents) actually deleting the records when the user empties his or her cart is fine.
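To make the "mark it as deleted" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `posts` table, the `is_deleted` column and both helper functions are names invented for this illustration; they are not from any particular project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: is_deleted is the soft-delete flag (0 = visible, 1 = hidden).
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " is_deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO posts (title) VALUES (?)", [("first",), ("second",)])


def soft_delete(conn, post_id):
    """Hide a row from users without removing it from the table."""
    conn.execute("UPDATE posts SET is_deleted = 1 WHERE id = ?", (post_id,))


def visible_titles(conn):
    """Return only the titles users should still see."""
    rows = conn.execute(
        "SELECT title FROM posts WHERE is_deleted = 0 ORDER BY id"
    ).fetchall()
    return [title for (title,) in rows]


soft_delete(conn, 1)
total_rows = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

After the call, `visible_titles(conn)` no longer includes the first post, but the row is still in the table, so it can be restored or audited later.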
I can only assume that the issue here is you may unintentionally delete records you don't actually want to delete. Referential integrity should prevent this however. So I can't really see a reason against this other than the case for being explicit.
I would say that you follow the principle of least surprise.
Cascading deletes should not cause unexpected loss of data. If a delete requires related records to be deleted, and the user needs to know that those records are going to go away, then cascading deletes should not be used. Instead, the user should be required to explicitly delete the related records, or be provided a notification.
On the other hand, if the table relates to another table that is temporary in nature, or that contains records that will never be needed once the parent entity is gone, then cascading deletes may be OK.
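For the "temporary in nature" case, a cascade could look roughly like the sketch below. It uses Python's sqlite3 module purely for illustration; the `carts` and `cart_items` tables are made-up names, and other databases have their own syntax details and defaults.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE carts (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE cart_items ("
    " id INTEGER PRIMARY KEY,"
    " cart_id INTEGER NOT NULL REFERENCES carts(id) ON DELETE CASCADE,"
    " sku TEXT NOT NULL)"
)
conn.execute("INSERT INTO carts (id) VALUES (1)")
conn.executemany(
    "INSERT INTO cart_items (cart_id, sku) VALUES (1, ?)", [("A",), ("B",)]
)

# Deleting the parent silently removes every child row that references it.
conn.execute("DELETE FROM carts WHERE id = 1")
remaining_items = conn.execute("SELECT COUNT(*) FROM cart_items").fetchone()[0]
```

Nothing in the `DELETE` statement itself mentions `cart_items`, which is exactly the convenience and exactly the surprise, depending on whether the child rows mattered.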
That said, I prefer to state my intentions explicitly by deleting the related records in code, rather than relying on cascading deletes. In fact, I've never actually used a cascading delete to implicitly delete related records. Also, I use soft deletion a lot, as described by cletus.
I never use cascading deletes. Why? Because it is too easy to make a mistake. Much safer to require client applications to explicitly delete (and meet the conditions for deletion, such as deleting FK referred records.)
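As a rough sketch of what "explicitly delete" can look like from the client side, the example below removes the referencing rows first and the parent second, inside one transaction. It uses Python's sqlite3 module; the `customers` and `addresses` tables and the `delete_customer` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# No ON DELETE clause: the database will refuse to orphan address rows.
conn.execute(
    "CREATE TABLE addresses ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " city TEXT NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO addresses (customer_id, city) VALUES (1, 'Oslo')")
conn.commit()


def delete_customer(conn, customer_id):
    """Delete the child rows, then the parent, as one atomic unit."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute("DELETE FROM addresses WHERE customer_id = ?", (customer_id,))
        conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))


delete_customer(conn, 1)
customers_left = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
addresses_left = conn.execute("SELECT COUNT(*) FROM addresses").fetchone()[0]
```

The list of tables being emptied is spelled out in the code, so anyone reading `delete_customer` can see what will be lost.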
In fact, deletions per se can be avoided by marking records as deleted or moving into archival/history tables.
In the case of marking records as deleted, it depends on the relative proportion of data marked as deleted: since SELECTs will have to filter on 'isDeleted = false', an index will only be used if less than 10% (approximately, depending on the RDBMS) of records are marked as deleted.
Which of these 2 scenarios would you prefer:
Developer comes to you, says "Hey, this delete won't work". You both look into it and find that he was accidentally trying to delete the entire table contents. You both have a laugh, and go back to what you were doing.
Developer comes to you, and sheepishly asks "Do we have backups?"
There's another great reason to not use cascading UPDATES or DELETES: they hold a serializable lock. Holding a serializable lock can kill performance.
Another huge reason to avoid cascading deletes is performance. They seem like a good idea until you need to delete 10,000 records from the main table which in turn have millions of records in child tables. Given the size of this delete, it is likely to lock the entire table for hours, maybe even days. Why would you ever risk this? For the convenience of spending ten minutes less writing the extra delete statements for single-record deletes?
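One way the extra delete statements might be written for a large job is to remove the child rows in small batches, each in its own short transaction, and delete the parent last. Batching is my own suggestion here rather than something prescribed above, and whether it actually eases locking depends on the RDBMS. The sketch uses Python's sqlite3 module only to show the shape of the loop; the `accounts` and `events` tables and the `delete_events_in_batches` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY,"
    " account_id INTEGER NOT NULL REFERENCES accounts(id))"
)
conn.execute("INSERT INTO accounts (id) VALUES (1)")
conn.executemany("INSERT INTO events (account_id) VALUES (?)", [(1,)] * 2500)
conn.commit()


def delete_events_in_batches(conn, account_id, batch_size=1000):
    """Delete child rows a batch at a time; return how many were removed."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "DELETE FROM events WHERE id IN ("
                " SELECT id FROM events WHERE account_id = ? LIMIT ?)",
                (account_id, batch_size),
            )
        if cur.rowcount == 0:
            break  # nothing left to delete for this parent
        total += cur.rowcount
    return total


deleted = delete_events_in_batches(conn, 1)
with conn:
    # Safe now: no child rows reference the parent any more.
    conn.execute("DELETE FROM accounts WHERE id = 1")
events_left = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

The trade-off is that the whole operation is no longer atomic: if it stops halfway, some child rows are gone while the parent still exists, so the job has to be safe to re-run.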
Further, the error you get when you try to delete a record that has a child record is often a good thing. It tells you that you don't want to delete this record because there is data that you need that you would lose if you did so. A cascade delete would just go ahead and delete the child records, resulting in loss of information (about orders, for instance, if you deleted a customer who had orders in the past). This sort of thing can thoroughly mess up your financial records.
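The protective error can be seen in a small sketch with Python's sqlite3 module. The `customers` and `orders` tables are illustrative names, and the exact error type and message will differ from one database to another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Plain foreign key, no ON DELETE CASCADE.
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 19.99)")
conn.commit()

try:
    conn.execute("DELETE FROM customers WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    # The database refused because an order still references this customer.
    delete_blocked = True
    conn.rollback()

customers_left = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Here the delete fails and both the customer and the order history survive. With `ON DELETE CASCADE` on `orders.customer_id`, the same statement would have succeeded and taken the orders with it.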
I was likewise told that cascading deletes were bad practice... and thus never used them until I came across a client who used them. I really didn't know why I was not supposed to use them but thought they were very convenient in not having to code out deleting all the FK records as well.
Thus I decided to research why they were so "bad", and from what I've found so far there doesn't appear to be anything problematic about them. In fact the only good argument I've seen so far is what HLGLEM stated above about performance. But as I am usually not deleting this number of records, I think in most cases using them should be fine. I would like to hear any other arguments others may have against using them, to make sure I've considered all options.
I'd add that ON DELETE CASCADE makes it difficult to maintain a copy of the data in a data warehouse using binlog replication, which is how most commercial ETL tools work. Explicit deletion from each table maintains a full log record and is much easier on the data team :)
I actually agree with most of the answers here, YET not all scenarios are the same, and it depends on the situation at hand and what would be the entropy of that decision, for example:
If you have a deletion command for an entity that has multiple many/belong relationships with a large number of entities, then each time you call that deletion process you also need to remember to delete all the corresponding FKs from each relational pivot that the entity has corresponding relationships with.
Whereas via a cascade on delete, you write that once as part of your schema and it will ONLY delete those corresponding FKs and clean up the pivots from relations that are no longer necessary. Imagine 24 relations for an entity, plus other entities that also have a large number of relations on top of that. Again, it really depends on your setup and what YOU feel comfortable with. In any case, for reference, in an Illuminate migration schema file you would write it as such:
$table->dropForeign(['permission_id']);
$table->foreign('permission_id')
->references('id')
->on('permission')
->onDelete('cascade');