Referential Integrity - how to configure SQLAlchemy?

2023-04-05 08:10 问答作者：

Can someone help me get the big p开发者_如何学Pythonicture when it comes to configuring SQLAlchemy models, so referential integrity is ensured at all levels?

Going by the idea that referential integrity should be expressed and enforced by the DB, I have created a schema (currently in Postgresql) with all the constraints I think I need, thus giving me a database I trust will enforce referential integrity.

I then begin to build an app on top of this DB, using SQLAlchemy (0.7) in declarative mode.

Having searched and read a bit, I have learned that I can configure:

onupdate/ondelete rules on my Column() definitions.
cascade options on my relationship() definitions,
and that these seem to operate on the session level in SQLAlchemy.
passive_deletes and passive_updates options for my relationship() definitions.

And that all these options have defaults.

But I am left confused as to how much I actually need to do with my SQLAlchemy models, to make sure SQLAlchemy doesn't get out of sync with the DB and its constraints during a session.

What exactly am I achieving, if I configure 'onupdate' etc. on my Columns() definitions in SQLAlchemy?

And for the cascade and passive_delete/passive_update rules I can configure on a relationship(). What do I need here, and why?

Or to rephrase my question: To what extend will SQLAlchemy be aware of the constraints configured in the DB schema, and to what extend (and how) do I have to repeat them in my models?

And are the anything else I should be aware of? :)

SQLAlchemy doesn't fundamentally need to know anything about the constraints your database has. If your database has the constraints you want configured, you're essentially done - your application is disallowed by the database from doing anything it's not supposed to.

A key theme of SQLAlchemy is that it is really only doing what you tell it to. So if you were to try persisting an object, SubWidget(), which in the database needs to have a reference to a parent Widget(), at the point at which SQLAlchemy flushes the data (i.e. emits INSERT statements), the operation would fail with a constraint violation, emitted by the database, and the transaction is rolled back.

So assuming an FK on "subwidget" referencing "widget", your application needs to ensure the data is in the correct structure. There are two ways to do this; one is that you hand-maintain those columns which contain foreign key references, and ensure they have the appropriate value at the point of INSERT or UPDATE. The other is that you'd use relationship() to manage the foreign key attribute, and you'd instead ensure that the creation of a SubWidget() object is accompanied by the operation of associating it with a parent Widget() object which you've created and/or acquired separately.

Regarding cascades, it's a great idea, though not required, to have ON DELETE CASCADE on those foreign keys where it applies. On the SQLAlchemy side, when using relationship() you usually want to give the ORM a hint that the database will cascade deletes via the passive_deletes flag ( http://www.sqlalchemy.org/docs/orm/collections.html?highlight=passive_deletes#using-passive-deletes ), however this is usually a performance enhancement; SQLAlchemy otherwise ensures that all objects represented on the dependent side of the relationship() are loaded into memory, and handled appropriately, which means one of either setting the foreign key attribute to NULL (the default), or marking the dependent object for deletion (which occurs by setting "cascade" to "all, delete-orphan", see http://www.sqlalchemy.org/docs/orm/session.html#cascades).

ON UPDATE cascade is less common as natural primary keys are not a common practice these days, since they don't actually perform as well as plain integer primary keys and can also be cumbersome in other ways. SQLAlchemy does support these as well however and they will generally take care of themselves, as SQLA assumes by default that update cascades are in place when a PK mutation takes place, see http://www.sqlalchemy.org/docs/orm/relationships.html#mutable-primary-keys-update-cascades for a detailed description of that.

Perhaps this is all easier to grok with a little experimentation, the basic idea is that SQLAlchemy is only emitting the SQL you're telling it to, even though many of its SQL behaviors are automated once configured up front. relationship() should be configured with details on how you'd like it to behave when data is persisted, modified, or deleted with respect to the constraints present in the database.

So, building on zzzeeks answer, and my own studying/tinkering after my initial question...

To make SQLAlchemy actively prevent that an in-session view of the DB state can divert from what the DB will allow when you flush/commit, you must mirror all the constraints found in the DB schema in your SQLAlchemy models.

And this is done via the Column definitions in the form of:

ForeignKey(..., onupdate='', ondelete='')
primary_key=True
unique=True

and so on, with the possible inclusion of __table_args__, like:

__table_args__ = (
        ForeignKeyConstraint(['id'], ['remote_table.id']),
        UniqueConstraint('foo'),
        )

For cases where the constraint spans multiple columns.

Whereas:

relationship()

and its related arguments, like:

cascade
cascade_backrefs
foreign_keys
passive_deletes
passive_updates

and so on, are a (important) convenience feature that allows you to work with your models with as little effort as possible, but ultimate aren't meant to be what prevents breaching referential integrity.

The relationship() feature can not express all the typical constraints in a database, while the Column() (and __table_args__) feature can.

On the other hand, configuring the relationship() with some of the arguments listed above (or going by default values where that makes most sense), will let SQLAlchemy perform tasks automatically, that ultimately can be said to be referential integerity related. And would otherwise normally have to be expressed by logic in the surrounding code.

Optimal configuration of relationship() will also in some cases avoid unnecessary SQL statements to be issued by SQLAlchemy.

Hope this summation is somewhat accurate...

继续阅读：referential-integrity

Referential Integrity - how to configure SQLAlchemy?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？