开发者

SQL Alchemy - INSERT results of query

I'm looking for a way in SQLAlchemy to do a bulk INSERT whose rows are the result of a query. I know the session has the function add which can be used to add an individual object, but I can't seem to find how how it works开发者_运维技巧 with a subquery.

I know I could iterate over the results of the subquery and add them individually, but this would seem to be somewhat inefficient. In my case I am dealing with a potentially very large set of data that needs insertion.


I see following options:

  1. using SA Model: create underlying objects with data loaded from the database, add them to session and commit.
    • Pros: if you have any AS Model level validation, you are able to use it; also you can insert into multiple tables if you model objects are mapped to multiple tables (Joined-Table Inheritance); is RDBMS independent
    • Cons: most expensive
  2. using Insert statements: load data from the database into python and execute using Insert Expressions
    • Pros: somewhat faster when compared to 1.
    • Cons: still expensive as python structures are created; cannot directly handle Joined-Table Inheritance
  3. create data using solely RDBMS: bulk insert using RDBMS only bypassing SA and python altogether.
    • Pros: fastest
    • Cons: no business object validation performed; potentially RDBMS-specific implementation required

I would suggest either option 1) or 3).
In fact, if you do not have any object validation and you use only one RDBMS, I would stick to option 3).


I believe the only way to do this in SQLAlchemy is issue a raw SQL statement using Session.execute


Since this is the top result on google for this common question, and there is actually a much better solution solution available, here's the updated answer. You may use the Insert.from_select() method. It is, although otherwise hard to find, documented here.


A quick primer

When working with Table objects you could use something like:

>>> from sqlalchemy.sql import select
>>> stmt = TargetTable.insert().from_select([TargetTable.c.user_id, TargetTable.c.user_name],
                                            select([SrcTable.c.user_id, SrcTable.c.user_name]))

>>> print(stmt)
INSERT INTO "TargetTable" (user_id, user_name) SELECT "SrcTable".user_id, "SrcTable".user_name
FROM "SrcTable"

Finally execute with engine.execute(stmt) or the like.

The final output statement is compiled by SQLAlchemy depending on the dialect used in the engine. Here I used the SQLite dialect.

This successfully avoids loading any data into python objects, and let's the database engine efficiently handle everything. Hurray!

As opposed to using textual sql statements with text(), this method is also RDBMS independent, because it still uses the SQLAlchemy Expression Language as described here. This language makes sure to compile to the right dialect when executed.

Using ORM Tables

The original question points to a use case where the ORM is used for interaction with the database. You probably defined your tables using the ORM base as well. Metadata, as stored in these objects, works just a little different now. So we'll modify the example a bit:

>>> from sqlalchemy.sql import select, insert
>>> stmt = insert(TargetTable).from_select([TargetTable.user_id, TargetTable.user_name],
                                           select([SrcTable.user_id, SrcTable.user_name]))
>>> engine.execute(stmt)
INFO sqlalchemy.engine.base.Engine INSERT INTO "TargetTable" (user_id, user_name) SELECT "SrcTable".user_id, "SrcTable".user_name
FROM "SrcTable"

Well look at that. It actually even made it a little simpler.

And it will be much faster.


P.S. Here's another secret from the docs. Want to use sql WITH statements in the same dynamic way? You can do it with "CTE's"


Given that the rows are the result of a query you can try INSERT INTO SELECT, this way the rows are never transferred to the client. And don't forget the autocommit=True:

from sqlalchemy.sql import text

query_text = text(
    "INSERT INTO dest_table (col1, col2) SELECT col3, col4 FROM src_table"
)
with engine.connect().execution_options(autocommit=True) as conn:
    rs = conn.execute(query_text)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜