converting native SQL to SQLAlchemy

2023-04-11 02:54 问答作者：

I've got the following problen: I know SQL and I don't know how to work with SQLAlchemy but I ne开发者_开发知识库ed to change it in 1 place in the project that I've inherited. So, I've got this:

ModelCategories = request.sa.query(
    Model.category_id
    , Category.name
    , Category.alias).distinct().join(Category).order_by(Category.alias
    , Category.name )

And it generates a rather slow request:

SELECT DISTINCT 
  model.category_id AS model_category_id
  , category.name AS category_name
  , category.alias AS category_alias 
FROM model 
JOIN category ON category.id = model.category_id 
ORDER BY category.alias, category.name

And I need to change it with this:

SELECT 
  model.category_id AS model_category_id
  , category.name AS category_name
  , category.alias AS category_alias 
FROM ( SELECT DISTINCT model_category_id ) as model 
JOIN category ON category.id = model.category_id 
ORDER BY category.alias, category.name

But in terms of SQLAlchemy as is the first request.

First of all check the SQL execution plan. If you have an index on the model.category_id column, the query should not really be slow. Otherwise, following options are available:

Option-1: almost your current solution

# python
ModelCategories = session.query(Category).distinct().join(Model).order_by(Category.alias, Category.name)

# SQL
SELECT DISTINCT category.id AS category_id, category.name AS category_name, category.alias AS category_alias 
FROM category 
JOIN model ON category.id = model.category_id 
ORDER BY category.alias, category.name

This is like your current solution but somewhat cleaner in my view. I assume the performance issue might come from the fact that all table Model is used in the query, and this is also why you need to use distinct.

Option-2: use any() on relationship

# python (assumption: model mapping has relationship defined between Category and Model
mapper(Category, category_table, properties={
    'models': relationship(Model, backref="category") })

# python
ModelCategories = session.query(Category).filter(Category.models.any()).order_by(Category.alias, Category.name)

# SQL
SELECT category.id AS category_id, category.name AS category_name, category.alias AS category_alias 
FROM category 
WHERE EXISTS (SELECT 1 
    FROM model 
    WHERE category.id = model.category_id)
ORDER BY category.alias, category.name

This should boost your performance already. I prefer this to following option-3 as it is again more clean code

Option-3: use subquery

# python
q = select([Model.category_id]).distinct().alias("subq")
ModelCategories = session.query(Category).join(q, Category.id==q.c.category_id)

# SQL
SELECT category.id AS category_id, category.name AS category_name, category.alias AS category_alias 
FROM category
JOIN (SELECT DISTINCT model.category_id AS category_id FROM model) AS subq 
  ON category.id = subq.category_id
ORDER BY category.alias, category.name

This should give you exactly the SQL you asked for. As mentioned, I personally prefer version-2.

继续阅读：sql

converting native SQL to SQLAlchemy

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？