converting native SQL to SQLAlchemy
I've got the following problen: I know SQL and I don't know how to work with SQLAlchemy but I ne开发者_开发知识库ed to change it in 1 place in the project that I've inherited. So, I've got this:
ModelCategories = request.sa.query(
Model.category_id
, Category.name
, Category.alias).distinct().join(Category).order_by(Category.alias
, Category.name )
And it generates a rather slow request:
SELECT DISTINCT
model.category_id AS model_category_id
, category.name AS category_name
, category.alias AS category_alias
FROM model
JOIN category ON category.id = model.category_id
ORDER BY category.alias, category.name
And I need to change it with this:
SELECT
model.category_id AS model_category_id
, category.name AS category_name
, category.alias AS category_alias
FROM ( SELECT DISTINCT model_category_id ) as model
JOIN category ON category.id = model.category_id
ORDER BY category.alias, category.name
But in terms of SQLAlchemy as is the first request.
First of all check the SQL execution plan. If you have an index
on the model.category_id
column, the query should not really be slow.
Otherwise, following options are available:
Option-1: almost your current solution
# python
ModelCategories = session.query(Category).distinct().join(Model).order_by(Category.alias, Category.name)
# SQL
SELECT DISTINCT category.id AS category_id, category.name AS category_name, category.alias AS category_alias
FROM category
JOIN model ON category.id = model.category_id
ORDER BY category.alias, category.name
This is like your current solution but somewhat cleaner in my view. I assume the performance issue might come from the fact that all table Model
is used in the query, and this is also why you need to use distinct
.
Option-2: use any() on relationship
# python (assumption: model mapping has relationship defined between Category and Model
mapper(Category, category_table, properties={
'models': relationship(Model, backref="category") })
# python
ModelCategories = session.query(Category).filter(Category.models.any()).order_by(Category.alias, Category.name)
# SQL
SELECT category.id AS category_id, category.name AS category_name, category.alias AS category_alias
FROM category
WHERE EXISTS (SELECT 1
FROM model
WHERE category.id = model.category_id)
ORDER BY category.alias, category.name
This should boost your performance already. I prefer this to following option-3 as it is again more clean code
Option-3: use subquery
# python
q = select([Model.category_id]).distinct().alias("subq")
ModelCategories = session.query(Category).join(q, Category.id==q.c.category_id)
# SQL
SELECT category.id AS category_id, category.name AS category_name, category.alias AS category_alias
FROM category
JOIN (SELECT DISTINCT model.category_id AS category_id FROM model) AS subq
ON category.id = subq.category_id
ORDER BY category.alias, category.name
This should give you exactly the SQL you asked for. As mentioned, I personally prefer version-2.
精彩评论