Most efficient Django query to get the objects and their related

I have this model:

from django.db import models

class Institution(models.Model):
    name = models.CharField(max_length=128, db_index=True)
    aliases = models.ManyToManyField('self', blank=True)

I would like the most efficient query that returns every Institution whose name contains the search term AND the Institutions that are its aliases. I came up with the solution below, which works, but I was wondering whether there is a simpler or more efficient way to achieve this.

from django.db.models import Q

base_query = Institution.objects.filter(name__icontains='term')
pk_query = Q(pk__in=base_query)
aliases_query = Q(aliases__in=base_query)
final_query = Institution.objects.filter(pk_query|aliases_query).distinct()

Here is the SQL of this query:

SELECT DISTINCT `app_institution`.`id`, `app_institution`.`name`
FROM `app_institution` LEFT OUTER JOIN `app_institution_aliases`
ON (`app_institution`.`id` = `app_institution_aliases`.`from_institution_id`)
WHERE (`app_institution`.`id`
IN (SELECT U0.`id` FROM `app_institution` U0 WHERE U0.`name` LIKE %term% )
OR `app_institution_aliases`.`to_institution_id`
IN (SELECT U0.`id` FROM `app_institution` U0 WHERE U0.`name` LIKE %term% ))
ORDER BY `app_institution`.`name` ASC LIMIT 21

UPDATE

Looking at the first two answers I got, I think I should specify more clearly what results I want.

I want the UNION of

  • the results of base_query (Institutions whose name contains the search term)

WITH

  • the aliases of each Institution returned by base_query (these aliases' names don't need to contain the search term).

Done in an inefficient (but easy to understand) way, it would look like this:

base_query = Institution.objects.filter(name__icontains='term')
results = set(base_query)
for institution in base_query:
    results.update(institution.aliases.all())

2nd UPDATE

Thinking about S.Lott's answer, I finally figured out a way to do it with two queries whose results I combine afterwards.

base_query = Institution.objects.filter(name__icontains='term')
results = set(base_query)
aliases_query = Institution.objects.filter(aliases__in=base_query)
results.update(aliases_query)

I ran some small benchmarks, and this solution takes around half the time of the single big query.

But one thing I forgot to take into account is the impact on ordering...
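One way to get an ordering back after combining the two result sets is to sort in Python. The sketch below is only an illustration: `Institution` is a hypothetical namedtuple stand-in rather than the Django model, and `merge_and_sort` is a made-up helper name. It assumes the combined result is small enough to sort in memory.

```python
from collections import namedtuple

# Hypothetical stand-in for Institution model instances (not the real Django model).
Institution = namedtuple("Institution", ["pk", "name"])


def merge_and_sort(base_results, alias_results):
    """Union two iterables of institutions and return them ordered by name."""
    merged = set(base_results)
    merged.update(alias_results)
    # Sort case-insensitively by name; pk breaks ties so the order is deterministic.
    return sorted(merged, key=lambda inst: (inst.name.lower(), inst.pk))


base = [Institution(2, "MIT"), Institution(1, "Caltech")]
aliases = [
    Institution(3, "Massachusetts Institute of Technology"),
    Institution(2, "MIT"),
]
ordered = merge_and_sort(base, aliases)
print([inst.name for inst in ordered])
```

With real model instances the same `set` plus `sorted` approach should work, since Django model instances of the same model compare equal when their primary keys match. The trade-off is that sorting and any later slicing happen in Python instead of in the database.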


Union queries -- like this -- are difficult.

It's important to review the use cases to be sure you really need to conflate two separate collections (by name and by alias) like this. Sometimes the web page can be presented with two collections, removing the need for a union.

Using Q objects to build "or" conditions is one way to create a union.

Assembling a separate collection from the two queries is another solution.

name_query = Institution.objects.filter(name__icontains='term')
aliases_query = Institution.objects.filter(aliases__name__icontains='term')
final_query = list(name_query) + list(aliases_query)
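Note that plain list concatenation can contain the same institution more than once, for example when an institution matches by name and also has a matching alias. A minimal sketch of concatenating while dropping repeats is shown here; `Institution` is a hypothetical namedtuple stand-in for the model and `concat_unique` is an illustrative helper, not part of Django.

```python
from collections import namedtuple

# Hypothetical stand-in for Institution model instances.
Institution = namedtuple("Institution", ["pk", "name"])


def concat_unique(*result_lists):
    """Concatenate result lists, keeping the first occurrence of each pk."""
    seen = {}
    for results in result_lists:
        for inst in results:
            # setdefault leaves an existing entry untouched, so the first one wins.
            seen.setdefault(inst.pk, inst)
    # Dicts preserve insertion order, so the original ordering is kept.
    return list(seen.values())


name_results = [Institution(1, "Term College"), Institution(2, "Term University")]
alias_results = [Institution(2, "Term University"), Institution(3, "TU")]
combined = concat_unique(name_results, alias_results)
print([inst.pk for inst in combined])
```

Results from the first list keep their position, and later lists only contribute institutions that have not been seen yet.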

The only way to know which is better is to benchmark the alternatives. The "complexity" of the SQL query text doesn't really mean much, because there are so many optimization steps inside an RDBMS.
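A rough way to compare the alternatives is the standard library's `timeit` module. In the sketch below, `single_query` and `two_queries` are hypothetical placeholders doing unrelated in-memory work; in a real project each would run one of the query strategies against a realistic database.

```python
import timeit


def best_time(fn, repeat=5, number=20):
    """Return the best average time per call, in seconds."""
    timings = timeit.repeat(fn, repeat=repeat, number=number)
    return min(timings) / number


# Hypothetical stand-ins for the two strategies being compared.
def single_query():
    return sorted(range(2000))


def two_queries():
    return sorted(set(range(1000)) | set(range(500, 2000)))


timings = {
    "single_query": best_time(single_query),
    "two_queries": best_time(two_queries),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds * 1e6:.1f} us per call")
```

When timing Django querysets, each callable should force evaluation (for example with `list(...)`) and build a fresh queryset on every call, because querysets are lazy and cache their results after the first evaluation.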


this:

Institution.objects.filter(name__icontains='term', aliases__name__icontains='term')