Most efficient Django query to get the objects and their related
I have this model:
from django.db import models

class Institution(models.Model):
    name = models.CharField(max_length=128, db_index=True)
    aliases = models.ManyToManyField('self', blank=True)
I would like to write the most efficient query that returns every Institution whose name contains the search term AND the aliases of those Institution objects. I came up with the solution below, which works, but I was wondering whether there is a simpler or more efficient way to achieve this.
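from django.db.models import Q

# Institutions whose own name contains the term
base_query = Institution.objects.filter(name__icontains='term')
pk_query = Q(pk__in=base_query)
# Institutions that have one of those matches as an alias
aliases_query = Q(aliases__in=base_query)
final_query = Institution.objects.filter(pk_query | aliases_query).distinct()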
Here is the SQL of this query:
SELECT DISTINCT `app_institution`.`id`, `app_institution`.`name`
FROM `app_institution` LEFT OUTER JOIN `app_institution_aliases`
ON (`app_institution`.`id` = `app_institution_aliases`.`from_institution_id`)
WHERE (`app_institution`.`id`
IN (SELECT U0.`id` FROM `app_institution` U0 WHERE U0.`name` LIKE %term% )
OR `app_institution_aliases`.`to_institution_id`
IN (SELECT U0.`id` FROM `app_institution` U0 WHERE U0.`name` LIKE %term% ))
ORDER BY `app_institution`.`name` ASC LIMIT 21
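To experiment with this SQL without a Django project, here is a minimal sketch that runs an equivalent statement against an in-memory SQLite database. The table and column names are taken from the SQL above; the sample rows, the `build_db` and `search` helpers, and the assumption that each alias pair is stored in both directions (which is what a symmetrical self-referencing ManyToManyField does) are mine. SQLite's `LIKE` is case-insensitive for ASCII by default, so it roughly mimics `icontains`.

```python
import sqlite3


def build_db():
    """Create an in-memory database shaped like the tables in the SQL above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE app_institution (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE app_institution_aliases (
            id INTEGER PRIMARY KEY,
            from_institution_id INTEGER,
            to_institution_id INTEGER
        );
        """
    )
    # Hypothetical sample data, for illustration only.
    conn.executemany(
        "INSERT INTO app_institution (id, name) VALUES (?, ?)",
        [
            (1, "MIT"),
            (2, "Massachusetts Institute of Technology"),
            (3, "Harvard"),
            (4, "Caltech"),
        ],
    )
    # Assumption: the alias relation is stored in both directions.
    conn.executemany(
        "INSERT INTO app_institution_aliases "
        "(from_institution_id, to_institution_id) VALUES (?, ?)",
        [(1, 2), (2, 1)],
    )
    return conn


SEARCH_SQL = """
SELECT DISTINCT i.id, i.name
FROM app_institution i
LEFT OUTER JOIN app_institution_aliases a
    ON i.id = a.from_institution_id
WHERE i.id IN (SELECT u.id FROM app_institution u WHERE u.name LIKE :pattern)
   OR a.to_institution_id IN
      (SELECT u.id FROM app_institution u WHERE u.name LIKE :pattern)
ORDER BY i.name ASC
"""


def search(conn, term):
    """Return (id, name) rows for name matches plus their aliases."""
    return conn.execute(SEARCH_SQL, {"pattern": f"%{term}%"}).fetchall()
```

Calling `search(build_db(), "massachusetts")` should return both the institution whose name matches and its alias "MIT", even though "MIT" does not contain the term.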
UPDATE
Looking at the first two answers I got, I think I should specify more clearly what I want as results.
I want the UNION of
- the results of the base_query (Institution where name contains the search term)
WITH
- the aliases of each Institution returned by the base_query (these aliases' names don't need to contain the search term).
Done in an inefficient (but easily understandable) way, it would look like this:
base_query = Institution.objects.filter(name__icontains='term')
results = set(base_query)
for institution in base_query:
    results.update(institution.aliases.all())
2nd UPDATE
Thinking about S.Lott's answer, I finally figured out a way to do it with two queries whose results I combine afterwards.
base_query = Institution.objects.filter(name__icontains='term')
results = set(base_query)
aliases_query = Institution.objects.filter(aliases__in=base_query)
results.update(aliases_query)
I ran some small benchmarks, and this solution takes around half the time of the one big query.
But something I forgot to take into account is the impact on the ordering...
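One possible way to get a deterministic order back after merging into a set is to sort in Python. This is only a sketch: `Inst` is a hypothetical stand-in for Institution instances (real model instances expose `.name` the same way), and `ordered_union` is a helper name I made up.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class Inst:
    """Hypothetical stand-in for an Institution model instance."""
    pk: int
    name: str


def ordered_union(*collections):
    """Merge several iterables into one duplicate-free list sorted by name."""
    merged = set()
    for items in collections:
        merged.update(items)
    # A set has no defined order, so the sort has to happen after the merge.
    return sorted(merged, key=attrgetter("name"))


base = [Inst(2, "Beta College"), Inst(1, "Alpha University")]
aliases = [Inst(3, "AU"), Inst(1, "Alpha University")]
names = [inst.name for inst in ordered_union(base, aliases)]
```

Sorting in Python means every row is loaded first, so this probably only suits result sets small enough to hold in memory; it does not combine with a database-side LIMIT.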
Union queries -- like this -- are difficult.
It's important to review the use cases to be sure you really need to conflate two separate collections (by name and by alias) like this. Sometimes the web page can be presented with two collections, removing the need for a union.
Using Q objects to build "or" conditions is one way to create a union.
Assembling a separate collection from the two queries is another solution.
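# Institutions whose own name contains the term
name_query = Institution.objects.filter(name__icontains='term')
# Institutions that have an alias whose name contains the term
aliases_query = Institution.objects.filter(aliases__name__icontains='term')
# Plain concatenation: an institution matched by both queries appears twice
final_query = list(name_query) + list(aliases_query)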
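If the same institution can come back from both queries, the concatenated list will contain it more than once. A minimal sketch of removing duplicates while keeping first-seen order, assuming each object has a `pk` attribute as Django model instances do; `Row` and `concat_unique` are hypothetical names used only for illustration:

```python
from collections import namedtuple

# Hypothetical stand-in for a model instance; only `pk` matters here.
Row = namedtuple("Row", ["pk", "name"])


def concat_unique(*collections):
    """Concatenate iterables, keeping only the first object seen per pk."""
    seen = set()
    result = []
    for items in collections:
        for obj in items:
            if obj.pk not in seen:
                seen.add(obj.pk)
                result.append(obj)
    return result


by_name = [Row(1, "Alpha University"), Row(2, "Beta College")]
by_alias = [Row(2, "Beta College"), Row(3, "AU"), Row(3, "AU")]
combined = concat_unique(by_name, by_alias)
```

With real querysets the call would look like `concat_unique(name_query, aliases_query)`; iterating each queryset evaluates it once.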
The only way to know which is better is to benchmark the alternatives. The "complexity" of the SQL query text doesn't really mean much, because there are so many optimization steps inside an RDBMS.
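A rough sketch of such a benchmark using the standard library's `timeit` module. The two candidate functions here are hypothetical placeholders, not real queries; with Django you would pass callables that build and fully evaluate a fresh queryset on each call (for example `lambda: list(Institution.objects.filter(...))`), because querysets are lazy and cache their results once evaluated.

```python
import timeit


def best_time(func, repeat=5, number=100):
    """Return the best observed per-call time in seconds."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    # The minimum is usually the least noisy estimate.
    return min(timings) / number


# Hypothetical placeholders standing in for the two query strategies.
def one_big_query():
    return sorted(range(1000))


def two_small_queries():
    return sorted(range(500)) + sorted(range(500, 1000))


def compare():
    """Map each candidate's name to its best per-call time."""
    candidates = (one_big_query, two_small_queries)
    return {func.__name__: best_time(func) for func in candidates}
```

Numbers measured this way depend heavily on data volume, indexes, and the database's cache state, so it is worth running the comparison against realistic data.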
this:
Institution.objects.filter(name__icontains='term', aliases__name__icontains='term')