开发者

Can Django do nested queries and exclusions

I need some help putting together this que开发者_如何学JAVAry in Django. I've simplified the example here to just cut right to the point.

MyModel(models.Model):
    created = models.DateTimeField()
    user = models.ForeignKey(User)
    data = models.BooleanField()

The query I'd like to create in English would sound like:

Give me every record that was created yesterday for which data is False where in that same range data never appears as True for the given user

Here's an example input/output in case that wasn't clear.

Table Values

ID   Created    User    Data

1    1/1/2010   admin   False
2    1/1/2010   joe     True
3    1/1/2010   admin   False
4    1/1/2010   joe     False
5    1/2/2010   joe     False

Output Queryset

1    1/1/2010   admin   False
3    1/1/2010   admin   False

What I'm looking to do is to exclude record #4. The reason for this is because in the given range "yesterday", data appears as True once for the user in record #2, therefore that would exclude record #4.

In a sense, it almost seems like there are 2 queries taking place. One to determine the records in the given range, and one to exclude records which intersect with the "True" records.

How can I do this query with the Django ORM?


You don't need a nested query. You can generate a list of bad users' PKs and then exclude records containing those PKs in the next query.

bad = list(set(MyModel.obejcts.filter(data=True).values_list('user', flat=True)))
# list(set(list_object)) will remove duplicates
# not needed but might save the DB some work

rs = MyModel.objects.filter(datequery).exclude(user__pk__in=bad)
# might not need the pk in user__pk__in - try it

You could condense that down into one line but I think that's as neat as you'll get. 2 queries isn't so bad.

Edit: You might wan to read the docs on this:

http://docs.djangoproject.com/en/dev/ref/models/querysets/#in

It makes it sound like it auto-nests the query (so only one query fires in the database) if it's like this:

bad = MyModel.objects.filter(data=True).values('pk')
rs  = MyModel.objects.filter(datequery).exclude(user__pk__in=bad)

But MySQL doesn't optimise this well so my code above (2 full queries) can actually end up running a lot faster.

Try both and race them!


looks like you could use: from django.db.models import F MyModel.objects.filter(datequery).filter(data=False).filter(data = F('data'))

F object available from version 1.0

Please, test it, I'm not sure.


Thanks to lazy evaluation, you can break your query up into a few different variables to make it easier to read. Here is some ./manage.py shell play time in the style that Oli already presented.

> from django.db import connection
> connection.queries = []
> target_day_qs = MyModel.objects.filter(created='2010-1-1')
> bad_users = target_day_qs.filter(data=True).values('user')
> result = target_day_qs.exclude(user__in=bad_users)
> [r.id for r in result]
[1, 3]
> len(connection.queries)
1

You could also say result.select_related() if you wanted to pull in the user objects in the same query.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜