开发者

Django optimize queries with for loops

Is it possible to optimize Django to not run so many queries on something like this

for student in Student.objects.all():
    for course in student.course_set.all():
        for grade in course.grade_set.filter(student=student):
            # do stuff

The amount of queries is students * courses * grades which can get huge.

*edit One possibility after getting some ideas from roseman's blog.

for grade in student.grade_set.order_by('course', 'marking_period').select_related():
    if grade.marking_period_id in some_report_input:
        # do stuff

That's just a snippet but basically I replaced the for loops with just one for loops for the last item I care about (grades) Grades has references to everything I need (student, course, marking period). It was key to use things like marking_period_id instead of grade.marking_period (which does another query).

The trade off is code readability. I wanted to filter out grades and organize based on a criteria. This goes from trivial to convoluted.

This is by no means a generic solution. I'm sure there are times when this won't help at all. Please comment if you know a better way.

Another example:

for student in students:
  print student
  for department in departments:
    print department
    failed_grades = Grade.objects.filter(course__department=department,course__courseenrollment__user=student,grade__lte=70)
      for failed_grade i开发者_Python百科n failed_grades:
        print grade.course
        print grade.grade

A student gets enrolled in a course. A course has a department.


It would be helpful if you post your models code and the "do stuff" code you ommit. This way, we could understand how to make an efficient query in your case.

Nevertheless, i think this could be helpful. It covers some cases that select_related doesn't. Note that prefetch_related is available since Django 1.4 so you may need to upgrade to this version in order to use it.

It's really important for us to help you that you add your "do stuff" code here, and if it is relevant (and i think it will be) add your models code here (just the fields declarations will be fine). Because the way of getting an optimized query depends on how your models are related and how you are using the queryset results.

EDIT:

in order to optimize the last "for" of your last example, you can do this:

failed_grades = Grade.objects.filter(course__department=department,course__courseenrollment__user=student,grade__lte=70).select_related('course')
for failed_grade in failed_grades:
    print grade.course
    print grade.grade

In this example, when you do grade.course, the select_related part of that query caches all the courses related to the filtered grades, so you can use them making just one query. So, if the __unicode__ method of Course model use only its own fields (i mean, if you don't show any other model data in Course's unicode method) you should get a better performance (less queries) that in your example. I'm not sure how to improve the other for statements as you want. But i think this can help you to get what you want (maybe i'm not understanding your models too much to help you better)


you can use select_related() and it will only hit the database once.

More info in this link (django's documentation) http://docs.djangoproject.com/en/1.2/ref/models/querysets/#select-related

An example of how you could use this in your case

for x in Student.objects.select_related():
    do stuff with x.course.grade`
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜