Django optimize queries with for loops
Is it possible to optimize Django so it doesn't run so many queries on something like this?
for student in Student.objects.all():
    for course in student.course_set.all():
        for grade in course.grade_set.filter(student=student):
            pass  # do stuff
The number of queries grows roughly as students * courses (one grades query per student/course pair), which can get huge.
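To make the cost concrete, here is a minimal sketch that mimics the nested loops with the standard library's sqlite3 module instead of Django and counts the statements issued. The tables, columns, sample data, and the `run` helper are all made up for illustration; the real models may look different.

```python
import sqlite3

# Hypothetical schema standing in for the Student/Course/Grade models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_id INTEGER);
    CREATE TABLE grade (id INTEGER PRIMARY KEY, student_id INTEGER,
                        course_id INTEGER, score INTEGER);
""")
for s in range(1, 4):              # 3 students
    conn.execute("INSERT INTO student VALUES (?, ?)", (s, "student%d" % s))
    for c in range(1, 3):          # each student takes 2 courses
        conn.execute("INSERT INTO enrollment VALUES (?, ?)", (s, c))
        conn.execute(
            "INSERT INTO grade (student_id, course_id, score) VALUES (?, ?, ?)",
            (s, c, 60 + 5 * s + c),
        )

query_count = 0

def run(sql, params=()):
    """Execute one SELECT and count it, like one ORM round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

# Same shape as the nested for loops in the question.
for (student_id,) in run("SELECT id FROM student"):
    for (course_id,) in run(
            "SELECT course_id FROM enrollment WHERE student_id = ?",
            (student_id,)):
        run("SELECT score FROM grade WHERE student_id = ? AND course_id = ?",
            (student_id, course_id))

print(query_count)  # 1 + 3 + 3 * 2 = 10
```

In this sketch the count is 1 + students + students * courses, so it is the number of student/course pairs, not the number of grade rows, that drives the total.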
Edit: One possibility, after getting some ideas from Roseman's blog:
for grade in student.grade_set.order_by('course', 'marking_period').select_related():
    if grade.marking_period_id in some_report_input:
        pass  # do stuff
That's just a snippet, but basically I replaced the nested for loops with a single for loop over the last item I care about (grades). Grade has references to everything I need (student, course, marking period). The key was to use things like grade.marking_period_id instead of grade.marking_period (which does another query).
The trade-off is code readability. I wanted to filter grades and organize them based on some criteria, and this goes from trivial to convoluted.
This is by no means a generic solution. I'm sure there are times when this won't help at all. Please comment if you know a better way.
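One way to keep the single-loop version readable is to group the flat, ordered result in Python. This is only a sketch: `GradeRow` and the sample data are stand-ins for Grade instances coming from a query ordered by course and marking period, and `some_report_input` is assumed to be a set of marking period ids.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical stand-in for Grade objects; only the *_id fields are read,
# which on a real model would avoid extra queries.
GradeRow = namedtuple("GradeRow", "course_id marking_period_id grade")

# Already sorted by (course_id, marking_period_id), as order_by would return it.
grades = [
    GradeRow(1, 1, 88),
    GradeRow(1, 2, 92),
    GradeRow(2, 1, 75),
    GradeRow(2, 3, 81),
]
some_report_input = {1, 2}  # marking periods wanted in the report

report = {}
# groupby only merges adjacent rows, so the input must be sorted by the key.
for course_id, rows in groupby(grades, key=attrgetter("course_id")):
    report[course_id] = [
        row.grade for row in rows
        if row.marking_period_id in some_report_input
    ]

print(report)  # {1: [88, 92], 2: [75]}
```

The filtering and grouping stay in one place, and the database is still only asked once for the grades.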
Another example:
for student in students:
    print(student)
    for department in departments:
        print(department)
        failed_grades = Grade.objects.filter(
            course__department=department,
            course__courseenrollment__user=student,
            grade__lte=70,
        )
        for failed_grade in failed_grades:
            print(failed_grade.course)
            print(failed_grade.grade)
A student gets enrolled in a course. A course has a department.
It would be helpful if you posted your models code and the "do stuff" code you omitted. That way, we could understand how to build an efficient query for your case.
Nevertheless, I think prefetch_related could be helpful. It covers some cases that select_related doesn't. Note that prefetch_related has been available since Django 1.4, so you may need to upgrade to that version in order to use it.
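Roughly speaking, prefetch_related runs one extra query per relationship and joins the results in Python. The sketch below imitates that idea with sqlite3 rather than Django; the schema, the data, and the `run` helper are assumptions for illustration only.

```python
import sqlite3
from collections import defaultdict

# Hypothetical tables standing in for the Student and Grade models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grade (id INTEGER PRIMARY KEY, student_id INTEGER,
                        score INTEGER);
    INSERT INTO student VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO grade (student_id, score)
        VALUES (1, 90), (1, 65), (2, 70), (3, 55);
""")

query_count = 0

def run(sql, params=()):
    """Execute one SELECT and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

# Query 1: the parent rows.
students = run("SELECT id, name FROM student ORDER BY id")

# Query 2: every related row at once, restricted with IN (...).
ids = [student_id for student_id, _ in students]
placeholders = ", ".join("?" for _ in ids)
rows = run(
    "SELECT student_id, score FROM grade "
    "WHERE student_id IN (%s) ORDER BY id" % placeholders,
    ids,
)

# The "join" happens in Python instead of in the database.
grades_by_student = defaultdict(list)
for student_id, score in rows:
    grades_by_student[student_id].append(score)

for student_id, name in students:
    print(name, grades_by_student[student_id])

print(query_count)  # 2, no matter how many students there are
```

The number of queries stays at two as the student table grows, which is the property that makes this approach useful for reverse and many-to-many relationships.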
To help you, we really need you to add your "do stuff" code and, if it is relevant (and I think it will be), your models code (just the field declarations will be fine), because the way to get an optimized query depends on how your models are related and how you use the queryset results.
EDIT:
In order to optimize the last "for" of your last example, you can do this:
failed_grades = Grade.objects.filter(
    course__department=department,
    course__courseenrollment__user=student,
    grade__lte=70,
).select_related('course')
for failed_grade in failed_grades:
    print(failed_grade.course)
    print(failed_grade.grade)
In this example, when you access failed_grade.course, the select_related part of the query has already fetched the courses related to the filtered grades, so you can use them with just one query. So, if the __unicode__ method of the Course model uses only its own fields (I mean, if you don't show any other model's data in Course's __unicode__ method), you should get better performance (fewer queries) than in your example. I'm not sure how to improve the other for statements the way you want, but I think this can help you get there (maybe I don't understand your models well enough to help you better).
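In SQL terms, select_related('course') turns the lookup into a JOIN, so each grade arrives with its course already attached. Here is a rough sqlite3 imitation, with a made-up schema and data (the `score` column and both helper functions are hypothetical), comparing that with looking up the course once per grade:

```python
import sqlite3

# Hypothetical tables standing in for the Course and Grade models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grade (id INTEGER PRIMARY KEY, course_id INTEGER,
                        score INTEGER);
    INSERT INTO course VALUES (1, 'math'), (2, 'art');
    INSERT INTO grade (course_id, score) VALUES (1, 60), (2, 45), (1, 70);
""")

def lazy_lookup():
    """One query for the grades, then one more per grade for its course."""
    queries = 1
    result = []
    grades = conn.execute(
        "SELECT course_id, score FROM grade "
        "WHERE score <= 70 ORDER BY id").fetchall()
    for course_id, score in grades:
        queries += 1
        (name,) = conn.execute(
            "SELECT name FROM course WHERE id = ?", (course_id,)).fetchone()
        result.append((name, score))
    return result, queries

def joined_lookup():
    """A single JOINed query, similar in spirit to select_related('course')."""
    result = conn.execute(
        "SELECT course.name, grade.score FROM grade "
        "JOIN course ON course.id = grade.course_id "
        "WHERE grade.score <= 70 ORDER BY grade.id").fetchall()
    return result, 1

lazy_rows, lazy_queries = lazy_lookup()
joined_rows, joined_queries = joined_lookup()
print(lazy_rows == joined_rows)      # True: same data either way
print(lazy_queries, joined_queries)  # 4 1
```

Both versions return the same rows; only the number of round trips differs.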
You can use select_related() so that related objects are fetched in the same query and the loop only hits the database once.
More info in Django's documentation: http://docs.djangoproject.com/en/1.2/ref/models/querysets/#select-related
An example of how you could use this in your case:
for x in Student.objects.select_related():
    pass  # do stuff with x.course.grade