
Is there a way to make a SELECT faster by marking it read-only in Django's ORM?

If I am just getting a list of users, and I have no intention of updating this list, do I have the option of marking the query as 'read only' somehow?

Reason being, I know most ORM's keep some sort of change tracking on the rows returned. So if I know before hand that I don't need to update anything, was curious if I could tell the ORM to mark the result set as read-only.


If I am just getting a list of users, and I have no intention of updating this list, do I have the option of marking the query as 'read only' somehow?

AFAIK there is no way to do this. I'd like to know if anyone thinks or knows otherwise.

Reason being, I know most ORM's keep some sort of change tracking on the rows returned. So if I know before hand that I don't need to update anything, was curious if I could tell the ORM to mark the result set as read-only.

May I ask the reason for this requirement? It feels like premature optimization to me. If you do some profiling and find the performance of a particular query is poor and think that it could be improved only by making the queryset read-only, then this question comes into play. Unlikely, IMHO.


You can just hold onto the query set. Once it has been evaluated, it does not have to run the query again. You can even attach it to the request.

Example:

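# in the view, a decorator, or middleware
request._my_users = Users.objects.all()
# len() evaluates the query set and fills its result cache.
# A plain [:] slice of an unevaluated query set returns a new, still unevaluated query set.
len(request._my_users)

# Later, reference request._my_users; iterating it reuses the cached rows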


The ORM does not need to track the rows it fetches. Instead, when you call save(), it uses the primary key to decide whether to insert or update (unless you pass force_insert or force_update to save()).

You can read about this here: http://docs.djangoproject.com/en/1.2/ref/models/instances/#how-django-knows-to-update-vs-insert
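To make the update-vs-insert decision concrete, here is a rough sketch of the logic the linked page describes, written with plain sqlite3 rather than Django. The save() helper, the users table, and the dict-based row are all made up for illustration; this is not Django's actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


def save(row):
    """Hypothetical helper: persist a dict with 'id' and 'name' keys."""
    pk = row.get("id")
    if pk:
        # Primary key is set: check whether a row with that key already exists.
        exists = conn.execute(
            "SELECT 1 FROM users WHERE id = ?", (pk,)
        ).fetchone()
        if exists:
            conn.execute(
                "UPDATE users SET name = ? WHERE id = ?", (row["name"], pk)
            )
            return "update"
        # Key is set but no such row exists: insert with the given key.
        conn.execute(
            "INSERT INTO users (id, name) VALUES (?, ?)", (pk, row["name"])
        )
        return "insert"
    # No primary key yet: insert and let the database assign one.
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (row["name"],))
    row["id"] = cur.lastrowid
    return "insert"
```

Note that nothing here remembers which rows were fetched earlier. The decision is made from the primary key at save time, which is why there is no change tracking to switch off.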

That said, since nothing is tracked, marking a model or query set "read only" is neither necessary nor possible, and it would not yield any performance improvement.

If you want to optimize, there are some steps you can try. They only give small improvements, though, so you probably should not optimize until it is really necessary.

For example, call querySet.exists() or querySet.count() instead of bool(querySet) or len(querySet) if (and only if) you are not reading from the query set afterwards. Otherwise, don't use exists()/count(), since each produces an additional query, whereas in the latter case the subsequent read of the query set is free because the results are already cached.
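The trade-off can be seen with a toy model of a lazily evaluated, cached query set. ToyQuerySet and check_then_read are invented for this illustration and only imitate the caching behaviour described above; they are not Django code, and the "queries" are just a counter.

```python
class ToyQuerySet:
    """Toy stand-in for a lazily evaluated query set (not Django code)."""

    def __init__(self, rows):
        self._rows = rows            # pretend these live in the database
        self._result_cache = None    # filled on first full evaluation
        self.queries = 0             # number of simulated database hits

    def _fetch_all(self):
        if self._result_cache is None:
            self.queries += 1        # simulated full SELECT
            self._result_cache = list(self._rows)
        return self._result_cache

    def __iter__(self):
        return iter(self._fetch_all())

    def __len__(self):
        return len(self._fetch_all())

    def __bool__(self):
        return bool(self._fetch_all())

    def exists(self):
        if self._result_cache is not None:
            return bool(self._result_cache)
        self.queries += 1            # simulated cheap existence check
        return bool(self._rows)

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)
        self.queries += 1            # simulated COUNT query
        return len(self._rows)


def check_then_read(qs, use_exists):
    """Check for rows, then read them; return the rows and the query count."""
    found = qs.exists() if use_exists else bool(qs)
    names = list(qs) if found else []
    return names, qs.queries
```

With use_exists=True the check and the read each cost one simulated query (two in total). With use_exists=False the bool() check fills the cache and the read is free, so only one query is counted. If the rows are never read, exists() is the cheaper choice because it avoids fetching them at all.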

Another measure is to use only() and defer() to restrict the SELECT to the fields that you actually need, and select_related() to fetch foreign key relations in the same query if you know you will need them. If you have larger models with many relations and columns, this can give you a significant performance boost.
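As a rough picture of why this helps, the sketch below uses plain sqlite3 (not Django) to compare following a foreign key row by row with fetching everything in one narrow JOIN, which is the kind of SQL these methods aim for. The book and author tables, the run() helper, and both functions are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, bio TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann', 'long text'), (2, 'Bob', 'long text');
    INSERT INTO book VALUES (1, 'A', 1), (2, 'B', 2), (3, 'C', 1);
""")

executed = []  # every SQL string sent through run()


def run(sql, params=()):
    """Hypothetical helper: execute a statement and record that it ran."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_with_authors_lazy():
    # One query for the books, then one more per book for its author.
    result = []
    for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
        (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
        result.append((title, name))
    return result


def titles_with_authors_joined():
    # A single query: only the needed columns, related rows joined in.
    return run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )
```

For the three books above, the lazy version issues four statements and the joined version issues one, while both return the same (title, name) pairs. The large bio column is never transferred in either case because it is not in the column list.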
