
Improving Django performance with 350000+ records and a complex query

I have a model like this:

class Stock(models.Model):
    product = models.ForeignKey(Product)
    place = models.ForeignKey(Place)
    date = models.DateField()
    quantity = models.IntegerField()

I need to get the latest (by date) quantity for every product at every place. There are almost 500 products, 100 places and 350000 stock records in the database.

My current code is below. It worked in testing, but with the real data it takes so long that it's useless:

    stocks = Stock.objects.filter(product__in=self.products,
                                  place__in=self.places, date__lt=date_at)
    stock_values = {}
    for prod in self.products:
        for place in self.places:
            key = u'%s%s' % (prod.id, place.id)
            stock = stocks.filter(product=prod, place=place, date=date_at)
            if len(stock) > 0:
                stock_values[key] = stock[0].quantity
            else:
                try:
                    stock = stocks.filter(product=prod, place=place).order_by('-date')[0]
                except IndexError:
                    stock_values[key] = 0
                else:
                    stock_values[key] = stock.quantity
    return stock_values

How would you make it faster?

Edit: I rewrote the code like this:

    stock_values = {}
    for product in self.products:
        for place in self.places:
            try:
                stock_value = Stock.objects.filter(product=product, place=place, date__lte=date_at)\
                                           .order_by('-date').values('quantity')[0]['quantity']
            except IndexError:
                stock_value = 0
            stock_values[u'%s%s' % (product.id, place.id)] = stock_value
    return stock_values

It works better (from 256 seconds down to 64) but I still need to improve it. Maybe some custom SQL, I don't know...
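Something along the lines of the raw query below is what I had in mind, although it's only an untested sketch: I'm guessing at the default table name app_stock, and I'd still have to map the rows back into the stock_values dict:

    # Untested sketch: for each (product, place), join back to the row whose
    # date is the latest one at or before date_at. "app_stock" is an assumed
    # table name (Django's default <app_label>_stock).
    latest = Stock.objects.raw("""
        SELECT s.id, s.product_id, s.place_id, s.quantity
        FROM app_stock s
        JOIN (
            SELECT product_id, place_id, MAX(date) AS max_date
            FROM app_stock
            WHERE date <= %s
            GROUP BY product_id, place_id
        ) m ON m.product_id = s.product_id
           AND m.place_id = s.place_id
           AND m.max_date = s.date
    """, [date_at])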


Arthur's right: len(stock) isn't the most efficient way to do that. You could go further along the "easier to ask for forgiveness than permission" route with something like this inside the inner loop:

key = u'%s%s' % (prod.id, place.id)
try:
    # Quantity recorded exactly on date_at, if there is one.
    stock = stocks.filter(product=prod, place=place, date=date_at)[0]
    quantity = stock.quantity
except IndexError:
    try:
        # Otherwise fall back to the most recent earlier record.
        stock = stocks.filter(product=prod, place=place).order_by('-date')[0]
        quantity = stock.quantity
    except IndexError:
        quantity = 0
stock_values[key] = quantity

I'm not sure how much that would improve it compared to just changing the length check, though I think this should at least restrict it to two queries with LIMIT 1 on them (see Limiting QuerySets).

Mind you, this still performs a lot of database queries, since that loop can run almost 50000 times (500 products × 100 places). Optimize how you're looping and you're in an even better position.
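To make that concrete, here's a rough sketch of the single-query idea, assuming the Stock model from the question (latest_quantities is just a hypothetical helper name). One query pulls every candidate row, and the latest-per-(product, place) grouping happens in Python:

    def latest_quantities(products, places, date_at):
        # One query for all candidate rows, newest first.
        rows = (Stock.objects
                .filter(product__in=products, place__in=places, date__lte=date_at)
                .order_by('-date')
                .values_list('product_id', 'place_id', 'quantity'))

        stock_values = {}
        seen = set()
        for product_id, place_id, quantity in rows:
            pair = (product_id, place_id)
            if pair not in seen:  # first row seen per pair is the latest one
                seen.add(pair)
                stock_values[u'%s%s' % (product_id, place_id)] = quantity

        # Pairs with no stock rows at all default to 0, like the original code.
        for product in products:
            for place in places:
                stock_values.setdefault(u'%s%s' % (product.id, place.id), 0)
        return stock_values

Whether this beats the per-pair queries depends on how many rows survive the date filter; with 350000 records you may need to chunk by place or product if memory becomes a concern.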


Maybe the trick is in that len() call!

From the Django docs:

Note: Don't use len() on QuerySets if all you want to do is determine the number of records in the set. It's much more efficient to handle a count at the database level, using SQL's SELECT COUNT(*), and Django provides a count() method for precisely this reason. See count() below.

So try changing len() to count() and see if it makes it faster!
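For example, the length check in the question's loop could become something like this (just to show the shape of the change; exists() would also work if you only need to know whether any row matches):

    stock = stocks.filter(product=prod, place=place, date=date_at)
    if stock.count() > 0:  # SELECT COUNT(*) in the database instead of fetching every row
        stock_values[key] = stock[0].quantity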

