Retrieving newest object for persons

So, I have a Person model and a model named Carusage. The relevant part of Carusage is this:

class Carusage(models.Model):
    person = models.ForeignKey(Person)
    start = models.DateTimeField()
    end = models.DateTimeField(null=True, blank=True)

A Person can take a car and then the system creates a new Carusage instance and saves it with start as the current time. Then when the Person returns the car, the current time is saved to end.

Now, in my code I have a list of Person models and I want to retrieve the newest date in Carusage for each Person. So if a Person has just returned the car I'd want the end-field of the newest Carusage linked to that person and if the Person still has the car I'd want the start-field.

Preferably I would like to do this in one SQL statement, as my Person list can grow quite large (lower bound ~10, upper bound ~10,000). I tried something like this:

Carusage.objects.filter(person__in=person_list).exclude(start__gte=time_now)

I was then thinking of annotating, but couldn't work out how to proceed.

So currently I am doing this:

time_now = datetime.datetime.now()
time_list = []
for p in person_list:
    latest = Carusage.objects.filter(person=p).exclude(start__gte=time_now).only('start', 'end').latest('start')
    try:
        if latest.end<time_now:
            time=latest.end
        else:
            raise
    except:
        time=latest.start
    time_list.append(time)

Obviously my code runs way too slow (about 5 seconds for a list of 500 persons). What would be the "Django way" of running these queries? There are two things I'd like to achieve: hit the database only once for Carusage (or at least not len(person_list) times), and fetch only the relevant time from the database (I only need the newest time). Is there any way to achieve this?


You actually have two separate results that you're putting together with a union.

  1. Cars returned. Both a start and an end time. There are (possibly) many cars per person and you want only one of those cars. Even in pure SQL, this is a rather complex query requiring a HAVING clause and leading to (potentially) slow performance.

  2. Cars not yet returned.

Often, you'll be happier with two separate queries when you have two separate rules.

You're actually doing an aggregation of Carusage grouped by Person, keeping the row whose start (or end) equals the max of the group.

from django.db.models import Max
returned = Person.objects.filter(carusage__end__isnull=False).annotate(Max('carusage__end'))
not_returned = Person.objects.filter(carusage__end__isnull=True).annotate(Max('carusage__start'))
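
The two results still have to be put together per person. A minimal sketch of that step in plain Python, assuming each queryset has first been turned into a {person_id: datetime} mapping (for example from the annotation's default alias, which should be carusage__end__max or carusage__start__max). The function name merge_newest and the sample data are made up for illustration:

```python
import datetime


def merge_newest(returned, not_returned):
    """Combine two {person_id: datetime} mappings, keeping the later time per person."""
    merged = dict(returned)
    for person_id, when in not_returned.items():
        # A person may appear in both mappings; the later timestamp wins.
        if person_id not in merged or when > merged[person_id]:
            merged[person_id] = when
    return merged


# Hypothetical sample data standing in for the two query results.
returned_times = {
    1: datetime.datetime(2011, 1, 2, 10, 30),
    2: datetime.datetime(2011, 1, 1, 13, 0),
}
open_times = {
    2: datetime.datetime(2011, 1, 3, 7, 15),
    3: datetime.datetime(2011, 1, 3, 9, 0),
}
newest = merge_newest(returned_times, open_times)
```

This relies on a person's usages not overlapping, so that a car still out always has a later start than any earlier end.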

I think this is what you're looking for.
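
If you really want a single statement, the shape of the SQL might look like the sketch below. It uses the standard-library sqlite3 module with a made-up carusage table and sample rows rather than your actual Django tables, and it assumes a person's usages never overlap, so the newest date is simply the largest of "end if set, otherwise start". In the ORM this might correspond to something like Person.objects.annotate(newest=Max(Coalesce('carusage__end', 'carusage__start'))), with Coalesce from django.db.models.functions, but treat that as an untested assumption.

```python
import sqlite3

# In-memory table mirroring the relevant Carusage columns (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carusage (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL,
        "start" TEXT NOT NULL,
        "end" TEXT
    );
    INSERT INTO carusage (person_id, "start", "end") VALUES
        (1, '2011-01-01 08:00', '2011-01-01 09:00'),
        (1, '2011-01-02 08:00', '2011-01-02 10:30'),
        (2, '2011-01-01 12:00', '2011-01-01 13:00'),
        (2, '2011-01-03 07:15', NULL);
""")


def newest_times(conn, person_ids):
    """Return {person_id: newest timestamp} using one grouped query."""
    placeholders = ",".join("?" for _ in person_ids)
    sql = (
        'SELECT person_id, MAX(COALESCE("end", "start")) '
        "FROM carusage WHERE person_id IN (%s) GROUP BY person_id" % placeholders
    )
    # Each result row is a (person_id, timestamp) pair, so dict() can consume it directly.
    return dict(conn.execute(sql, list(person_ids)))


result = newest_times(conn, [1, 2])
```

Person 1 has returned the car, so the latest end is picked; person 2 still has it, so the start of the open usage is picked. The timestamps are ISO-style strings here only because SQLite has no native datetime type.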


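Using exceptions for control flow that way makes the code slower and harder to follow: raising and catching an exception costs noticeably more than a plain branch, although the one-query-per-person loop is likely the larger cost here. The bare except also hides real errors. Instead of raising and catching an exception, simply use else, and check explicitly for a missing end (the field is nullable, so it is None while the car is still out):

if latest.end is not None and latest.end < time_now:
    time = latest.end
else:
    time = latest.start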

Also, could you simply use Django's order_by()? For instance, to get all Carusage objects for a given person ordered by start, newest first, call:

Carusage.objects.filter(person=p).order_by('-start').all()
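
The same ordering idea can be stretched to cover all persons with one query. A rough sketch, assuming the rows come from something like Carusage.objects.filter(person__in=person_list, start__lt=time_now).order_by('person', '-start').values_list('person_id', 'start', 'end'); the function name newest_time_per_person and the sample rows below are made up so the snippet runs without a database:

```python
import datetime


def newest_time_per_person(rows, now):
    """rows: (person_id, start, end) tuples ordered by person, newest start first."""
    newest = {}
    for person_id, start, end in rows:
        if person_id in newest:
            continue  # only the first (newest) usage per person matters
        returned = end is not None and end < now
        newest[person_id] = end if returned else start
    return newest


now = datetime.datetime(2011, 1, 3, 12, 0)
rows = [
    (1, datetime.datetime(2011, 1, 2, 8, 0), datetime.datetime(2011, 1, 2, 10, 30)),
    (1, datetime.datetime(2011, 1, 1, 8, 0), datetime.datetime(2011, 1, 1, 9, 0)),
    (2, datetime.datetime(2011, 1, 3, 7, 15), None),
    (2, datetime.datetime(2011, 1, 1, 12, 0), datetime.datetime(2011, 1, 1, 13, 0)),
]
times = newest_time_per_person(rows, now)
```

This hits the database once, but it still transfers every usage row for the listed persons, so it only pays off while the usage history per person stays small.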