Querying data from Django
Here's what my model structure looks like:
开发者_开发问答class Visitor(models.Model):
id = models.AutoField(primary_key=True)
class Session(models.Model):
id = models.AutoField(primary_key=True)
visit = models.ForeignKey(Visitor)
sequence_no = models.IntegerField(null=False)
class Track(models.Model):
id = models.AutoField(primary_key=True)
session = models.ForeignKey(Session)
action = models.ForeignKey(Action)
when = models.DateTimeField(null=False, auto_now_add=True)
sequence_no = models.IntegerField(null = False)
class Action(models.Model):
id = models.AutoField(primary_key=True)
url = models.CharField(max_length=65535, null=False)
host = models.IntegerField(null=False)
As you can see, each Visitor
has multiple Sessions
; each Session
has multiple Tracks
and each Track
has one Action
. Tracks are always ordered ascendingly by the session
and the sequence_no
. A Visitors
average time on an site (i.e. a particular Action.host
) is the difference in Track.when
(time) between the highest and lowest Track.sequence_no
divided by the number of Sessions
of that Visitor
.
I need to calculate the average time of visitors on the site which be the sum of the time for each visitor on the Action.site
divided by the number of visitors.
I could query this using SQL but I'd like to keep my query as Djangonic as possible and I'm still very lost with complex queries.
For a specific Action
object you can gather interesting data about Sessions:
from django.db.models import Min, Max
from yourapp.models import *
host = 1 # I suppose you want to calculate for each site
sessions = list(Session.objects.filter(
track__action__host=host,
).annotate(
start=Min('track__when'),
end=Max('track__when'),
).values('visit_id', 'start', 'end'))
You will get something in the line of:
[
{ 'visit_id': 1, 'start': datetime(...), 'end': datetime(...) },
{ 'visit_id': 1, 'start': datetime(...), 'end': datetime(...) },
{ 'visit_id': 2, 'start': datetime(...), 'end': datetime(...) },
....
]
Now it's only a matter of getting the desired result from the data:
number_of_visitors = len(set(s['visit_id'] for s in sessions))
total_time = sum((s['end'] - s['start']).total_seconds() for s in sessions)
average_time_spent = total_time / number_of_visitors
Another way is to use two queries instead of one, and avoid the len(set(...))
snippet:
sessions = Session.objects.filter(
track__action__host=host,
).annotate(
start=Min('track__when'),
end=Max('track__when'),
)
number_of_visitors = sessions.values('visit_id').distict().count()
total_time = sum((s['end'] - s['start']).total_seconds()
for s in sessions.values('start', 'end'))
There is NO WAY to do actual calculated fields barring the provided aggregations, so either you do it in raw SQL or you do in code like this.
At least the proposed solution uses Django's ORM as far as possible.
精彩评论