Ravendb mapreduce grouping by multiple fields
We have a site that contains streaming video and we want to display three reports of most watched videos in the last week, month and year (a rolling window).
We store a document in ravendb each time a video is watched:
public class ViewedContent
{
public开发者_StackOverflow社区 string Id { get; set; }
public int ProductId { get; set; }
public DateTime DateViewed { get; set; }
}
We're having trouble figuring out how to define the indexes / mapreduces that would best support generating those three reports.
We have tried the following map / reduce.
public class ViewedContentResult
{
public int ProductId { get; set; }
public DateTime DateViewed { get; set; }
public int Count { get; set; }
}
public class ViewedContentIndex :
AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
public ViewedContentIndex()
{
Map = docs => from doc in docs
select new
{
doc.ProductId,
DateViewed = doc.DateViewed.Date,
Count = 1
};
Reduce = results => from result in results
group result by result.DateViewed
into agg
select new
{
ProductId = agg.Key,
Count = agg.Sum(x => x.Count)
};
}
}
But, this query throws an error:
var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
.Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) );
Error: "DateViewed is not indexed"
Ultimately, we want to query something like:
var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
.Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) )
.GroupBy( x => x.ProductId )
.OrderBy( x => x.Count )
This doesn't actually compile, because the OrderBy is wrong; Count is not a valid property here.
Any help here would be appreciated.
Each report is a different GROUP BY if you're in SQL land, that tells you that you need three indexes - one with just the month, one with entries by week, one by month, and one by year (or maybe slightly different depending on how you're actually going to do the query.
Now, you have a DateTime there - that presents some problems - what you actually want to do is index the Year component of the DateTime, the Month component of the date time and Day component of that date time. (Or just one or two of these depending on which report you want to generate.
I'm only para-quoting your code here so obviously it won't compile, but:
public class ViewedContentIndex :
AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
public ViewedContentIndex()
{
Map = docs => from doc in docs
select new
{
doc.ProductId,
Day = doc.DateViewed.Day,
Month = doc.DateViewed.Month,
Year = doc.DateViewed.Year
Count = 1
};
Reduce = results => from result in results
group result by new {
doc.ProductId,
doc.DateViewed.Day,
doc.DateViewed.Month,
doc.DateViewed.Year
}
into agg
select new
{
ProductId = agg.Key.ProductId,
Day = agg.Key.Day,
Month = agg.Key.Month,
Year = agg.Key.Year
Count = agg.Sum(x => x.Count)
};
}
}
Hopefully you can see what I'm trying to achieve by this - you want ALL the components in your group by, as they are what make your grouping unique.
I can't remember if RavenDB lets you do this with DateTimes and I haven't got it on this computer so can't verify this, but the theory remains the same.
So, to re-iterate
You want an index for your report by week + product id You want an index for your report by month + product id You want an index for your report by year + product id
I hope this helps, sorry I can't give you a compilable example, lack of raven makes it a bit difficult :-)
精彩评论