how to denormalize ranges and store them in a data warehouse?
My custom web servers report session information 开发者_如何学编程in the form of a wall clock timestamp (start of session) and the number of wall click seconds the session lasted.
I want to store this information in a data warehouse (MySQL + start schema) in a way that will allow me to for example query the number of sessions for a particular entity at a specific time. A requirement is that we must be able to extract time series data that will be feed to a graph.
Is there any other way to store the data than to insert a row for every absolute second of the session?
I would say that the simplest way is to store one row per session with StartTime, EndTime in the fact table.
Let's say we have factSession
with:
(
...
,SessionID int
,StartTime datetime
,EndTime datetime
,EntityID int
...
) ;
and
TimeSeries TABLE ( TimePoint datetime ) ;
You can:
SELECT t.TimePoint
,f.EntityID
,COUNT(f.SessionID) AS cnt
FROM TimeSeries AS t
LEFT JOIN factSessions AS f ON ( f.StartTime <= t.TimePoint
AND t.TimePoint <= f.EndTime
)
GROUP BY t.TimePoint ,f.EntityID
ORDER BY t.TimePoint
精彩评论