Analytics Dashboard Strategy

2023-04-12 16:43 问答作者：

We are currently developing an API and we want to provide an analytics dashboard for our clients to view metrics about their calls per month/day/hour.

The current strategy we thought is to save each call to a client separate table (eg. calls_{client_id}) for historic reasons and have a summary table (eg. calls_summary) containing the number of calls for a given hour of a day for each client.

Then, each day a cron job will create a xml file with the summary of last day's calls for each client and the dashboard will use them instead of the database. So, the only analytics task which will use the database will be the cr开发者_JS百科on job.

For infrastructure we are thinking MySQL replication and the slave as the analytics database.

Is that strategy useful and valid for real web statistics? Can you propose any tuning on that or even a totally different one?

save each call to a client separate table (eg. calls_{client_id}) for historic reasons

No. Don't break the rules of normalization unless you've got good reason. It won't improve performance and could actually be very detrimental. It'll certainly make your code more complex and therefore less reliable.

It might be worth archiving off old records on period-by-period basis, but unless you know that you'rew going to run into performance problems I'd advise against this.

By all means pre-consolidate the data into another table (provided you are getting a reduction ni number of rows of at least 95%). But don't bother converting it to XML unless and until you need the data in that format.

As for how you pre-consolidate....either use period based consolidations (e.g. roll up by date) or use flagging to record which records have already been consolidated.

The less frequently you run the consolidation the more the impact on performance. But run it too frequently and you'll have problems with contention / locking.

Without knowing a lot about the structure and volume of data or the constraints in terms of budget, availability and timeliness it's hard to provide an optimal solution. But if were me, I'd probably go with 3 mysqld tiers - one providing the transactional write facility, one replicating this data and generating the consolidated data, and one to provide read access to the consolidated data (master <-> master <-> slave)

Performance wise, it is a bad idea to create a separate table for each client. The classic approach for that would be the folowing:

client: id, name, address, ...
call: id, client_id, created_at, duration, ...
calls_summary: id, client_id, date_start, date_end, nb_calls

Now if you want to retrieve all the calls of a client, you go like this:

SELECT * FROM client
LEFT JOIN call ON call.client_id = client.id
WHERE client.id = 42

Or :

SELECT * FROM call where client_id = 42

I don't see any reason for using xml, your cron could be just updating the calls_summary table.

继续阅读：analytics php

Analytics Dashboard Strategy

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？