High-volume logging with batch save to database?
I want to store information about requests to my sites in a quick way that doesn't put additional strain on my database. The goal is to use this information to prevent abuse and to gather data about how users interact with the site (IP, GET/POST, url/action, timestamp).
I'm currently saving a new row to the database on each page request. However, this wastes resources with an extra database call on every request, when the server is already logging the same information to the nginx access log.
I want to know how to handle this better. I have two ideas, but I'd also like to hear if there are better methods:
- CRON job to parse the access log each day and save the entries to the database as one batch transaction.
- RAM cache (Redis/Memcached) to store data about each request, then a CRON job to save it to the database.
However, with a key-value cache I'm not sure how to store the data in a way that lets me retrieve all the records and insert them into the database.
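The closest I've come up with for the cache idea is pushing each request onto a Redis list and draining the whole list from the CRON job, roughly like this (untested sketch using the redis-py client; the `request_log` key name is just a placeholder), but I'm not sure it's the right approach:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# On each page request: append one record to a Redis list (cheap O(1) push).
def log_request(ip, method, url, timestamp):
    record = {"ip": ip, "method": method, "url": url, "ts": timestamp}
    r.rpush("request_log", json.dumps(record))

# In the CRON job: atomically read everything logged so far and clear the list,
# so nothing is lost or read twice; the result would then be bulk-inserted.
def drain_requests():
    pipe = r.pipeline()               # MULTI/EXEC transaction by default
    pipe.lrange("request_log", 0, -1)
    pipe.delete("request_log")
    items, _ = pipe.execute()
    return [json.loads(item) for item in items]
```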
I also don't know how to parse the access log in a way that avoids re-reading entries I've already processed.
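For the log file, the only thing I can think of is remembering the byte offset where the previous run stopped, something like this (also untested, and I don't know how well it would cope with log rotation):

```python
import os

LOG_PATH = "/var/log/nginx/access.log"     # placeholder path
STATE_PATH = "/var/tmp/access_log.offset"  # where the last read position is stored

def read_new_lines():
    # Load the byte offset saved by the previous run (0 on the first run).
    offset = 0
    if os.path.exists(STATE_PATH):
        with open(STATE_PATH) as f:
            offset = int(f.read().strip() or 0)

    with open(LOG_PATH, "rb") as log:
        log.seek(offset)
        lines = log.readlines()   # only the entries added since the last run
        new_offset = log.tell()

    # Persist the new offset so the next run starts where this one stopped.
    with open(STATE_PATH, "w") as f:
        f.write(str(new_offset))

    return lines
```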
How can I record access attempts in an efficient way?
- Use delayed inserts if you're using MySQL (other engines don't need this)
- Beware of indexes making write operations expensive
- Rotate tables once every minute/hour/day (a rotation sketch follows this list)
- Beware of over-normalization and foreign keys
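As a concrete illustration of the rotation point above, a rough sketch assuming MySQL, pymysql, and a hypothetical `request_log` table: swap in a fresh, empty table with one atomic RENAME, then load or summarize the archived copy without blocking writers.

```python
import time
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="secret", database="logs")
archive = "request_log_%s" % time.strftime("%Y%m%d%H%M")  # e.g. request_log_202301011000

with conn.cursor() as cur:
    # Prepare an empty copy of the write table.
    cur.execute("CREATE TABLE request_log_new LIKE request_log")
    # One atomic swap: new requests immediately go into the fresh table,
    # while the archived copy can be processed at leisure.
    cur.execute(
        "RENAME TABLE request_log TO %s, request_log_new TO request_log" % archive
    )
```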
A common pattern is to have a simple table that takes the plain writes, and to move those rows every minute/hour into the main set of tables. The main set can be highly normalized and indexed, while the intake table stays a simple, denormalized one (to save space and keep writes cheap).
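A minimal sketch of that move step (assuming pymysql and hypothetical `request_log_staging` / `requests` tables): pin the highest id first, copy those rows into the main table, then delete exactly what was copied, so writes arriving mid-run simply wait for the next pass.

```python
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="secret", database="logs")

with conn.cursor() as cur:
    # Pin the upper bound first so rows written during the copy are untouched.
    cur.execute("SELECT COALESCE(MAX(id), 0) FROM request_log_staging")
    (max_id,) = cur.fetchone()

    # Copy the batch into the normalized/indexed main table...
    cur.execute(
        "INSERT INTO requests (ip, method, url, requested_at) "
        "SELECT ip, method, url, requested_at "
        "FROM request_log_staging WHERE id <= %s",
        (max_id,),
    )
    # ...and remove exactly the rows that were copied.
    cur.execute("DELETE FROM request_log_staging WHERE id <= %s", (max_id,))

conn.commit()
```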
Another pattern is to have one simple big table and run a summary query against it every minute/hour. The big table can be indexed by date (remember to use a native date/datetime type).
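For example, a daily rollup over such a table might look like this (hypothetical `requests` and `requests_daily` tables; the date range filter keeps the query on the date index):

```python
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="secret", database="logs")

with conn.cursor() as cur:
    # Roll yesterday's raw rows up into one summary row per URL.
    cur.execute(
        "INSERT INTO requests_daily (day, url, hits) "
        "SELECT DATE(requested_at), url, COUNT(*) "
        "FROM requests "
        "WHERE requested_at >= CURDATE() - INTERVAL 1 DAY "
        "  AND requested_at < CURDATE() "
        "GROUP BY DATE(requested_at), url"
    )
conn.commit()
```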
A final tip: make the architecture and scripts idempotent (if you run a task multiple times, the data is still valid). It's very common to have blips, and simply re-running the task for a specific minute/hour/day window can quickly fix everything instead of requiring a massive rebuild.
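One way to get that property, sketched with the same kind of hypothetical tables: delete whatever a window already contains before re-aggregating it, so running the job twice for the same hour leaves identical data.

```python
from datetime import datetime, timedelta
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="secret", database="logs")

def rebuild_hour(window_start):
    """Recompute one hour of summary data; safe to run any number of times."""
    window_end = window_start + timedelta(hours=1)
    with conn.cursor() as cur:
        # Throw away anything previously computed for this hour...
        cur.execute(
            "DELETE FROM requests_hourly WHERE hour_start = %s",
            (window_start,),
        )
        # ...and rebuild it from the raw table.
        cur.execute(
            "INSERT INTO requests_hourly (hour_start, url, hits) "
            "SELECT %s, url, COUNT(*) FROM requests "
            "WHERE requested_at >= %s AND requested_at < %s "
            "GROUP BY url",
            (window_start, window_start, window_end),
        )
    conn.commit()

# Re-run the job for a window that had a blip, e.g. 10:00-11:00 on 2023-01-01.
rebuild_hour(datetime(2023, 1, 1, 10, 0, 0))
```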