What is the most efficient way to keep track of all user traffic inside a database
Currently I am using MySQL to log all traffic from all users coming into a website that I manage. The database has grown to almost 11 million rows in a month, and queries are getting quite slow. Is there a more efficient way to log user information? All we are storing is the request, user agent, and IP, and associating it with a particular website.
Why not try Google Analytics? Even if you don't think it would be sufficient for you, I bet it can track 99% of what you want tracked.
The answer depends completely on what you expect to retrieve on the query side. Are you looking for aggregate information, for all of history, or only for a recent portion? Often, if you need to look at every row anyway to find what you need, storing the data in plain text files is quickest.
What kind of queries do you want to run on the data? I assume most of them are over the current or a recent time window. I would suggest time-based partitioning of the table. This makes such queries faster because they hit only the partitions that hold the relevant data, so there are fewer disk seeks. Also regularly purge old data and roll it up into summary tables; a partitioning sketch follows the links below. Some useful links are:
- http://forge.mysql.com/w/images/a/a2/FOSDEM_2009-Giuseppe_Maxia-Partitions_Performance.pdf
- http://www.slideshare.net/bluesmoon/scaling-mysql-writes-through-partitioning-3397422
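For illustration, here is a minimal sketch of monthly RANGE partitioning on a hypothetical traffic_log table (the table and column names are assumptions, not taken from the question). Note that in MySQL every unique key, including the primary key, must include the partitioning column:

    CREATE TABLE traffic_log (
        id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        site_id    INT UNSIGNED    NOT NULL,
        request    VARCHAR(255)    NOT NULL,
        useragent  VARCHAR(255)    NOT NULL,
        ip         INT UNSIGNED    NOT NULL,
        logged_at  DATETIME        NOT NULL,
        PRIMARY KEY (id, logged_at),           -- PK must include the partition column
        KEY idx_site_time (site_id, logged_at)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (TO_DAYS(logged_at)) (
        PARTITION p2010_05 VALUES LESS THAN (TO_DAYS('2010-06-01')),
        PARTITION p2010_06 VALUES LESS THAN (TO_DAYS('2010-07-01')),
        PARTITION p_future VALUES LESS THAN MAXVALUE
    );

    -- Purging a whole month of old data becomes a fast metadata operation:
    ALTER TABLE traffic_log DROP PARTITION p2010_05;

Queries restricted to a recent date range then touch only the matching partitions instead of the full 11-million-row table.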
The most efficient way is probably to have Apache (assuming that's what the site is running on) use its built-in logging to write text logs, and then configure something like AWStats on top of them. This removes the need to log the information yourself, and should give you what you are looking for, probably already packaged in existing reports. The benefit over something like Google Analytics is that the tracking is done server-side.
Maybe stating the obvious, but do you have good indexes for the queries you are running?
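For example, if most reports filter on a site and a time range, a composite index along these lines (table and column names are assumed, not from the question) lets MySQL seek instead of scanning the whole table:

    -- Hypothetical log table and columns; adapt to your actual schema.
    CREATE INDEX idx_site_time ON traffic_log (site_id, logged_at);

    -- Check that the query actually uses it:
    EXPLAIN SELECT COUNT(*)
    FROM traffic_log
    WHERE site_id = 1
      AND logged_at >= '2010-06-01';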
1) Look at using Piwik to get Google Analytics-style tracking while retaining control of the data in MySQL.
2) If you must continue with your own system, look at using the InnoDB Plugin so you can use compressed table formats. In addition, store the IP as an unsigned integer, and replace the useragent and request strings with unsigned integer keys into lookup tables that are themselves compressed with either InnoDB compression or the ARCHIVE engine (see the sketch after this list).
3) Skip partitioning and shard the DB by month.
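A rough sketch of point 2; all names here are illustrative, and compressed row formats need the InnoDB Plugin with innodb_file_per_table enabled and the Barracuda file format:

    -- Lookup table for user agent strings, compressed.
    CREATE TABLE useragents (
        id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        useragent VARCHAR(512) NOT NULL,
        UNIQUE KEY uk_useragent (useragent(255))
    ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

    -- The log table then stores only integers: IDs into lookup tables and a packed IP.
    CREATE TABLE hits (
        site_id      INT UNSIGNED NOT NULL,
        request_id   INT UNSIGNED NOT NULL,  -- references a similar "requests" lookup table
        useragent_id INT UNSIGNED NOT NULL,  -- references useragents.id
        ip           INT UNSIGNED NOT NULL,  -- packed IPv4 address
        logged_at    DATETIME     NOT NULL
    ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

    -- Pack and unpack the IP with the built-in functions:
    INSERT INTO hits (site_id, request_id, useragent_id, ip, logged_at)
    VALUES (1, 42, 7, INET_ATON('203.0.113.9'), NOW());

    SELECT INET_NTOA(ip) FROM hits LIMIT 10;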
This is what "Data Warehousing" is for. Consider buying a good book on warehousing.
Collect the raw data in some "current activity" schema.
Periodically, move it into a "warehouse" (or "datamart") star schema that's (a) separate from the current activity schema and (b) optimized for count/sum/group-by queries.
Move, BTW, means insert into warehouse schema and delete from current activity schema.
Separate your ongoing transactional processing from your query/analytical processing.
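A minimal sketch of that periodic move, assuming hypothetical activity.hits and warehouse.fact_hits tables on the same server:

    -- Copy a closed-off time range into the warehouse, then remove it
    -- from the current-activity schema, all in one transaction.
    START TRANSACTION;

    INSERT INTO warehouse.fact_hits (site_id, request_id, useragent_id, ip, logged_at)
    SELECT site_id, request_id, useragent_id, ip, logged_at
    FROM   activity.hits
    WHERE  logged_at < '2010-06-01';

    DELETE FROM activity.hits
    WHERE  logged_at < '2010-06-01';

    COMMIT;

The warehouse side can then carry the count/sum/group-by indexes and summary tables without slowing down the insert path.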