Most efficient method of logging data to MySQL

2023-02-08 08:31

We have a service that sees several hundred simultaneous connections throughout the day, peaking at about 2000, for about 3 million hits a day, and growing. With each request I need to log 4 or 5 pieces of data to MySQL. We originally used the logging that came with the app we were using; however, it was terribly inefficient, ran my db server at more than 3x the average CPU load, and would eventually bring the server to its knees.

At this point we are going to add our own logging to the application (PHP). The only option I have for logging data is the MySQL db, as this is the only common resource available to all of the HTTP servers. The workload will be mostly writes; however, every day we generate reports based on the data, then crunch and archive the old data.

What recommendations can be made to ensure that I don't take down our services with logging data?

The solution we took to this problem was to create an archive table, then regularly (every 15 minutes, on an app server) crunch the data and put it back into the tables that were used to generate reports. The archive table, of course, did not have any indices; the tables from which the reports are generated have several.

Some stats on this approach:

Short Version: >360 times faster

Long Version:

The original code/model did direct inserts into the indexed table, and the average insert took 0.036 seconds. With the new code/model, inserts took less than 0.0001 seconds (I was not able to get an accurate fix on a single insert, so I had to time 100,000 inserts and average). The post-processing (crunch) took an average of 12 seconds for several tens of thousands of records. Overall we were greatly pleased with this approach, and so far it has worked incredibly well for us.
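To put those numbers in context, here is a back-of-the-envelope calculation in Python. It assumes one logged row per hit at the 3 million hits a day mentioned in the question, which the post does not state explicitly, and it ignores the crunch time.

```python
HITS_PER_DAY = 3_000_000    # from the question
INDEXED_INSERT_S = 0.036    # measured average, direct insert into the indexed table
STAGING_INSERT_S = 0.0001   # measured upper bound, insert into the un-indexed table


def daily_insert_seconds(hits, seconds_per_insert):
    # Cumulative time the database spends on log inserts per day.
    return hits * seconds_per_insert


speedup = INDEXED_INSERT_S / STAGING_INSERT_S
before = daily_insert_seconds(HITS_PER_DAY, INDEXED_INSERT_S)
after = daily_insert_seconds(HITS_PER_DAY, STAGING_INSERT_S)

print(f"speedup: at least {speedup:.0f}x")
print(f"insert time per day: {before / 3600:.1f} h before, {after / 60:.1f} min after")
```

Under that assumption the indexed inserts add up to about 30 hours of cumulative insert time per day, which is only possible because inserts overlap across connections, versus at most about 5 minutes with the un-indexed table.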

Based on what you describe, I recommend you leverage the fact that you don't need to read this data immediately and pursue a "periodic bulk commit" route. That is, buffer the logging data in RAM on the app servers and do periodic bulk commits. If you have multiple application nodes, some sort of randomized schedule would help even more (e.g., commit every 5 +/- 2 minutes), so that the nodes do not all hit the database at the same moment.

The main drawback with this approach is that if an app server fails, you lose the buffered data. However, that's only bad if (a) you absolutely need all of the data and (b) your app servers crash regularly. There's a small chance that both are true, but if they are, you can persist your buffer to local disk (temporarily) on the app server.
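One way the local-disk fallback could look, sketched in Python. The spool file format (one JSON object per line) and the function names are assumptions made for illustration, and the sketch assumes a single writer per file; several processes sharing one spool file would need locking.

```python
import json
import os


def spool_append(path, record):
    """Append one log record to a local spool file as a JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        # fsync makes the record survive a crash, at the cost of latency
        # per write; skip it if losing the last few records is acceptable.
        os.fsync(f.fileno())


def spool_read(path):
    """Return all spooled records, or an empty list if there is no spool file."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def spool_clear(path):
    """Remove the spool file. Call this only after the bulk commit succeeded."""
    if os.path.exists(path):
        os.remove(path)
```

Reading and clearing are kept separate on purpose: if the database commit fails, the spool file is still there for the next attempt.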

The main idea is:

buffering the data
periodic bulk commits (leveraging some sort of randomization in a distributed system would help)
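The two ideas above can be sketched as follows, in Python. `LogBuffer`, `bulk_insert_sql`, and the `commit_rows` callable are hypothetical names; `commit_rows` stands in for whatever actually executes the bulk insert. The `%s` placeholders follow the parameter style used by common Python MySQL drivers, and the table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
import random


class LogBuffer:
    """Collects log rows in memory and hands them over in batches."""

    def __init__(self, commit_rows, base_s=300.0, jitter_s=120.0, rng=random):
        self.commit_rows = commit_rows  # hypothetical callable doing the bulk insert
        self.base_s = base_s            # 5 minutes
        self.jitter_s = jitter_s        # +/- 2 minutes
        self.rng = rng
        self.rows = []

    def add(self, row):
        self.rows.append(tuple(row))

    def next_delay(self):
        # Randomized interval so that nodes do not all commit at once.
        return self.rng.uniform(self.base_s - self.jitter_s,
                                self.base_s + self.jitter_s)

    def flush(self):
        """Commit everything buffered so far; return the number of rows sent."""
        if not self.rows:
            return 0
        batch, self.rows = self.rows, []
        try:
            self.commit_rows(batch)
        except Exception:
            # Put the batch back so the next flush can retry it.
            self.rows = batch + self.rows
            raise
        return len(batch)


def bulk_insert_sql(table, columns, n_rows):
    """Build one multi-row INSERT statement for n_rows rows (n_rows >= 1)."""
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    cols = ", ".join(columns)
    return f"INSERT INTO {table} ({cols}) VALUES " + ", ".join([row] * n_rows)
```

A worker loop would sleep for `next_delay()` seconds, call `flush()`, and repeat. One multi-row `INSERT` per flush replaces thousands of single-row statements, which is where most of the saving comes from.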

Another approach is to stop opening and closing connections where possible (e.g., keep longer-lived connections open). While that's likely a good first step, it may require a fair amount of work in a part of the system that you may not control. If you do control it, it's worth exploring.
