Python multi-threaded application with memory leak from the thread-specific logger instances

Posted 2023-03-13 21:14

I have a server subclass that spawns threaded response handlers; the handlers in turn start application threads. Everything runs smoothly, and when I inspect the process with objgraph I see the correct number of application threads running (I am load testing and have it throttled to keep 35 application instances running).

Invoking objgraph.typestats() provides a breakdown of how many instances of each type are currently live in the interpreter (according to the GC). Looking at that output for memory leaks, I find 700 logger instances, which is the total number of response handlers spawned by the server.

I call logger.removeHandler(memoryhandler) and logger.removeHandler(filehandler) when the application thread exits the run() method, to ensure that there are no lingering references to the logger instance. The logger instance is also completely isolated within the application thread (there are no external references to it). As a final stab at eliminating these logger instances, the last statement in run() is del self.logger

To get the logger in __init__(), I name it with a suitably large random number so that it is distinct for file access. I use the same number as part of the log file name to avoid collisions between application logs.

The long and the short of it is that I have 700 logger instances tracked by the GC but only 35 active threads. How do I go about killing off these loggers? A more cumbersome, engineered solution is to create a pool of loggers and acquire one for the life of each application thread, but that means more code to maintain when the GC should simply handle this automatically.

Don't create potentially unbounded numbers of loggers; that's not good practice. There are other ways of getting context-sensitive information into your logs, as described in the Python logging documentation.
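One such way is logging.LoggerAdapter, which attaches per-application context to records from a single shared logger. The following is a minimal sketch, not the asker's code: the logger name "myapp.worker", the app_id field, and the make_app_logger helper are assumptions made for illustration, and the output goes to an in-memory stream only so the result is easy to inspect.

```python
import io
import logging

# One shared logger for all application threads (hypothetical name).
logger = logging.getLogger("myapp.worker")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

# In-memory stream for the demo; a real server would use a FileHandler.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(app_id)s is filled in from the adapter's "extra" dict.
handler.setFormatter(
    logging.Formatter("%(levelname)s [app %(app_id)s] %(message)s")
)
logger.addHandler(handler)


def make_app_logger(app_id):
    """Wrap the shared logger so every record carries this app_id."""
    return logging.LoggerAdapter(logger, {"app_id": app_id})


log = make_app_logger(17)
log.info("started")
output = stream.getvalue()
```

The adapter is a small, ordinary object that is garbage collected with the thread that owns it, while the number of real loggers stays fixed at one.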

You also don't need to keep a logger as an instance attribute: loggers are singletons (one instance per name), so you can get a particular one by name from anywhere. The recommended practice is to name loggers at module level using

logger = logging.getLogger(__name__)

which suffices for most scenarios.
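A short sketch of what "singleton per name" means in practice. The logger name "myapp.handlers" and the helper function are hypothetical; the point is only that two lookups with the same name return the same object.

```python
import logging


def get_from_elsewhere():
    # Any function, module, or thread can look the logger up by name.
    return logging.getLogger("myapp.handlers")


logger = logging.getLogger("myapp.handlers")
same_logger = get_from_elsewhere()

# Both names refer to one and the same Logger object.
is_same = logger is same_logger
```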

From your question I can't tell whether you appreciate that handlers and loggers aren't the same thing. For example, you talk about removeHandler calls, which might serve to free the handler instances because their reference counts go to zero, but you won't free any logger instances by doing so: the logging module keeps every logger created through getLogger() for the life of the process.
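The effect can be reproduced with a small sketch. It assumes the loggers are obtained with logging.getLogger() under unique names, as the question describes; the uuid-based naming scheme and both helper functions are made up for this illustration. It also peeks at logging.root.manager.loggerDict, which is an implementation detail of the standard library rather than a documented API, so treat it as a diagnostic only.

```python
import gc
import logging
import uuid


def spawn_uniquely_named_logger():
    """Mimic a thread that names its logger after a random number."""
    name = f"app-{uuid.uuid4().hex}"  # hypothetical naming scheme
    logger = logging.getLogger(name)
    handler = logging.NullHandler()
    logger.addHandler(handler)
    # ... the thread would do its work here ...
    logger.removeHandler(handler)  # frees the handler, not the logger
    handler.close()
    return name  # the local reference to the logger ends here


def count_registered(names):
    """Count how many of the given names the logging module still holds."""
    registry = logging.root.manager.loggerDict  # implementation detail
    return sum(1 for name in names if name in registry)


names = [spawn_uniquely_named_logger() for _ in range(50)]
gc.collect()
still_registered = count_registered(names)  # all 50 are still there
```

Every uniquely named logger stays registered after its handlers are removed and its last local reference is gone, which matches the 700 instances reported by objgraph.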

Generally, loggers are named after parts of your application which generate events of interest.

If you want each thread to e.g. write to a different file, you can create a new filename each time, and then close the handler when you're done and the thread is about to terminate (that closing is important to free handler resources). Or, you can log everything to one file with thread ids or other discriminators included in the log output, and use post-processing on the log file.
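A minimal sketch of the second option, one shared log file with the thread name in every line. The logger name "myapp", the file name applications.log, and the three helper functions are assumptions for illustration; the handler is configured once and closed once, instead of once per thread.

```python
import logging
import os
import tempfile
import threading


def configure_shared_log(path):
    """Attach one FileHandler whose format includes the thread name."""
    logger = logging.getLogger("myapp")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(threadName)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger, handler


def application(logger, steps):
    """Stand-in for an application thread's run() method."""
    for step in range(steps):
        logger.info("step %d", step)


def run_demo(thread_count=3, steps=2):
    path = os.path.join(tempfile.mkdtemp(), "applications.log")
    logger, handler = configure_shared_log(path)
    threads = [
        threading.Thread(target=application, args=(logger, steps), name=f"app-{i}")
        for i in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Detach and close the handler once, when the server shuts down.
    logger.removeHandler(handler)
    handler.close()
    with open(path) as log_file:
        return log_file.read().splitlines()
```

Calling run_demo() returns the lines of the shared file; filtering them by thread name afterwards is the post-processing step mentioned above.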

I hit the same memory leak when using logging.Logger(). You can try manually closing each handler's file descriptor once the logger is no longer needed, like this:

for handler in logger.handlers:
    handler.close()
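A slightly fuller sketch of the same idea, which also detaches each handler so the logger no longer references it. The release_handlers helper, the "standalone" logger name, and the temporary log file are hypothetical; the loop iterates over a copy because removeHandler() modifies logger.handlers.

```python
import logging
import os
import tempfile


def release_handlers(logger):
    """Detach and close every handler attached to the logger."""
    for handler in list(logger.handlers):  # copy: the list shrinks as we go
        logger.removeHandler(handler)
        handler.close()  # releases the underlying file descriptor


# Demo with a directly instantiated logger, as in this answer.
fd, log_path = tempfile.mkstemp(suffix=".log")
os.close(fd)
logger = logging.Logger("standalone")
file_handler = logging.FileHandler(log_path)
logger.addHandler(file_handler)
logger.warning("about to shut down")
release_handlers(logger)
```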
