The memory usage reported by guppy differs from the ps command

2022-12-16 22:48 Asked by:

I am profiling my Twisted server. It uses much more memory than I expected, and its memory usage grows over time.

 ps -o pid,rss,vsz,sz,size,command
  PID   RSS    VSZ    SZ    SZ COMMAND
 7697 70856 102176 25544 88320 twistd -y broadcast.tac

As you can see, the process has a virtual size (VSZ) of 102176 KB, i.e. 99.78 MB. I use guppy from a Twisted manhole to inspect the memory usage profile.

>>> hp.heap()
Partition of a set of 120537 objects. Total size = 10096636 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  61145  51  5309736  53   5309736  53 str
     1  27139  23  1031596  10   6341332  63 tuple
     2   2138   2   541328   5   6882660  68 dict (no owner)
     3   7190   6   488920   5   7371580  73 types.CodeType
     4    325   0   436264   4   7807844  77 dict of module
     5   7272   6   407232   4   8215076  81 function
     6    574   0   305776   3   8520852  84 dict of class
     7    605   1   263432   3   8784284  87 type
     8    602   0   237200   2   9021484  89 dict of type
     9    303   0   157560   2   9179044  91 dict of zope.interface.interface.Method
<384 more rows. Type e.g. '_.more' to view.>

Hmm... Something seems wrong. Guppy shows a total memory usage of 10096636 bytes, i.e. 9859.996 KB or about 9.63 MB.

That's a huge difference. What explains this strange result? What am I doing wrong?

Update: I wrote a monitor script last night. It records the memory usage and the number of online users. It is a radio server, so the log contains the number of radios and the total number of listeners. Here is the figure I generated with matplotlib.

[Figure: memory usage, radios, and total listeners over time (image not preserved)]

Something is strange. Sometimes the memory usage printed by ps is very low, like this:

2010-01-15 00:46:05,139 INFO 4 4 17904 36732 9183 25944
2010-01-15 00:47:03,967 INFO 4 4 17916 36732 9183 25944
2010-01-15 00:48:04,373 INFO 4 4 17916 36732 9183 25944
2010-01-15 00:49:04,379 INFO 4 4 17916 36732 9183 25944
2010-01-15 00:50:02,989 INFO 4 4 3700 5256 1314 2260

What is the reason for these very low memory usage values? What's more, even when there are no online radios and no listeners, the memory usage is still high.

Possibly due to swapping or memory reservation. ps defines the two fields as follows:

RSS: resident set size, the non-swapped physical memory
     that a task has used (in kiloBytes).

VSZ: virtual memory usage of entire process.
     vm_lib + vm_exe + vm_data + vm_stack

It can be a bit confusing; four different size metrics can be seen with:

# ps -eo pid,vsz,rss,sz,size,cmd|egrep python

PID    VSZ   RSS   SZ    SZ    CMD
23801  4920  2896  1230  1100  python

The virtual size includes memory that was reserved by the process but never used, the size of all shared libraries that were loaded, pages that are swapped out, and blocks that were already freed by your process, so it can be much larger than the size of all live objects in Python.

Some additional tools to investigate memory performance:
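To watch both numbers from inside the process (for example from the manhole) rather than through ps, something like the following may help. It is only a sketch and assumes a Linux system, where /proc/<pid>/status exposes VmSize (the counterpart of VSZ) and VmRSS (the counterpart of RSS) in kB. The helper names parse_status_kb and current_memory_kb are made up for this example.

```python
def parse_status_kb(status_text):
    """Extract VmRSS and VmSize (both in kB) from /proc/<pid>/status text."""
    wanted = {"VmRSS", "VmSize"}
    result = {}
    for line in status_text.splitlines():
        # Lines look like "VmRSS:     70856 kB"
        key, _, rest = line.partition(":")
        if key in wanted:
            result[key] = int(rest.split()[0])
    return result


def current_memory_kb(pid="self"):
    """Read the status file of a process (Linux only; an assumption of this sketch)."""
    with open("/proc/%s/status" % pid) as f:
        return parse_status_kb(f.read())
```

Calling current_memory_kb() right before and after hp.heap() gives numbers that can be compared directly with the heapy total.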

Heapy (part of Guppy, which you are using): http://guppy-pe.sourceforge.net/
Python Memory Validator http://www.softwareverify.com/python/memory/index.html
PySizer http://pysizer.8325.org/

A good guide on tracking down memory leaks in Python using pdb and objgraph:

http://www.lshift.net/blog/2008/11/14/tracing-python-memory-leaks

As pointed out above, the RSS is what you're most interested in here. The "virtual" size includes mapped libraries, which you probably don't want to count.
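Redoing the comparison with RSS instead of VSZ, using only the figures posted in the question, could look like this. It is plain unit arithmetic; the variable names are chosen for the example.

```python
KIB = 1024

# Figures copied from the ps and hp.heap() output in the question.
rss_kib = 70856
vsz_kib = 102176
heapy_bytes = 10096636

rss_mib = rss_kib / KIB
vsz_mib = vsz_kib / KIB
heap_mib = heapy_bytes / (KIB * KIB)

# Memory that is resident but not accounted for by heapy's total.
unexplained_mib = rss_mib - heap_mib

print("VSZ   %.2f MiB" % vsz_mib)
print("RSS   %.2f MiB" % rss_mib)
print("heapy %.2f MiB" % heap_mib)
print("RSS not explained by heapy: %.2f MiB" % unexplained_mib)
```

Even against RSS the gap is large, so mapped libraries alone do not explain the difference.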

It's been a while since I used heapy, but I am pretty sure the statistics it prints do not include overhead added by heapy itself. This overhead can be pretty significant (I've seen a 100MB RSS process grow another dozen or so MB, see http://www.pkgcore.org/trac/pkgcore/doc/dev-notes/heapy.rst ).

But in your case I suspect the problem is that you are using some C library that either leaks or uses memory in a way that heapy does not track. Heapy is aware of memory used directly by Python objects, but if those objects wrap C objects that are allocated separately, heapy is normally not aware of that memory at all. You may be able to add heapy support to your bindings. If you do not control the bindings you use, that is obviously a hassle, and even if you do control them, it may not be possible, depending on what you are wrapping.

If there are leaks at the C level, heapy will also lose track of that memory (RSS will go up but heapy's reported size will stay the same). Valgrind is probably your best bet to track these down, just as it is in other C applications.

Finally: memory fragmentation will often cause your memory usage (as seen in top) to go up but not down (much). This is usually not much of a problem with daemons: the process will reuse this memory, it is just not released back to the OS, so the values in top do not go back down. If memory usage (as seen by top) goes up more or less linearly with the number of users (connections) and does not go back down, but also does not grow any further until you hit a new maximum number of users, fragmentation is probably to blame.
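That rule of thumb can be applied to a monitor log roughly as follows. This is a heuristic sketch, not a diagnostic tool: the function name, the (users, rss_kb) sample format, and the 1 MB tolerance are all assumptions made for the example. It flags samples where RSS climbed past its previous high although the user count did not set a new peak, which is the pattern that fragmentation alone would not explain.

```python
def growth_without_new_peak(samples, tolerance_kb=1024):
    """Return the samples where RSS exceeded its previous maximum by more
    than tolerance_kb while the user count was not a new maximum.

    samples: iterable of (users, rss_kb) tuples in chronological order.
    """
    suspicious = []
    peak_users = None
    peak_rss = None
    for users, rss_kb in samples:
        if peak_users is None:
            # The first sample only establishes the baseline.
            peak_users, peak_rss = users, rss_kb
            continue
        new_user_peak = users > peak_users
        if rss_kb > peak_rss + tolerance_kb and not new_user_peak:
            suspicious.append((users, rss_kb))
        peak_users = max(peak_users, users)
        peak_rss = max(peak_rss, rss_kb)
    return suspicious
```

An empty result is consistent with fragmentation; a long list of flagged samples suggests memory is growing for some other reason, such as a leak.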

This isn't a complete answer, but from your manhole, I'd also suggest manually running gc.collect() prior to looking with ps or top. guppy will show the allocated heap, but doesn't do anything to proactively free objects that are no longer in use.
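As a small illustration of why the explicit call matters: objects caught in reference cycles are not freed by reference counting and stay allocated until the cycle collector runs. The sketch below builds such cycles with a throwaway Node class (hypothetical, not part of the server) and lets gc.collect() report how many unreachable objects it found.

```python
import gc


class Node(object):
    """Throwaway class used only to build reference cycles."""

    def __init__(self):
        self.other = None


def make_cycle():
    # a and b refer to each other, so their reference counts never
    # drop to zero when the function returns.
    a, b = Node(), Node()
    a.other, b.other = b, a


gc.disable()   # keep the automatic collector out of the demonstration
gc.collect()   # start from a clean state
for _ in range(100):
    make_cycle()
freed = gc.collect()  # number of unreachable objects found
gc.enable()

print("gc.collect() found %d unreachable objects" % freed)
```

In the manhole, running gc.collect() and then hp.heap() shows the heap without such garbage mixed in.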
