std::deque memory usage - Visual C++, and comparison to others

2023-01-23 06:40

Follow-up to "What the heque is going on with the memory overhead of std::deque?"

Visual C++ manages deque blocks according to the container element type using this:

#define _DEQUESIZ   (sizeof (value_type) <= 1 ? 16 \
    : sizeof (value_type) <= 2 ? 8 \
    : sizeof (value_type) <= 4 ? 4 \
    : sizeof (value_type) <= 8 ? 2 \
    : 1)    /* elements per block (a power of 2) */

This results in a very large memory footprint for small elements. By changing the 16 in the first line to 128, I was able to drastically reduce the footprint of a large deque<char>: Process Explorer Private Bytes dropped from 181 MB to 113 MB after 100 million push_back(const char& mychar) calls.

Can anybody justify the values in that #define?
How do other compilers handle deque block sizing?
What would be their footprint (32-bit operation) for the simple test of 100m push_back calls to deque<char>?
Does STL allow for overriding of this block size at compile-time, without modifying the <deque> code?

gcc has

return __size < 512 ? size_t(512 / __size) : size_t(1);

with a comment

/*  The '512' is
 *  tunable (and no other code needs to change), but no investigation has
 *  been done since inheriting the SGI code.
 */

The Dinkumware (MS) implementation wants to grow the deque 16 bytes at a time. Could it be that this is just an extremely old implementation (like the first one ever?) that was tuned for platforms with very little memory (by today's standards), to avoid overallocating and exhausting memory (as a std::vector can)?

I had to implement my own queue in an application I'm working on because the 2.5X memory footprint of std::queue (which uses std::deque) was unacceptable.

There seems to be very little evidence on the interwebs that people have run into this inefficiency, which is surprising to me. I would think such a fundamental data structure as a queue (standard library, no less) would be quite ubiquitous in the wild, and would be in performance/time/space-critical applications. But here we are.

To answer the last question, the C++ standard does not define an interface to modify the block size. I'm pretty sure it doesn't mandate any implementation, just complexity requirements for insertions/removals at both ends.

STLPort

... seems to use:

::: <stl/_alloc.h>
...
enum { _MAX_BYTES = 32 * sizeof(void*) };
...
::: <deque>
...
static size_t _S_buffer_size()
{
  const size_t blocksize = _MAX_BYTES;
  return (sizeof(_Tp) < blocksize ? (blocksize / sizeof(_Tp)) : 1);
}

So that would mean a block size of 32 x 4 = 128 bytes on 32-bit and 32 x 8 = 256 bytes on 64-bit.

My thought: From a size overhead POV, I guess it would make sense for any implementation to operate with variable length blocks, but I think this would be extremely hard to get right with the constant time random access requirement of deque.

As for the question

Does STL allow for overriding of this block size at compile-time, without modifying the code?

Not possible here either.

Apache

(seems to be the Rogue Wave STL version) apparently uses:

static size_type _C_bufsize () {
    // deque only uses __rw_new_capacity to retrieve the minimum
    // allocation amount; this may be specialized to provide a
    // customized minimum amount
    typedef deque<_TypeT, _Allocator> _RWDeque;
    return _RWSTD_NEW_CAPACITY (_RWDeque, (const _RWDeque*)0, 0);
}

so there seems to be some mechanism to override the block size via specialization and the definition of ... looks like this:

// returns a suggested new capacity for a container needing more space
template <class _Container>
inline _RWSTD_CONTAINER_SIZE_TYPE
__rw_new_capacity (_RWSTD_CONTAINER_SIZE_TYPE __size, const _Container*)
{
    typedef _RWSTD_CONTAINER_SIZE_TYPE _RWSizeT;

    const _RWSizeT __ratio = _RWSizeT (  (_RWSTD_NEW_CAPACITY_RATIO << 10)
                                       / _RWSTD_RATIO_DIVIDER);

    const _RWSizeT __cap =   (__size >> 10) * __ratio
                           + (((__size & 0x3ff) * __ratio) >> 10);

    return (__size += _RWSTD_MINIMUM_NEW_CAPACITY) > __cap ? __size : __cap;
}

So I'd say it's, aehm, complicated.

(If anyone feels like figuring this out further, feel free to edit my answer directly or just leave a comment.)
