
Is there *any* way to get the length of a C-style array in C++/G++?

I've been trying to implement a lengthof (T* v) function for quite a while, so far without any success.

There are two basic, well-known solutions for T v[n] arrays, both of which are useless or even dangerous once the array has decayed into a T* v pointer.

#define SIZE(v) (sizeof(v) / sizeof(v[0]))

template <class T, size_t n>
size_t lengthof (T (&) [n])
{
    return n;
}
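To make the failure mode concrete, here is a small sketch of what happens to both approaches after decay. The helper names `size_after_decay` and `size_before_decay` are made up for illustration; the macro and the template are the ones shown above.

```cpp
#include <cassert>
#include <cstddef>

#define SIZE(v) (sizeof(v) / sizeof((v)[0]))

template <class T, std::size_t n>
std::size_t lengthof(T (&)[n])
{
    return n;
}

// Inside this function `v` is an int*, so SIZE(v) evaluates to
// sizeof(int*) / sizeof(int), no matter how long the caller's array was.
std::size_t size_after_decay(int* v)
{
    return SIZE(v);
}

// On a real (non-decayed) array the macro and the template agree.
std::size_t size_before_decay()
{
    int a[10] = {};
    assert(SIZE(a) == lengthof(a));
    return lengthof(a);
}

// Calling lengthof(p) with an int* p does not compile at all,
// because a pointer cannot bind to a T (&)[n] parameter.
```

The macro silently returns a meaningless number for a pointer, while the template turns the same mistake into a compile error, which is why the template is usually the safer of the two.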

There are workarounds involving wrapper classes and containers like STLSoft's array_proxy, boost::array, std::vector, etc. All of them have drawbacks, and lack the simplicity, syntactic sugar and widespread usage of arrays.

There are rumours of solutions involving the compiler-specific calls that the compiler itself uses when delete [] needs to know the length of the array. According to the C++ FAQ Lite 16.14, compilers use two techniques to know how much memory to deallocate: over-allocation and associative arrays. With over-allocation, the compiler allocates one extra word and stores the length of the array just before the first object. The other method stores the lengths in an associative array. Is it possible to know which method G++ uses, and to extract the appropriate array length? What about overheads and paddings? Any hope for non-compiler-specific code? Or even non-platform-specific G++ builtins?

There are also solutions involving overloading operator new [] and operator delete [], which I implemented:

std::map<void*, size_t> arrayLengthMap;

inline void* operator new [] (size_t n)
throw (std::bad_alloc)
{
    void* ptr = GC_malloc(n);
    arrayLengthMap[ptr] = n;
    return ptr;
}

inline void operator delete [] (void* ptr)
throw ()
{
    arrayLengthMap.erase(ptr);
    GC_free(ptr);
}

template <class T>
inline size_t lengthof (T* ptr)
{
    std::map<void*, size_t>::const_iterator it = arrayLengthMap.find(ptr);
    if( it == arrayLengthMap.end() ){
        throw std::bad_alloc();
    }
    return it->second / sizeof(T);
}

It worked nicely until I hit a strange error: lengthof couldn't find an array. As it turned out, G++ allocated 8 more bytes at the start of this specific array than it should have. Although operator new [] returned the start of the entire allocation, call it ptr, the calling code got ptr+8 instead, so lengthof(ptr+8) failed with the exception (and even if it had not, it could have returned a wrong array size). Are those 8 bytes some kind of overhead or padding? It cannot be the over-allocation mentioned above, since the function worked correctly for many other arrays. What is it, and how can I disable or work around it, assuming G++-specific calls or trickery are acceptable?

Edit: Due to the numerous ways it is possible to allocate C-style arrays, it is not generally possible to tell the length of an arbitrary array by its pointer, just as Oli Charlesworth suggested. But it is possible for non-decayed static arrays (see the template function above), and arrays allocated with a custom operator new [] (size_t, size_t), based on an idea by Ben Voigt:

#include <gc/gc.h>
#include <gc/gc_cpp.h>
#include <iostream>
#include <map>

typedef std::map<void*, std::pair<size_t, size_t> > ArrayLengthMap;
ArrayLengthMap arrayLengthMap;

inline void* operator new [] (size_t size, size_t count)
throw (std::bad_alloc)
{
    void* ptr = GC_malloc(size);
    arrayLengthMap[ptr] = std::pair<size_t, size_t>(size, count);
    return ptr;
}

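inline void operator delete [] (void* ptr)
throw ()
{
    // Find the last recorded block that starts at or before ptr.
    ArrayLengthMap::iterator it = arrayLengthMap.upper_bound(ptr);
    if( it != arrayLengthMap.begin() ){   // decrementing begin() is undefined
        it--;
        // Cast to char*: arithmetic on void* is only a GNU extension.
        char* begin = static_cast<char*>(it->first);
        if( begin <= ptr and ptr < begin + it->second.first ){
            arrayLengthMap.erase(it);
        }
    }
    GC_free(ptr);
}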

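inline size_t lengthof (void* ptr)
{
    // Find the last recorded block that starts at or before ptr.
    ArrayLengthMap::const_iterator it = arrayLengthMap.upper_bound(ptr);
    if( it != arrayLengthMap.begin() ){   // decrementing begin() is undefined
        it--;
        // Cast to char*: arithmetic on void* is only a GNU extension.
        char* begin = static_cast<char*>(it->first);
        if( begin <= ptr and ptr < begin + it->second.first ){
            return it->second.second;
        }
    }
    throw std::bad_alloc();
}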

int main (int argc, char* argv[])
{
    int* v = new (112) int[112];
    std::cout << lengthof(v) << std::endl;
}

Unfortunately, because the compiler may add arbitrary overhead and padding, there is no reliable way to determine the length of a dynamic array inside a custom operator new [] (size_t), unless we assume that the padding is smaller than the size of one array element.

However, as Ben Voigt suggested, there are other kinds of arrays for which length calculation might be possible. It should therefore be possible and desirable to construct a wrapper class that accepts several kinds of arrays (and their lengths) in its constructors, and is implicitly or explicitly convertible to other wrapper classes and array types. The different lifetimes of different kinds of arrays might be a problem, but that could be solved with garbage collection.
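A minimal sketch of such a wrapper, assuming a non-owning design, might look like the following. The names `ArrayRef` and `total` are hypothetical, and the class deliberately ignores the lifetime question: it only carries a pointer and a length. C++20's std::span covers much the same ground.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-owning view: a pointer plus a length.
template <class T>
class ArrayRef {
public:
    // Non-decayed static array: the length is deduced by the compiler.
    template <std::size_t n>
    ArrayRef(T (&a)[n]) : data_(a), size_(n) {}

    // Any other array: the caller supplies the length explicitly.
    ArrayRef(T* p, std::size_t n) : data_(p), size_(n) {}

    // Container that already knows its own length.
    ArrayRef(std::vector<T>& v) : data_(v.data()), size_(v.size()) {}

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) const { return data_[i]; }
    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }

private:
    T* data_;
    std::size_t size_;
};

// Example consumer: accepts any of the three sources through one parameter.
inline long total(ArrayRef<int> a)
{
    long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i];
    return sum;
}
```

A function such as `total` can then be called with a static array or a vector directly, and with `ArrayRef<int>(ptr, n)` for a decayed or dynamically allocated array whose length the caller still knows.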


To answer this:

Any hope for non-compiler-specific code?

No.

More generally, if you find yourself needing to do this, then you probably need to reconsider your design. Use a std::vector, for instance.
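As a rough illustration of that advice (the function names `sum` and `sum_raw` are invented for the example), a std::vector carries its length with it, and code that still needs a raw pointer can be handed the pointer and the length together:

```cpp
#include <cstddef>
#include <vector>

// The vector knows its own length, so no lengthof() trick is needed.
long sum(const std::vector<int>& v)
{
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// For interfaces that insist on a raw pointer, pass the length alongside it.
long sum_raw(const int* data, std::size_t n)
{
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}
```

With `std::vector<int> v(112, 1);`, both `sum(v)` and `sum_raw(v.data(), v.size())` see all 112 elements without any bookkeeping on the side.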


Your analysis is mostly correct. However, I think you've overlooked the fact that types with trivial destructors don't need to store the length, so the amount of over-allocation can differ from one type to another.
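One way to observe this is to compare the number of bytes the compiler requests from operator new[] with count * sizeof(T). The sketch below uses class-specific allocation functions; `CountingNew`, `Trivial`, `NonTrivial` and `array_overhead` are all hypothetical names, and the actual overhead depends on the compiler and ABI.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Records how many bytes the compiler asked for in the last new[] call.
template <class Tag>
struct CountingNew {
    static std::size_t last_request;

    static void* operator new[](std::size_t bytes)
    {
        last_request = bytes;
        void* p = std::malloc(bytes);
        if (!p) throw std::bad_alloc();
        return p;
    }

    static void operator delete[](void* p) noexcept { std::free(p); }
};

template <class Tag>
std::size_t CountingNew<Tag>::last_request = 0;

// Trivial destructor: delete[] has nothing to run per element.
struct Trivial : CountingNew<Trivial> {
    int value;
};

// User-provided destructor: delete[] must know how many elements to destroy.
struct NonTrivial : CountingNew<NonTrivial> {
    int value;
    ~NonTrivial() {}
};

// Bytes requested beyond count * sizeof(T) for one array allocation.
template <class T>
std::size_t array_overhead(std::size_t count)
{
    T* p = new T[count];
    std::size_t overhead = T::last_request - count * sizeof(T);
    delete[] p;
    return overhead;
}
```

On a typical 64-bit G++ build one would expect `array_overhead<Trivial>(8)` to be zero and `array_overhead<NonTrivial>(8)` to be the size of one size_t, which would match the 8 stray bytes described in the question, but neither value is guaranteed by the standard.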

The standard allows operator new[] to steal a few bytes for its own use, so you'll have to do a range check on the pointer instead of an exact match. std::map probably won't be efficient for this, but a sorted vector should be (can be binary searched). A balanced tree should also work really well.
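A rough sketch of the range check on a sorted vector could look like this. `Block`, `BlockTable` and their members are hypothetical names; addresses are stored as integers so that comparing pointers into unrelated allocations stays well defined.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// One recorded allocation: where it starts, its size in bytes, its element count.
struct Block {
    std::uintptr_t begin;
    std::size_t bytes;
    std::size_t count;
};

// Ordering used by upper_bound: true if the block starts after addr.
inline bool starts_after(std::uintptr_t addr, const Block& b)
{
    return addr < b.begin;
}

class BlockTable {
public:
    // Insert while keeping the vector sorted by start address.
    void add(const void* p, std::size_t bytes, std::size_t count)
    {
        Block b = { reinterpret_cast<std::uintptr_t>(p), bytes, count };
        blocks_.insert(std::upper_bound(blocks_.begin(), blocks_.end(),
                                        b.begin, starts_after), b);
    }

    // Binary search for the block whose byte range contains p.
    // Returns false if p lies in no recorded block.
    bool find(const void* p, std::size_t& count) const
    {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        std::vector<Block>::const_iterator it =
            std::upper_bound(blocks_.begin(), blocks_.end(), addr, starts_after);
        if (it == blocks_.begin()) return false;  // every block starts after p
        --it;                                     // last block starting at or before p
        if (addr - it->begin >= it->bytes) return false;
        count = it->count;
        return true;
    }

private:
    std::vector<Block> blocks_;
};
```

Because the lookup accepts any address inside a block, a pointer that the compiler has advanced past a few bookkeeping bytes still resolves to the allocation it came from.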


Some time ago, I used a similar thing to monitor memory leaks:

When asked to allocate size bytes of data, I would alloc size + 4 bytes and store the length of the allocation in the first 4 bytes:

static unsigned int total_still_alloced = 0;

/* UINT and ENABLED() are defined elsewhere in the project. */
void *sys_malloc(UINT size)
{
#if ENABLED( MEMLEAK_CHECK )
  /* Over-allocate by one UINT and keep the requested size in it. */
  void *result = malloc(size+sizeof(UINT));
  if(result)
  {
    memset(result,0,size+sizeof(UINT));
    *(UINT *)result = size;
    total_still_alloced += size;
    /* Step over exactly one UINT. Adding sizeof(UINT) to a UINT*
       would skip sizeof(UINT) elements and overrun the block. */
    return (void*)((UINT*)result+1);
  }
  else
  {
    return result;
  }
#else
  void *result = malloc(size);
  if(result) memset(result,0,size);
  return result;
#endif
}

void sys_free(void *p)
{
  if(p != NULL)
  {
#if ENABLED( MEMLEAK_CHECK )
    /* Step back one UINT to reach the address malloc returned. */
    UINT * real_address = (UINT *)(p)-1;
    total_still_alloced -= *real_address;

    free((void*)real_address);
#else
    free(p);
#endif
  }
}

In your case, retrieving the allocated size is a matter of stepping back 4 bytes (one UINT) from the provided address and reading the value stored there.

Note that if you have memory corruption somewhere, you'll get invalid results. Note also that this is often how malloc works internally: it puts the size of the allocation in a hidden field before the address it returns. On some architectures, I don't even have to allocate more; using the system malloc is sufficient.

That's an invasive way of doing it... but it works (provided you allocate everything with these modified allocation routines, AND that you know the starting address of your array).
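A self-contained sketch of the same idea, including the size lookup, might look like this. The names `Header`, `tracked_malloc`, `tracked_size` and `tracked_free` are hypothetical, and unlike the 4-byte prefix above, the header here is padded to the strictest fundamental alignment so that the pointer handed back stays suitably aligned.

```cpp
#include <cstddef>
#include <cstdlib>

// Size prefix, padded so the user data that follows stays maximally aligned.
union Header {
    std::size_t size;
    std::max_align_t align;
};

// Allocate `size` bytes plus a hidden header that remembers `size`.
void* tracked_malloc(std::size_t size)
{
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + size));
    if (!h) return nullptr;
    h->size = size;
    return h + 1;  // user data starts right after the header
}

// Read the size back. Only valid for the exact pointer
// returned by tracked_malloc, not for pointers into the block.
std::size_t tracked_size(const void* p)
{
    return (static_cast<const Header*>(p) - 1)->size;
}

// Release the block, header included.
void tracked_free(void* p)
{
    if (p) std::free(static_cast<Header*>(p) - 1);
}
```

For an array of int obtained through `tracked_malloc(n * sizeof(int))`, the element count is then `tracked_size(p) / sizeof(int)`.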
