
Why can we delete arrays, but not know the length in C/C++?

How is it possible that we can delete dynamically allocated arrays, but we can't find out how many elements they have? Can't we just divide the size of the memory location by the size of each object?


In C++, both...

  • the size (bytes) requested by a new, new[] or malloc call, and
  • the number of array elements requested in a new[] dynamic allocation

...are implementation details that the Standard doesn't require be made available programmatically, even though the memory allocation library must remember the former and the compiler the latter (at least for types with non-trivial destructors) so it can invoke the destructor on the correct number of elements.

Sometimes the compiler may see that there's a constant-sized allocation and be able to associate it reliably with the corresponding deallocation, so it could generate code customised for these compile-time-known values (e.g. inlining and loop unrolling). In complex usage (and when handling external inputs), however, a compiler may need to store and retrieve the number of elements at run-time: enough space for the element counter might be put, for example, immediately before or after the address returned for the array content, with delete[] knowing about this convention. In practice, a compiler may choose to always handle this at run-time just for the simplicity that comes with consistency. Other run-time possibilities exist: e.g. the number of elements might be derivable from some insight into the specific memory pool from which the allocation was satisfied, combined with the object size.

The Standard doesn't provide programmatic access to ensure implementations are unfettered in the optimisations (in speed and/or space) they may use.

(The size of the memory location may be greater than the exact size required for the requested number of elements - that size is remembered by the memory allocation library, which may be a black-box library independent of the C++ compiler).
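
The run-time convention mentioned above, with the element count stored just before the address handed back to the caller, might look roughly like the following. This is only a sketch: array_new, array_delete, kHeader, Tracked and cookie_demo are hypothetical names made up for illustration, it builds on malloc rather than on any real compiler's runtime, and it ignores exception safety and over-aligned types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type that counts how many objects are alive.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Assumed header size: large enough for the count, and a multiple of any
// fundamental alignment so the elements after it stay aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(sizeof(std::size_t) <= kHeader, "header must fit the count");

// Mimics new T[count]: allocate header + elements, record the count,
// construct each element, and return the address of the first element.
template <typename T>
T* array_new(std::size_t count) {
    static_assert(alignof(T) <= kHeader, "over-aligned types not handled");
    void* raw = std::malloc(kHeader + count * sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();
    new (raw) std::size_t(count);  // the "cookie"
    T* first = reinterpret_cast<T*>(static_cast<char*>(raw) + kHeader);
    for (std::size_t i = 0; i < count; ++i) new (first + i) T();
    return first;  // the caller never sees the header
}

// Mimics delete[] p: step back to the header, read the count, destroy
// that many elements in reverse order, then free the whole block.
template <typename T>
void array_delete(T* first) {
    if (first == nullptr) return;
    char* raw = reinterpret_cast<char*>(first) - kHeader;
    const std::size_t count = *reinterpret_cast<std::size_t*>(raw);
    for (std::size_t i = count; i > 0; --i) first[i - 1].~T();
    std::free(raw);
}

// Usage check: array_delete destroys all five objects without being told
// how many there are.
inline void cookie_demo() {
    Tracked* p = array_new<Tracked>(5);
    assert(Tracked::live == 5);
    array_delete(p);
    assert(Tracked::live == 0);
}
```

The count is only reachable by code that knows the header layout, which is exactly why the caller has no portable way to read it back.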


The memory allocator remembers the size of the allocation, but doesn't give it to the user. This is true in C with malloc and in C++ with new.

"The size of the memory location" cannot be obtained. If you do

int *a = new int[N];
std::cout << sizeof(a);

you'll find that it prints sizeof(int *), which is constant (for a given platform).
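
A slightly fuller version of the same point, as a sketch (pointer_size and array_length are names invented here): sizeof on the pointer never depends on how many elements were allocated, whereas a genuine array type does carry its length in the type.

```cpp
#include <cassert>
#include <cstddef>

// sizeof applied to the pointer returned by new[] is the size of the
// pointer itself, whatever n was.
inline std::size_t pointer_size(std::size_t n) {
    int* a = new int[n];
    const std::size_t s = sizeof(a);  // same as sizeof(int*)
    delete[] a;
    return s;
}

// For a real array (not a pointer), the length is part of the static type
// and can be deduced at compile time.
template <typename T, std::size_t N>
constexpr std::size_t array_length(T (&)[N]) {
    return N;
}
```

So the "divide by the element size" trick works for `int fixed[7]`, but not for anything that came from new[].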


The common way in C++ is to use std::vector instead of array.

std::vector has the method size which returns the number of elements.

You should prefer std::vector over a raw array wherever possible.
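
A minimal sketch of what that buys you (count_after_growth is just an illustrative name): the vector tracks its own element count, even as it grows, and frees its storage automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Creates a vector of n ints, appends one more, and reports the count.
inline std::size_t count_after_growth(std::size_t n) {
    std::vector<int> v(n);  // n value-initialised ints
    v.push_back(42);        // the vector reallocates if it needs to
    return v.size();        // n + 1, no manual bookkeeping required
}                           // storage is released here
```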


The reason is that C and C++ do not expose this information, although it might be available to the specific implementation. (Indeed, for array new[] in C++ the size has to be tracked in order to call the destructor of each object, but how this is done is up to the specific compiler.)

The reason for this non-disclosure is so that compiler-writers and platform implementers have more freedom in how they implement variable-size memory allocations. It is also not necessary to know this information in general, so it would not make sense to require each C platform to make this info available.

Also, one practical reason (for malloc et al.) is that they do not give you what you asked for: If you ask malloc for 30 bytes of memory, it will most likely give you 32 bytes (or some other larger allocation granularity). So the only information available internally is the 32 bytes, and you as programmer don't have much use for this information.
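
To see why dividing the block size by the element size can give the wrong count, here is a toy model. The rounding policy is an assumption made up for illustration (real allocators differ), and rounded_block_size and guessed_count are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

// Assumed policy: every request is rounded up to a multiple of `granularity`.
constexpr std::size_t rounded_block_size(std::size_t requested,
                                         std::size_t granularity) {
    return (requested + granularity - 1) / granularity * granularity;
}

// The element count you would infer from the block size alone.
constexpr std::size_t guessed_count(std::size_t count, std::size_t elem_size,
                                    std::size_t granularity) {
    return rounded_block_size(count * elem_size, granularity) / elem_size;
}
```

Under this model a 30-byte request with 16-byte granularity occupies 32 bytes, and three 4-byte elements (12 bytes, rounded to 16) would be miscounted as four.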


Two things work against it

  1. first, arrays and pointers are used interchangeably here: new[] hands back a plain pointer, which has no additional knowledge of the array's length. (All smart-arse commentators tempted to comment on the fundamental differences between arrays and pointers should note that none of that makes any difference to this point ;)

  2. secondly, because knowing the size of the allocation is the business of the heap, and the heap does not expose any standard way of discovering the size of the allocation.

Symbian, however, does have an AllocSize() function from which you can derive how many elements are in the array. However, sometimes allocations are larger than asked for, because it manages memory in word-aligned chunks.


you can easily make a class to keep track of the allocation count.

the reason we don't know the length is because it has always been an implementation detail (afaik). the compiler knows the elements' alignment, and the abi will also affect how it is implemented.

for example, the Itanium C++ ABI stores the cookie (element count) at the front of the allocation, just before the first element (specifically, for types with a non-trivial destructor), with padding if necessary to keep the objects at their natural alignment. new[] then returns the address of the first usable element, rather than the address of the actual allocation. so there is a bunch of non-portable bookkeeping involved.

a wrapper class is the easy way to manage this.

it's actually an interesting exercise to write allocators, override operator new/delete and the placement forms, and look at how this all fits together (although it's not a particularly trivial exercise if you want the allocator to be used in production code).

in short, we don't know the size of the memory allocation, and it is more effort to figure out the allocation size (among other necessary variables) consistently across multiple platforms than it is to use a custom template class which holds a pointer and a size_t.

furthermore, there is no guarantee that the allocator allocated exactly the number of bytes requested (so your counts could be wrong, if you determine count based on allocation size). if you go through malloc interfaces, you should be able to locate your allocation... but that's still not very useful, portable, or safe for any non-trivial case.

Update:

@Default there are many reasons to create your own interface. as Tony mentioned, std::vector is one well known implementation. the basis for such a wrapper is simple (interface borrowed from std::vector):

#include <cassert> // assert
#include <cstddef> // size_t

/* holds an array of @a TValue objects which are created at construction and destroyed at destruction. interface borrows bits from std::vector */
template<typename TValue>
class t_array {
    t_array(const t_array&); // prohibited: declared private, never defined
    t_array& operator=(const t_array&); // prohibited
    typedef t_array<TValue> This;
public:
    typedef TValue value_type;
    typedef value_type* pointer;
    typedef const value_type* const_pointer;
    typedef value_type* const pointer_const;
    typedef const value_type* const const_pointer_const;
    typedef value_type& reference;
    typedef const value_type& const_reference;

    /** creates @a count objects, using the default ctor */
    explicit t_array(const size_t& count) : d_objects(new value_type[count]), d_count(count) {
        assert(this->d_objects);
        assert(this->d_count);
    }

    /** this owns @a objects */
    t_array(pointer_const objects, const size_t& count) : d_objects(objects), d_count(count) {
        assert(this->d_objects);
        assert(this->d_count);
    }

    ~t_array() {
        delete[] this->d_objects;
    }

    const size_t& size() const {
        return this->d_count;
    }

    bool empty() const {
        return 0 == this->size();
    }

    /* element access */
    reference at(const size_t& idx) {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    const_reference at(const size_t& idx) const {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    reference operator[](const size_t& idx) {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    const_reference operator[](const size_t& idx) const {
        assert(idx < this->size());
        return this->d_objects[idx];
    }

    pointer data() {
        return this->d_objects;
    }

    const_pointer data() const {
        return this->d_objects;
    }

private:
    pointer_const d_objects;
    const size_t d_count;
};

as useful as std::vector is, there are some cases where it can be useful to create your own bases:

  • to make an object with a smaller interface. minimalism is good.
  • to make an object which requires no allocator parameter. for example: t_array will result in fewer exported symbols, as well as shorter names for those symbols (by removing the allocator argument).
  • to make variants which handle additional const cases. in the example above, there is often little reason to change what the container points to, so t_array uses 2 const members, each ensuring less variation than std::vector. a good optimizer should make use of those details. it also prevents users from making accidental mistakes.
  • to reduce build times. if your needs are as simple as t_array, or even simpler, then you can reduce your build times by using a minimal interface.

other cases:

  • to make an object with a larger interface, or more features
  • to make an object with additional debugging facilities
  • to make an object which may be subclassed (most implementations of std::vector are not intended to be subclassed)
  • to make an object which is thread safe


It's all perfectly in keeping with the "keep it simple" philosophy of C: at some point you MUST have decided what size the array/buffer/whatever needed to be, so keep that value and that's it. Why waste a function call retrieving information you already have?
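
In code, "keep that value" usually just means passing the count alongside the pointer. A small sketch, with sum and sum_first_n as made-up names:

```cpp
#include <cassert>
#include <cstddef>

// The callee cannot recover the length from the pointer, so the caller
// passes the count it already knows.
inline long sum(const int* data, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// The caller chose n when allocating, so it simply keeps n around.
inline long sum_first_n(std::size_t n) {
    int* buf = new int[n];
    for (std::size_t i = 0; i < n; ++i) buf[i] = static_cast<int>(i) + 1;
    const long result = sum(buf, n);
    delete[] buf;
    return result;
}
```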
