开发者

Why does the C++ standard algorithm "count" return a difference_type instead of size_t?

Why is the return type of std::count the difference_type of the iterators (often a ptrdif开发者_StackOverflowf_t).

Since count can never be negative, isn't size_t technically the right choice? And what if the count exceeds the range of ptrdiff_t since the theoretical possible size of an array can be size_t?


EDIT: So far there is no suitable answer as to why the function returns ptrdiff_t. Some explanation gathered from the answers below is that the return type is iterator_traits<InputIterator>::difference_type which is generic and can be anything. Up until that point it makes sense. There are cases where the count may exceed size_t. However, it still does not make sense why the return type is typedef ptrdiff_t iterator_traits<InputIterator>::difference_type for the standard iterators instead of typedef size_t iterator_traits<InputIterator>::difference_type.


The std::count() algorithm relies on the iterator type to define an integral type large enough to represent any size of a range. Possible implementation of containers include files and network streams, etc. There is no guarantee that the entire range fits into the process' address space at once, so std::size_t might be too small.

The only integral type offered by the standard std::iterator_traits<> is std::iterator_traits<>::difference_type, which is suitable for representing "distances" between two iterators. For iterators implemented as (wrappers of) pointers, this type is std::ptrdiff_t. There is no size_type or the like from iterator traits, so there is no other choice.


size_t is not technically the correct choice, since it might not be big enough. Iterators are permitted to iterate over "something" that is larger than any object in memory -- for example a file on disk. When they do so, the iterator can define a type larger than size_t as its difference_type, if one is available.

difference_type needs to be signed because in contexts other than std::count it represents offsets between iterators in both directions. For random access iterators, it + difference is a perfectly sensible operation even when difference is negative.

iterator_traits doesn't offer an unsigned type. Maybe it should, but given that it doesn't iterator_traits<InputIterator>::difference_type is the best type available.

The issue of whether iterators should offer an unsigned type probably relates to a massive conflict of coding styles, whether unsigned types should be used for counts at all. I don't propose to reproduce that argument here, you can look it up. ptrdiff_t does have a weakness that on some systems it cannot represent all valid pointer differences, and hence also cannot represent all expected results of std::count.

As far as I can tell, even in C++03 the standard actually forbade this, maybe by accident. 5.7/6 talks about subtraction possibly overflowing ptrdiff_t, just like C does. But table 32 (allocator requirements) says that X::difference_type can represent the difference between any two pointers, and std::allocator is guaranteed to use ptrdiff_t as its difference_type (20.1.5/4). C++11 is similar. So one part of the standard thinks that pointer subtraction can overflow ptrdiff_t, and another part of the standard says it can't.

std::count presumably was designed under the same (possibly defective) assumption as the allocator requirements, that ptrdiff_t is big enough to express the size of any object and (in general) an iterator's difference_type can express the count of iterands between any two iterators.


The return type is typename iterator_traits<InputIterator>::difference_type which in this particular case happens to be ptrdiff_t.

Presumably difference_type was selected because the maximum number of matching elements in the range would be the iterator difference last - first.


Originally std::count was:

template <class InputIterator, class EqualityComparable, class Size>
void count(InputIterator first, InputIterator last, 
           const EqualityComparable& value,
           Size& n);

In that function Size is a template parameter. It can be whatever you like, and it's your responsibility to make sure it's correct. It could be the longest type on your platform.

My suspicion is that when the newer form:

template <class InputIterator, class EqualityComparable>
iterator_traits<InputIterator>::difference_type
count(InputIterator first, InputIterator last, 
      const EqualityComparable& value);

was added iterator_traits was already in existence, so re-using the existing type had the advantage that it kept the changes to the standard small and localised, compared to adding another typedef in iterator_traits.

Doing it this way, using iterator_traits as opposed to simply using std::size_type means that every possible iterator gets the option to specify exactly what type should be returned by std::count. This includes custom iterators which read from a network, or disk, which can use something much larger than either ptrdiff_t or size_type and friends. (It could be some kind of "BigInt" if needed). It also means that the user isn't responsible for deducing the appropriate type to use though, which can be tricky, precisely because of the custom iterator possibility.


Even though a count can't be negative, the return type is specified as iterator_traits<InputIterator>::difference_type and the difference between two iterators can be negative.


If the iterator was an array, it would imply the result is within the range of the array.

For this specific algorithm I can't think of a reason that is interesting. For someone using this as a component it may be interesting, though.

The page does say that it would do something equivalent. So for the case of an array it may do something like a direct pointer difference. This would be a pretty fast specialization if it were applicable.


difference_type usually denotes the type suitable to denote a distance in an array or similar. The following wording is from the allocator requirements, but whenever the standard talks about difference_type it means the same concept:

a type that can represent the difference between any two pointers in the allocation model

The natural type for this is ptrdiff_t.

For the size_type it says:

a type that can represent the size of the largest object in the allocation model.

The natural type here is size_t.

Now for the count of any elements in a range (or an array) does need at least the type suitable to specify the difference last-first. It seems most natural to chose that one.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜