For input iterators, why a == b does not imply ++a == ++b?
§24.1.1/3 from C++03 Standard reads,
For input iterators, a == b does not imply ++a == ++b. (Equality does not guarantee the substitution property or referential transparency.) Algorithms on input iterators should never attempt to pass through the same iterator twice. They should be single pass algorithms. Value type T is not required to be an Assignable type (23.1). These algorithms can be used with istreams as the source of the input data through开发者_开发百科 the istream_iterator class.
I couldn't understand the bold text in the above quotation. Can anyone help me understanding this?
Also, what does the following statement (italicized text in the above quotation) mean? How is it related to a==b
and ++a==++b
expressions?
Equality does not guarantee the substitution property or referential transparency.
For Input Iterators, incrementing an iterator invalidates copies of the same iterator.
So:
auto a = istream_iterator<whatever>(something);
auto b = a;
a == b; // true
++a; // b is now invalid
++b; // undefined behavior, I think, but in any case not guaranteed to
// result in anything sensible.
So certainly ++a == ++b
is not guaranteed. That is, a == b
does not imply ++a == ++b
.
I think the "substitution property" means "anything you do with the value of a
has the same result as doing the same with the value of b
", or similar - there are various versions of substitution that it might refer to, but something along those lines. I think in this context it must mean "later doing the same with b
", since if a == b
and I haven't done anything invalid yet, then it doesn't matter which of a
and b
I use, they refer to the same point in the stream. But when I increment, I do have to pick one and lose the other, hence the difficulty with ++a == ++b
.
"Referential transparency" means that different objects are independent, i.e. they're not references/pointers to or aliases of each other. In combination with "substitution property" this means:
Later? There is no earlier or later since operations don't have global side-effects. If you can't substitute "later" then you can't substitute
Input iterators in the same sequence typically refer to the same "actual data", like a file handle or whatever, which itself contains mutable state. Since a
and b
refer to the same file handle, and their value is dependent on its state, you don't have referential transparency. This lack is why substitution fails.
Forward iterators typically also refer to the same underlying data (like a container), but as long as you use them read-only (and don't otherwise modify the container), they don't betray this fact, at least not until you start comparing the addresses of the values they return. So they have a limited kind of referential transparency of their own value, that input iterators don't. They're still references themselves, so the things they refer to are still aliased.
As the explanation says: “They should be single pass algorithms.”
The point is that an iterator on an input stream represents a transient state. Once the iterator is changed, that state no longer exists; all other iterators representing that state are invalidated: once you increment a
, the iterator b
becomes invalid.
The properties referred to are:
Substitution Property
For any quantities a and b and any expression F(x), if a = b, then F(a) = F(b) (if either side makes sense, i.e. is well-formed).
Referential Transparency
Informally, it means that there is no difference between a value and a reference to this value (thus how the term was coined).
In imperative programming it is a difficult concept to get, because we are used to modify our variables. Rick Hickey (behind Clojure) gives a nice talk about the distinction between Identity and State that may help you. The gist of it is that a variable is an Identity. At any point in time an Identity refers to a State. A State never changes, however an Identity may be altered to refer to another State.
Input Iterators
The substitution property violation is "obvious" here, if we define F(x)
in the above to mean ++x
, then we have that if input iterators verified the substitution property, the following would hold a == b => ++a == ++b
.
This is not true, however, because incrementing an input iterator may invalidate all other input iterators from the same source. From table 107 in n3290 (page 831, just above the paragraph you quoted):
++r
pre: r is dereferenceable.
post: r is dereferenceable or r is past-the-end.
post: any copies of the previous value of r are no longer required either to be dereferenceable or to be in the domain of ==.
That is, when we perform ++a
, then b
may become invalid, and therefore ++b
itself will be undefined behavior.
This is a direct violation of ++a == ++b
, therefore the substitution property does not hold.
The referential transparency is a bit more evident here. If Input Iterators were referentially transparent, it would mean that they would be indifferentiable from the value they point to. Clearly this is not the case, as applying ++
does not increment the value, but the iterator.
Input iterators define a sequence which can only be read once; input
from a keyboard or a pipe would be a good example. Incrementing an
istream_iterator
effectively means reading further in the istream
,
extracting characters, so other istream_iterator
on the same stream
will no longer be valid with regards to their position. Imagine
a stream of characters "abcdefg..."
(the alphabet, in sum), with two
istream_iterator<char>
pointing to the 'a'
. Incrementing one of
them will cause the 'b'
to be read from the stream, and no other
iterator can ever see it.
Consider that an input iterator could be connected to a stream reading from the keyboard. Incrementing the iterator means reading the next character.
Also incrementing a copy of the iterator does not mean reading the same character.
精彩评论