Pointer addition vs subtraction
$5.7 -
"[..]For addition, either both operands shall have arithmetic or enumeration type, or one operand shall be a pointer to a completely defined object type and the other shall have integral or enumeration type.
2 For subtraction, one of the following shall hold: — both operands have arithmetic or enumeration type; or — both operands开发者_如何学运维 are pointers to cv-qualified or cv-unqualified versions of the same completely defined object type; or — the left operand is a pointer to a completely defined object type and the right operand has integral or enumeration type.
int main(){
int buf[10];
int *p1 = &buf[0];
int *p2 = 0;
p1 + p2; // Error
p1 - p2; // OK
}
So, my question is why 'pointer addition' is not supported in C++ but 'pointer subtraction' is?
The difference between two pointers means the number of elements of the type that would fit between the targets of the two pointers. The sum of two pointers means...er...nothing, which is why it isn't supported.
The result of subtraction is distance (useful).
The result of adding a pointer and a distance is another meaningful pointer.
The result of adding 2 pointers is another pointer, this time meaningless though.
It's the same reason there are distinct TimeSpan and DateTime objects in most libraries.
The first thing that comes to mind is that it doesn't make sense to do pointer addition, so it's not supported. If you have 2 pointers 0x45ff23dd, 0x45ff23ed
. What does it mean to add them?? Some memory out-of-bounds. And people in standard comittee have not found good enough reasons to support stuff like that, and rather warn you at compile time about possible problem. While pointer subtraction is fine because it indicates memory distance, which is often useful.
Because adding two pointers doesn't make sense.
Consider I have two int
s in memory at 0x1234
and 0x1240
. The difference between these addresses is 0xc
and is a distance in memory. The sum is 0x2474
and doesn't correspond to anything meaningful.
You can however, add a pointer to an integer to get another pointer. This is what you do when you index into an array: p[4] means *(p + 4) which means "the thing stored at the address 4 units past this address."
In general, you can determine the "pointerness" of an arithmetic operation by assigning each pointer a value 1 and each integer a value zero. If the result is 1, you've got a pointer; if it's 0, you've got an integer; if it's any other value, you have something that doesn't make sense. Examples:
/* here p,q,r are pointers, i,j,k are integers */
p + i; /* 1 + 0 == 1 => p+i is a pointer */
p - q; /* 1 - 1 == 0 => p-q is an integer */
p + (q-r); /* 1 + (1-1) == 1 => pointer */
Result of pointer subtraction is number of objects between two memory addresses. Pointer addition doesn't mean anything, this is why it is not allowed.
subtraction of pointers is only defined if they point into the same array of objects. The resulting subtraction is scaled by the size of object they point to. ie, pointer subtraction gives the number of elements between the two pointers.
N.B. No claims on C standards here.
As a quick addendum to @Brian Hooper's answer, "[t]he sum of two pointers means...er...nothing" however the sum of a pointer and an integer allows you to offset from the initial pointer.
Subtracting a higher value pointer from a lower value pointer gives you the offset between the two. Note that I am not accounting for memory paging here; I'm assuming the memory values are both within the accessible scope of the process.
So if you have a pointer to a series of consecutive memory locations on the heap, or an array of memory locations on the stack (whose variable name decays to a pointer), these pointers (the real pointer and the one that decays to a pointer) will point to the fist memory location question (i.e. element [0]
). Adding an integer value to the pointer is equivalent to incrementing the index in brackets by the same number.
#include <stdio.h>
#include <stdlib.h>
int main()
{
// This first declaration does several things (this is conceptual and not an exact list of steps the computer takes):
// 1) allots space on the stack for a variable of type pointer
// 2) allocates number of bytes on the heap necessary to fit number of chars in initialisation string
// plus the NULL termination '\0' (i.e. sizeof(char) * <characters in string> + 1 for '\0')
// 3) changes the value of the variable from step 1 to the memory address of the beginning of the memory
// allocated in step 2
// The variable iPointToAMemoryLocationOnTheHeap points to the first address location of the memory that was allocated.
char *iPointToAMemoryLocationOnTheHeap = "ABCDE";
// This second declaration does the following:
// 1) allots space on the stack for a variable that is not a pointer but is said to decay to a pointer allowing
// us to so the following iJustPointToMemoryLocationsYouTellMeToPointTo = iPointToAMemoryLocationOnTheHeap;
// 2) allots number of bytes on the stack necessary to fit number of chars in initialisation string
// plus the NULL termination '\0' (i.e. sizeof(char) * <characters in string> + 1 for '\0')
// The variable iPointToACharOnTheHeap just points to first address location.
// It just so happens that others follow which is why null termination is important in a series of chars you treat
char iAmASeriesOfConsecutiveCharsOnTheStack[] = "ABCDE";
// In both these cases it just so happens that other chars follow which is why null termination is important in a series
// of chars you treat as though they are a string (which they are not).
char *iJustPointToMemoryLocationsYouTellMeToPointTo = NULL;
iJustPointToMemoryLocationsYouTellMeToPointTo = iPointToAMemoryLocationOnTheHeap;
// If you increment iPointToAMemoryLocationOnTheHeap, you'll lose track of where you started
for( ; *(++iJustPointToMemoryLocationsYouTellMeToPointTo) != '\0' ; ) {
printf("Offset of: %ld\n", iJustPointToMemoryLocationsYouTellMeToPointTo - iPointToAMemoryLocationOnTheHeap);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo);
}
printf("\n");
iJustPointToMemoryLocationsYouTellMeToPointTo = iPointToAMemoryLocationOnTheHeap;
for(int i = 0 ; *(iJustPointToMemoryLocationsYouTellMeToPointTo + i) != '\0' ; i++) {
printf("Offset of: %ld\n", (iJustPointToMemoryLocationsYouTellMeToPointTo + i) - iPointToAMemoryLocationOnTheHeap);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo + i);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo + i);
}
printf("\n");
iJustPointToMemoryLocationsYouTellMeToPointTo = iAmASeriesOfConsecutiveCharsOnTheStack;
// If you increment iAmASeriesOfConsecutiveCharsOnTheStack, you'll lose track of where you started
for( ; *(++iJustPointToMemoryLocationsYouTellMeToPointTo) != '\0' ; ) {
printf("Offset of: %ld\n", iJustPointToMemoryLocationsYouTellMeToPointTo - iAmASeriesOfConsecutiveCharsOnTheStack);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo);
}
printf("\n");
iJustPointToMemoryLocationsYouTellMeToPointTo = iAmASeriesOfConsecutiveCharsOnTheStack;
for(int i = 0 ; *(iJustPointToMemoryLocationsYouTellMeToPointTo + i) != '\0' ; i++) {
printf("Offset of: %ld\n", (iJustPointToMemoryLocationsYouTellMeToPointTo + i) - iAmASeriesOfConsecutiveCharsOnTheStack);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo + i);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo + i);
}
return 1;
}
The first notable thing we do in this program is copy the value of the pointer iPointToAMemoryLocationOnTheHeap
to iJustPointToMemoryLocationsYouTellMeToPointTo
. So now both of these point to the same memory location on the heap. We do this so we don't lose track of the beginning of it.
In the first for
loop we increment the value we just copied into iJustPointToMemoryLocationsYouTellMeToPointTo
(increasing it by 1 means it points to one memory location further away from iPointToAMemoryLocationOnTheHeap
).
The second loop is similar but I wanted to more clearly show how incrementing the value is related to the offset, and how the arithmetic works.
The third and fourth loops repeat the process but work on an array on the stack as opposed to allocated memory on the heap.
Note the asterisk *
when printing the individual char
. This tells printf to output whatever is pointed to by the variable, and not the contents of the variable itself. This is in contrast to the line above where the balance of the string is printed and there is no asterisk before the variable because printf() is looking at the series of memory locations in its entirety until NULL is reached.
Here is the output on ubuntu 15.10 running on an i7 (the first and third loops output starting at an offset of 1 because my loop choice of for
increments at the beginning of the loop as opposed to a do{}while()
; I just wanted to keep it simple):
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Offset of: 0
ABCDE
A
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Offset of: 0
ABCDE
A
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Because the result of that operation is undefined. Where does p1 + p2 point to? How can you make sure it points to a properly initialized memory so that it could be dereferenced later? p1 - p2 gives the offset between those 2 pointers and that result could be used further on.
精彩评论