C++ Performance of structs used as a safe packaging of arrays
In C and C++, array accesses are not bounds-checked. One way to work around this is to package the array in a struct:
struct array_of_foo{
int length;
foo *arr; //array with variable length.
};
Then, it can be initialized:
array_of_foo *ar(int length){
array_of_foo *out = (array_of_foo*) malloc(sizeof(array_of_foo));
out->arr = (foo*) malloc(length*sizeof(foo));
}
And then accessed:
foo I(array_of_foo *ar, int ix){ //may need to be foo* I(...
if(ix>ar->length-1){printf("out of range!\n");} //error
return ar->arr[ix];
}
And finally freed:
void freeFoo(array_of_foo *ar){ //is it necessary to free both ar->arr and ar?
free(ar->arr); free(ar);
}
This way it can warn programmers about out-of-bounds accesses. But will this packaging slow down performance substantially?
I agree with the std::vector recommendation. Additionally, you might try boost::array, which provides a complete (and tested) implementation of a fixed-size array container:
http://svn.boost.org/svn/boost/trunk/boost/array.hpp
In C++, there's no need to come up with your own incomplete version of vector. (To get bounds checking on vector, use .at() instead of []. It'll throw an exception if you go out of bounds.)
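As a minimal sketch of the .at() behaviour described above: the helper name at_throws is hypothetical and exists only for illustration. std::vector::at throws std::out_of_range when the index is not less than size().

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether a checked read at index ix throws.
bool at_throws(const std::vector<int>& v, std::size_t ix) {
    try {
        (void)v.at(ix);   // bounds-checked access
        return false;     // index was inside [0, v.size())
    } catch (const std::out_of_range&) {
        return true;      // index was outside the valid range
    }
}
```

With a three-element vector, at_throws(v, 2) would be false and at_throws(v, 3) would be true, whereas v[3] would silently be undefined behaviour.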
In C, this isn't necessarily a bad idea, but I'd drop the pointer in your initialization function, and just return the struct. It's got an int and a pointer, and won't be very big, typically no more than twice the size of a pointer. You probably don't want to have random printfs in your access functions anyway, as if you do go out of bounds you'll get random messages that won't be very helpful even if you look for them.
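A rough sketch of that suggestion, written so it also compiles as C++: the initializer returns the struct by value, and the accessor reports failure through its return value instead of printing. The names make_array, get and release, and the placeholder definition of foo, are assumptions made for illustration and are not from the question.

```cpp
#include <cstddef>
#include <cstdlib>

struct foo { int value; };   // placeholder element type (assumption)

struct array_of_foo {
    int length;
    foo *arr;
};

// Returns the wrapper by value; only the element buffer lives on the heap.
array_of_foo make_array(int length) {
    int n = (length > 0) ? length : 0;
    array_of_foo out;
    out.arr = static_cast<foo*>(std::calloc(static_cast<std::size_t>(n), sizeof(foo)));
    out.length = (out.arr != nullptr) ? n : 0;   // length 0 if allocation failed
    return out;
}

// Copies element ix into *result; returns false if ix is out of range.
bool get(const array_of_foo *ar, int ix, foo *result) {
    if (ix < 0 || ix >= ar->length) return false;
    *result = ar->arr[ix];
    return true;
}

// Frees the element buffer and leaves the wrapper in an empty state.
void release(array_of_foo *ar) {
    std::free(ar->arr);
    ar->arr = nullptr;
    ar->length = 0;
}
```

The caller decides what to do when get returns false, which tends to be more useful than a message printed from deep inside an accessor.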
Most likely the major performance hit will come from checking the index for every access, thus breaking pipelining in the processor, rather than the extra indirection. It seems to me unlikely that an optimizer would find a way to optimize away the check when it's definitely not necessary.
For example, this will be very noticeable in long loops that traverse the entire array, which is a relatively common pattern.
And just for the sake of it:
- You should initialize the length field too in ar()
- You should check for ix < 0 in I()
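The two points above could be applied to the question's functions roughly as follows. This is only a sketch: the definition of foo is a placeholder, I() is written to return a pointer (as the question's own comment hints), and returning a null pointer on failure is one possible convention rather than the only one.

```cpp
#include <cstddef>
#include <cstdlib>

struct foo { int value; };   // placeholder element type (assumption)

struct array_of_foo {
    int length;
    foo *arr;
};

array_of_foo *ar(int length) {
    if (length < 0) return nullptr;
    array_of_foo *out = static_cast<array_of_foo*>(std::malloc(sizeof(array_of_foo)));
    if (out == nullptr) return nullptr;
    out->arr = static_cast<foo*>(std::malloc(static_cast<std::size_t>(length) * sizeof(foo)));
    if (out->arr == nullptr && length > 0) {   // element allocation failed
        std::free(out);
        return nullptr;
    }
    out->length = length;                      // the field the original left unset
    return out;
}

// Returns a pointer to element ix, or a null pointer if ix is out of range.
foo *I(array_of_foo *a, int ix) {
    if (ix < 0 || ix >= a->length) return nullptr;   // checks both ends
    return &a->arr[ix];
}

void freeFoo(array_of_foo *a) {
    if (a == nullptr) return;
    std::free(a->arr);   // free the elements first, then the wrapper
    std::free(a);
}
```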
I don't have any formal studies to cite, but what I've heard from users of languages where array bounds checking is optional is that turning it off rarely speeds up a program perceptibly.
If you have C code that you'd like to make safer, you may be interested in Cyclone.
You can test it yourself, but on certain machines you may have serious performance issues under different scenarios. If you are looping over millions of elements, then checking the bounds every time will lead to numerous cache misses. How much of an impact that will have depends on what your code is doing. Again, you could test this pretty quickly.
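One quick way to run such a test is sketched below, using std::vector so that [] stands in for unchecked access and .at() for checked access. The function name time_access and the timing_result struct are hypothetical; the numbers will depend heavily on the compiler, optimization flags and machine, so treat this as a starting point rather than a rigorous benchmark.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

struct timing_result {
    long long unchecked_sum;   // sum computed with operator[]
    long long checked_sum;     // sum computed with at()
    double unchecked_ms;       // elapsed time for the unchecked loop
    double checked_ms;         // elapsed time for the checked loop
};

// Sums n elements twice, once unchecked and once bounds-checked, timing each loop.
timing_result time_access(std::size_t n) {
    std::vector<int> v(n, 1);
    using clock = std::chrono::steady_clock;
    timing_result r{};

    auto t0 = clock::now();
    for (std::size_t i = 0; i < v.size(); ++i) r.unchecked_sum += v[i];
    auto t1 = clock::now();
    for (std::size_t i = 0; i < v.size(); ++i) r.checked_sum += v.at(i);
    auto t2 = clock::now();

    r.unchecked_ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    r.checked_ms = std::chrono::duration<double, std::milli>(t2 - t1).count();
    return r;
}
```

Calling time_access with a few million elements and printing both times, with optimizations enabled and over several runs, should give a first impression of whether the check matters for your workload.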