C/C++: "Grid[x + y * width]" versus "Grid[x][y]"
This is extremely specific and has almost no ramifications, but it has always managed to bother me just because I didn't know which would be generally better. I开发者_开发问答 am hoping you fine folks will help me settle on one:
Something ** grid; grid[x][y];
Something * grid; grid[x + y * width];
I know anyone who has programmed has had to create a 2-dimensional array at some point. Which did you choose and what made you go that route? Or perhaps you used another form altogether?
Usually the second method is preferred, for a number of reasons:
- A single multiplication is marginally faster than an extra array-dereference
- Keeping the array contiguous also marginally increases speed, due to cache-hits
- In many cases, such as loading a bitmap file, or working work DirectX/OpenGL surfaces, it is necessary to keep a 2D surface in one contiguous block of memory.
- It only requires a single array allocation/deallocation
- As a rule, it is easier to deal with pointers than pointers-to-pointers
As has been mentioned by several others, if you know the width at compile-time, declaring the variable as int grid[][width]
will give you all the above advantages, with nicer syntax. Obviously if the width is dynamic, this is not an option.
in first case you must create new pointer for each "x" in second case, you will immediately allocate memory my choice is
Something * grid; grid[x + y * width];
its less possibility of mistakes, e.g. access violation when you forgot to create subarray grid[x] with some x
In C, I would use typedef Something grid_t[][width]; grid[y][x]
(note the order and the explicit array size). I prefer not to use the jagged array Something **grid
.
In C++, I would use grid.get(x, y)
, and inside the implemention would be either of them, so I can easily switch to a better representation when needed (e.g. triangular matrices). Most likely I would start with a std::vector<Something>(height * width)
to store the actual data.
Do you access it linearly? The second would be better cache line use.
But the first one is easier to read.
Use the second form:
- LAPACK / CBLAS and most other external libraries requires the second form.
- Representation is more compact [avoids additional vector of pointers, as in the first case]
- Representation is faster (grid[x][y] requires two pointer dereferences, not one)
- malloc is easier to handle [no need to do two-steps of malloc]
for contiguous access, the single-array version is more efficient. for random access, the single-array version still is more efficient [only requires one pointer dereference]
You could go either way with it, whatever you like (assuming you dont literally mean it is defined as something **grid, see ultor or Sjoerd's answers). The first one is usually easier to type, the second one gives you more flexibility and control. Much of the time they will produce the same instructions, so to some extent it comes down to personal typing and coding habits, readability, maintainability, etc.
In the first form the array doesn't have to be rectangular, that is each "line" can be a different size. I would generally prefer the second unless there is a concrete reason to treat each line seperately. It is more memory efficient, cacheable, faster.
A concrete example of the advantage is if you need to make a copy of the array. It is a single memcpy() using the second form but a lot more complicated with the first.
For C++ there is also Boost.MultiArray
精彩评论