Are multidimensional arrays (like in C/C++) special cases of ragged arrays? [closed]
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 11 years ago.
Improve this question 开发者_如何学编程I had a discussion with a buddy about whether C++ and C multi-dimensional arrays are special cases of ragged arrays. One point of view was
A multi-dimensional array is not a ragged array, because each element of the multi-dimensional array has the same size. In a ragged array, at least one element has a different size than another element of the same array. ("If it doesn't have the possibility to be ragged, it's not a ragged array.").
The other point of view was
A multi-dimensional array is a special case of a ragged array, where each element has the same size. A ragged array may have rows of different sizes, but doesn't have to. ("A circle is an ellipsis.").
I'm interested in getting a definite answer as to what the common definition of a "ragged array" is in computer science and whether C and C++ multidimensional arrays are ragged arrays or not.
I don't know what the "exact definition" of a ragged array should be but I believe C/C++ multidimensional arrays are definitely not ragged. The reasons for this are the following:
- A ragged array is a term referring the a certain way of "storage in memory" of the arrays such that there is/are at least one pair of rows/cells with different sizes.
- Arrays in C/C++ are pretty straight-forward. The arrays are just a "contiguous block" of memory that is reserved for the structure (array).
- Other high-level languages might have different implementations to save memory etc. but C/C++ arrays don't.
So I believe we cannot call C/C++ arrays ragged.
(Opinion).
EDIT:
And this also heavily depends on the "definition" of ragged. So this is not a well-defined term and so it will be difficult to reach a conclusion. (Should avoid unproductive debates).
A C multidimensional array, if declared as a multidimensional array, cannot be ragged. A 2D array, for example, is an "array of arrays" and each row is the same length, even if you don't use every entry in the array.
int a1[2][3]; // two rows, three columns
int a2[5][8]; // five rows, eight columns
The thing about C, though, is that you can use a pointer-to-a-pointer as if it were a 2D array:
int **a3 = malloc(4);
for (i = 0; i < 4; i++)
a3[i] = malloc(i);
Now a3
can be used in a lot of cases like a 2D array, but is definitely ragged.
IMHO, real arrays cannot be called ragged, but you can certainly fake it out if you have to... the terminology you use doesn't really seem that important from that standpoint.
I'd say the difference is a conceptual one. A multidimensional array
T x[d_1][d_2]...[d_N];
denotes a contiguous area of memory of size $\prod_i d_i$, if you pardon the TeX, and it's addressed in strides: x[i_1]...[i_N] is the element at position $i_N + i_{N-1} d_N + i_{n-2} d_{N-1} d_N + ... + i_1 d_2 ... d_N$. Intermediate indexes can be taken as pointers to the respective subarrays.
A ragged array on the other hand decouples the inner "dimension" from the outer one in memory:
T * r[M];
for (size_t i = 0; i != M; ++i)
r[M] = new T[get_size_at_row(i)];
Whether the sizes actually vary or not is immaterial here, but the conceptual difference is that a ragged array is an array of arrays, whereas a multidimensional array is a far more rigid and coherent object.
When discussing mathematical objects, I think that "ragged" is probably used as a modifier to "array" specifically to mean one that has mismatched secondary dimensions. So that's the first meaning rather than the second. Consider where the word is taken from - we don't say that a brand new handkerchief "is ragged, because it has the potential to fray around the edges, but it hasn't frayed yet". It's not ragged at all. So if we were to call a specific array "ragged", I would expect that to mean "not straight".
However, there will be some contexts in which it's worth defining "ragged array" to mean a "potentially-ragged array" rather than one that actually does have mismatches. For example, if you were going to write a "RaggedArray" class, you would not design in a class invariant that there is guaranteed to be a mismatched size somewhere, and be sure to throw an exception if someone tries to create one with all sizes equal. That would be absurd, despite that fact that you're going to call instances of this class "ragged arrays". So in that context, an array with equal sizes in all elements is a special case of a "ragged array". That's the second meaning rather than the first.
Of course, a C or C++ multi-dimensional array still would not be an instance of this class, but it might satisfy at least the read-only part of some generic interface referred to as "RaggedArray". It's basically a shortcut, that even though we know "ragged" means "having a size mismatch", for most purposes you simply can't be bothered to call that class or generic interface "PotentiallyRaggedArray" just to make clear that you won't enforce the constraint that there must be one.
There's a difference between whether a particular instance of a type has a specific property, and whether the type allows instances of it to have that property, and we frequently ignore that difference when we say that an instance of type X "is an X". Instances of type X potentially have the property, this instance doesn't have it, so this instance in fact does not potentially have the property either. Your two meanings of "ragged array" can be seen as an example of that difference. See the E-Prime crowd, and also the philosophy of Wittgenstein, for the kinds of confusion we create when we say that one thing "is" another, different thing. An instance "is not" a type, and a concrete example does not have the same potential properties as whatever it's an example of.
To specifically answer your question, I doubt that there is a universally-accepted preference for one meaning over the other in the CS literature. It's one of those terms that you just have to define for your own purposes when you introduce it to a given work (an academic paper, the documentation of a particular library, etc). If I could find two papers, one using each, then I'd have proved it, but I can't be bothered with that ;-)
My position would be that ragged array is distinguishable from a multi-dimensional array because it has (must have!) a index that tells you where each of the sub-arrays start. (A ragged array also needs some mechanism for keeping track of the size of each sub-array and while knowing that the sub-arrays are of uniform size will do it is not very general)
You could in principle build a index that connects to the sub-arrays of a standard multi-dimensional array
int arr[6][10]; // <=== Multi-dimensional array
int **ragged = calloc(6,sizeof(int*)); // <=== Ragged array (initially empty)
for (int i=0; i<6 ++i) {
ragged[i] = arr[i]; // <=== make the ragged array alias arr
}
Now I have an two-dimensional array and a two-dimensional ragged array using the same data.
So no, a language multi-dimensional array is not a special case of a ragged array.
精彩评论