Reading IA32 Assembly Code pertaining to using double arrays
This is the exact question:
The following Code 开发者_StackOverflowtransposes the elements of an M x M array, where M is a constant defined by #define:
void transpose(Marray_t A) {
int i, j;
for (i = 0; i < M; i++)
for (j = 0; j < i; j++) {
int t = A[i][j];
A[i][j] = A[j][i];
A[j][i] = t;
}
}
When compiled with optimization level -02, GCC generates the following code for the inner loop of the function:
1. .L3
2. movl (%ebx),%eax
3. movl (%esi,%ecx,4),%edx
4. movl %eax, (%esi,%ecx,4)
5. addl $1, %ecx
6. movl %edx, (%ebx)
7. addl $52,%ebx
8. cmpl %edi,%ecx
9. jl .L3
A. What is the value of M?
B. What registers hold program values i and j?
C. Write a C code version of transpose that makes use of the optimizations that occur in this loop. Use the parameter M in your code rather than numeric constant.
So in my attempts to understand this, I notice that it is multiplying by 4, which means it stores types of 4 bytes (maybe an int or a pointer). Then, it increments i by $52 (I assume i) since it's a larger value, thus going to the next array) and $52 / 4 is 13. So I would guess that M = 13. Wrong?
For B, I would guess that %ebx contains i and %ecx contains i.
For C, I'm not exactly sure because I don't completely understand the loop presented. Let me try to understand by line number and tell me where I'm wrong. 1. is the beginning of the loop label obviously. 2. moves presumably the value of i into %eax. Then 3. has A[i][j] being stored into t. But... I really don't understand it. :/ Help??
I fixed some mistakes in the assembly code (%ecs
-> %ecx
and (ebx) -> (%ebx)
) I hope I didn't inadvertently introduce new errors.
On to the questions, where you are almost there with the understanding. Let's take the code line by line and attempt to translate it to C.
L3:
2. movl (%ebx),%eax
// We load a 32-bit value from the address stored in ebx. We can't yet deduce the type, but let's assume int for now
int *ebx = ??;
int eax = *ebx;
3. movl (%esi,%ecx,4),%edx
// As you noted we're dealing with 32-bit values, so when converting to C we divide the index by 4
int *esi = ??;
int ecx = ??;
int edx = esi[ecx];
4. movl %eax, (%esi,%ecx,4)
esi[ecx] = eax;
5. addl $1, %ecx
ecx++;
6. movl %edx, (%ebx)
*ebx = edx;
7. addl $52,%ebx
// Again we have to remember to divide by the size of the type used (4)
ebx += 13;
8. cmpl %edi,%ecx
9. jl .L3
int edi = ??;
if (ecx < edi) goto L3;
From this we see that we have some unknown values that are initialized outside the inner loop, but we can also give good guesses as to what they are.
ecx
is incremented each loop iteration and then used to decide if we should continue the loop: it's obviouslyj
from the C code.edi
is the value we comparej
to when deciding whether to loop, but it isn't changed in the inner loop: it'si
.esi
is indexed with row-wise withecx
(j
), so it corresponds to&a[i][j]
.ebx
is incremented by 52 (13 index positions) each loop iteration - as you might have guessed 13 is M - it corresponds to&m[j][i]
and is moved to point at thej
-th row element of the next column each iteration.
Now we can answer the questions:
A. What is the value of M?
13
B. What registers hold program values i and j?
edi
and ecx
respectively.
C. Write a C code version of transpose that makes use of the optimizations that occur in this loop. Use the parameter M in your code rather than numeric constant.
This should be straight-forward at this point.
精彩评论