Need help disassembling [closed]
I need your help. Here is the source code of my program. I need to understand what manipulations are being done by the score1, score2, score3 and score4 functions.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <pwd.h>
#include <unistd.h>

#include "score.h"

int main(int argc, char *argv[])
{
    int i, j, k, l, s;
    struct passwd *pw;
    char cmd[1024];

    /* Make sure that we have exactly 5 arguments: the name of the executable, and 4 numbers */
    if (argc != 5) {
        printf("Usage: %s i j k l\n where i,j,k,l are integers.\n Try to get as high a score as you can.\n", argv[0]);
        exit(8);
    }

    initialize();

    /* Convert the inputs to ints */
    i = atoi(argv[1]);
    j = atoi(argv[2]);
    k = atoi(argv[3]);
    l = atoi(argv[4]);

    printf("You entered the integers %d, %d, %d, and %d.\n", i, j, k, l);
    s = score1(i) + score2(j) + score3(k) + score4(l);

    printf("Your score is %d.\n", s);
    if (s > 0) {
        pw = getpwuid(getuid());

        printf("Thank you!\n");
        system(cmd);
I have started disassembling the code as follows:
(gdb) disas score1
Dump of assembler code for function score1:
0x080488b0 <score1+0>: push %ebp
0x080488b1 <score1+1>: mov %esp,%ebp
0x080488b3 <score1+3>: cmpl $0xe1e4,0x8(%ebp)
0x080488ba <score1+10>: setne %al
0x080488bd <score1+13>: movzbl %al,%eax
0x080488c0 <score1+16>: sub $0x1,%eax
0x080488c3 <score1+19>: and $0xa,%eax
0x080488c6 <score1+22>: pop %ebp
0x080488c7 <score1+23>: ret
(gdb) disas score2
Dump of assembler code for function score2:
0x080488c8 <score2+0>: push %ebp
0x080488c9 <score2+1>: mov %esp,%ebp
0x080488cb <score2+3>: mov 0x8049f88,%eax
0x080488d0 <score2+8>: sub $0x2,%eax
0x080488d3 <score2+11>: mov %eax,0x8049f88
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax
0x080488db <score2+19>: setne %al
0x080488de <score2+22>: movzbl %al,%eax
0x080488e1 <score2+25>: sub $0x1,%eax
0x080488e4 <score2+28>: and $0xa,%eax
0x080488e7 <score2+31>: pop %ebp
0x080488e8 <score2+32>: ret
(gdb) disas score3
Dump of assembler code for function score3:
0x080488e9 <score3+0>: push %ebp
0x080488ea <score3+1>: mov %esp,%ebp
0x080488ec <score3+3>: mov 0x8(%ebp),%eax
0x080488ef <score3+6>: and $0xf,%eax
0x080488f2 <score3+9>: mov 0x8048e00(,%eax,4),%eax
0x080488f9 <score3+16>: pop %ebp
0x080488fa <score3+17>: ret
(gdb) disas score4
Dump of assembler code for function score4:
0x080488fb <score4+0>: push %ebp
0x080488fc <score4+1>: mov %esp,%ebp
0x080488fe <score4+3>: push %ebx
0x080488ff <score4+4>: mov 0x8(%ebp),%eax
0x08048902 <score4+7>: movzwl %ax,%edx
0x08048905 <score4+10>: mov %eax,%ecx
0x08048907 <score4+12>: shr $0x10,%ecx
0x0804890a <score4+15>: lea 0x0(,%edx,8),%eax
0x08048911 <score4+22>: sub %edx,%eax
0x08048913 <score4+24>: cmp %ecx,%eax
0x08048915 <score4+26>: jne 0x8048920 <score4+37>
0x08048917 <score4+28>: mov $0x8000ffff,%ebx
0x0804891c <score4+33>: test %edx,%ecx
0x0804891e <score4+35>: jne 0x8048940 <score4+69>
0x08048920 <score4+37>: mov %ecx,%eax
0x08048922 <score4+39>: xor %edx,%eax
0x08048924 <score4+41>: cmp $0xf00f,%eax
0x08048929 <score4+46>: jne 0x804893b <score4+64>
0x0804892b <score4+48>: mov %ecx,%eax
0x0804892d <score4+50>: or %edx,%eax
0x0804892f <score4+52>: mov $0xa,%ebx
0x08048934 <score4+57>: cmp $0xf42f,%eax
0x08048939 <score4+62>: je 0x8048940 <score4+69>
0x0804893b <score4+64>: mov $0x0,%ebx
0x08048940 <score4+69>: mov %ebx,%eax
0x08048942 <score4+71>: pop %ebx
0x08048943 <score4+72>: pop %ebp
0x08048944 <score4+73>: ret
I've started examining score2. What I have done is:
(gdb) x 0x8049f88
0x8049f88 <secret>: "Чй"
(gdb) disas 0x8049f88
Dump of assembler code for function secret:
0x08049f88 <secret+0>: dec %dl
0x08049f8a <secret+2>: add %al,(%eax)
End of assembler dump.
And I'm lost here.
Here's what I think happens so far (See comments):
(gdb) disas score2
Dump of assembler code for function score2:
0x080488c8 <score2+0>: push %ebp
0x080488c9 <score2+1>: mov %esp,%ebp 'Copy %esp into %ebp
0x080488cb <score2+3>: mov 0x8049f88,%eax 'executing: decrement and add
0x080488d0 <score2+8>: sub $0x2,%eax ' subtract $0x2 from %eax (How can I figure out what $0x2
0x080488d3 <score2+11>: mov %eax,0x8049f88 'Have no idea what this does
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax compare of %ebp to %eax (why %ebp has 0x8 preceding it?)
0x080488db <score2+19>: setne %al 'I have no idea what this does
0x080488de <score2+22>: movzbl %al,%eax
0x080488e1 <score2+25>: sub $0x1,%eax
0x080488e4 <score2+28>: and $0xa,%eax
0x080488e7 <score2+31>: pop %ebp
0x080488e8 <score2+32>: ret
If you could help me understand what transformations score2 performs on an integer, and which gdb commands could help me, I would really appreciate it and would try to figure out the rest (score1, score3 and score4) by myself. I'm just lost here.
There are really only 2 things you need to know to understand a disassembly. The first is the set of instructions and addressing modes supported by the CPU and how they work. The second is the syntax used by the assembler/disassembler. Without being familiar with both of these things you will get nowhere.
For an example of "you will get nowhere", here's score2:
0x080488c8 <score2+0>: push %ebp ;Save EBP
0x080488c9 <score2+1>: mov %esp,%ebp ;EBP = address of stack frame
0x080488cb <score2+3>: mov 0x8049f88,%eax ;EAX = the data at address 0x8049f88
0x080488d0 <score2+8>: sub $0x2,%eax ;EAX = EAX - 2
0x080488d3 <score2+11>: mov %eax,0x8049f88 ;The value at address 0x8049f88 = eax
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax ;Compare the int at offset 8 in the stack frame with EAX
0x080488db <score2+19>: setne %al ;If EAX wasn't equal to the int at offset 8 in the stack frame, set AL to 1, otherwise set AL to 0
0x080488de <score2+22>: movzbl %al,%eax ;Zero-extend AL to EAX (so EAX = 1 if different, 0 if equal)
0x080488e1 <score2+25>: sub $0x1,%eax ;Decrease EAX (so EAX = 0 if different, -1 if equal)
0x080488e4 <score2+28>: and $0xa,%eax ;EAX = EAX AND 0x0A (so EAX = 0 if different, 0xA if equal)
0x080488e7 <score2+31>: pop %ebp ;Restore previous EBP
0x080488e8 <score2+32>: ret ;Return
Converting back into C, this might look something like:
int score2(int something) {
    some_global_int -= 2;
    if(some_global_int == something) return 0x0A;
    else return 0;
}
Of course I only slapped this together in 5 minutes, and haven't double checked anything or tested anything, so it could be wrong.
After reading the above "score2" code, are you any closer to understanding the disassembly of any of the other functions?
Based on your initial attempt at commenting score2, you should either ask someone to do all the work for you (and learn nothing, and have no way of knowing if that person is right or wrong), or ask for the best place to learn 80x86 assembly (and AT&T syntax).
I'm assuming you're given some kind of compiled library with the score functions in it, and you're trying to reverse engineer it as some kind of homework project. In that case, I suggest you start familiarizing yourself with the standard C calling convention cdecl.
Basically, esp points to the top of the stack, onto which the caller pushes the function's arguments before the call. A C function first saves the old ebp and copies esp into ebp; it can then access its arguments by adding offsets to ebp and dereferencing the resulting address (the first argument is at 0x8(%ebp), just above the saved ebp and the return address). It uses ebp for this purpose so that it can still modify esp, for example to make room for local variables on the stack, without losing track of where the arguments are stored.
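To make the offsets concrete, here is a toy model in C of what the stack looks like inside a one-argument cdecl function on 32-bit x86 right after push %ebp; mov %esp,%ebp. It is only a sketch: the array stands in for stack memory, the stored addresses are made-up values, and load_ebp_relative and build_toy_frame are hypothetical helper names, not anything from the program in the question.

```c
#include <stdint.h>
#include <string.h>

/* Read the 32-bit value at byte offset `off` from the modelled frame pointer,
 * i.e. what the AT&T operand off(%ebp) refers to. */
static uint32_t load_ebp_relative(const uint32_t *ebp, int off)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)ebp + off, sizeof value);
    return value;
}

/* Fill a toy stack the way it would look inside a one-argument function
 * after "push %ebp; mov %esp,%ebp", and return the modelled EBP.
 * Lower array indices correspond to lower addresses. */
static uint32_t *build_toy_frame(uint32_t stack[4], uint32_t arg)
{
    stack[0] = 0xdeadbeef; /* -4(%ebp): a local variable (made-up value)  */
    stack[1] = 0xbffff000; /*  0(%ebp): caller's saved EBP (made-up value) */
    stack[2] = 0x08048a00; /*  4(%ebp): return address (made-up value)     */
    stack[3] = arg;        /*  8(%ebp): first argument                     */
    return &stack[1];
}
```

With this layout, load_ebp_relative(ebp, 8) yields the first argument, which is why the score functions read their parameter from 0x8(%ebp); locals, when a function has any, sit at negative offsets.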
Anyway, here's an overview of score2 to help get you started:
(gdb) disas score2
Dump of assembler code for function score2:
0x080488c8 <score2+0>: push %ebp
0x080488c9 <score2+1>: mov %esp,%ebp ; This just saves a copy of the top of our stack to read arguments with
0x080488cb <score2+3>: mov 0x8049f88,%eax ; Load a value from a memory location (the number is a memory address, probably to a global variable)
0x080488d0 <score2+8>: sub $0x2,%eax ; Subtract 2
0x080488d3 <score2+11>: mov %eax,0x8049f88 ; Store the new value into the same memory location
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax ; Compare the first argument of the function to that value
0x080488db <score2+19>: setne %al ; Sets the lower byte of eax to 1 if they don't match
0x080488de <score2+22>: movzbl %al,%eax ; Zero-extends al into eax, clearing the upper bytes so eax is just 1 or 0 now
0x080488e1 <score2+25>: sub $0x1,%eax ; Subtract 1 from eax
0x080488e4 <score2+28>: and $0xa,%eax ; eax = eax & 0xa
0x080488e7 <score2+31>: pop %ebp
0x080488e8 <score2+32>: ret ; Return eax
So that means there is some kind of global variable stored at 0x8049f88 (let's call it x), and score2 literally translates to:
int score2(int n) {
    x -= 2;
    if (n != x)
        n = 1;
    else
        n = 0;
    n--;
    n = n & 0xa;
    return n;
}
EDIT: Brendan's example is the same, but probably looks more like the original code. Look over it a few times and compare it to the assembly output.
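The cmp / setne / movzbl / sub / and tail is a branch-free way of writing "return 10 if equal, else 0". The sketch below mirrors the instructions one by one and puts a compact version next to it so the two can be compared. The variable x is a hypothetical stand-in for the global at 0x8049f88; its real initial value is not known from the question, so 100 is purely an assumption for illustration.

```c
/* Hypothetical stand-in for the global at 0x8049f88 (real value unknown). */
static int x = 100;

/* Instruction-by-instruction translation of score2. */
static int score2_literal(int n)
{
    x -= 2;              /* mov 0x8049f88,%eax / sub $0x2 / mov back   */
    int eax = (x != n);  /* cmp + setne + movzbl: 1 if different, 0 if equal */
    eax -= 1;            /* sub $0x1: 0 if different, -1 (all bits set) if equal */
    eax &= 0xa;          /* and $0xa: 0 if different, 10 if equal       */
    return eax;
}

/* Compact form that a compiler could plausibly turn into the same code. */
static int score2_compact(int n)
{
    x -= 2;
    return (x == n) ? 0xa : 0;
}
```

Note that each call changes x, so the argument that scores 10 moves down by 2 on every call. score1 ends with the same setne / movzbl / sub / and sequence, so the same reading applies to its comparison against the constant 0xe1e4.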
The next step is now to see what's in the variable at 0x8049f88. Try running awatch *0x8049f88 inside of gdb to make it stop on every access, and also print *0x8049f88 to see what's stored there.
You should also run set disassembly-flavor intel if you're not too familiar with assembly language. The syntax will then match the examples you're more likely to find on the Internet.
I presume you don't have access to the source code of the functions, and for the puzzle or homework you are supposed to try to find numbers to get a big score. You probably should edit your question to display contents of score.h, or just the relevant portions if it's quite lengthy. Also, note that disassembly at 0x8049f88 doesn't make sense. Instead use gdb's x command to display that location, and edit accordingly.
While you can attack the problem via disassembly (as above) you can also try using a different main program, that reports the results of individual score?() calls, and that loops some of them through a series of values looking for big values.
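As a rough sketch of that brute-force idea, the helper below tries every value in a range and remembers the argument that gave the highest score. find_best is a hypothetical name, and demo_score is a made-up stand-in that only imitates the shape of score3 (a 16-entry table indexed by k & 0xf); the real table contents at 0x8048e00 are not shown in the question, so the numbers here are invented.

```c
/* Made-up stand-in with the same shape as score3: table[k & 0xf].
 * The table values are invented for illustration only. */
static int demo_score(int k)
{
    static const int table[16] = {0, 3, 1, 7, 2, 9, 4, 12,
                                  5, 8, 6, 11, 0, 2, 1, 4};
    return table[k & 0xf];
}

/* Call fn on every value in [lo, hi] (requires lo <= hi) and return the
 * highest result; *best_arg receives the first argument that produced it. */
static int find_best(int (*fn)(int), int lo, int hi, int *best_arg)
{
    int best = fn(lo);
    *best_arg = lo;
    for (int v = lo; v < hi; ) {
        ++v;                      /* incremented only while v < hi, so no overflow */
        int s = fn(v);
        if (s > best) {
            best = s;
            *best_arg = v;
        }
    }
    return best;
}
```

In a replacement main, something like find_best(score3, 0, 15, &arg) would cover every distinct input of score3, since only the low four bits of its argument are used. This only works for functions whose result depends solely on the argument.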
With score2(), looping within main() won't work, because score2() subtracts 2 from a word in memory on every call. So, if you wanted to try out a lot of inputs, you'd need to call the program with different arguments in a shell loop. For example, if you are using bash:
for i in {1..1000}; do testScore2 $i; done
where testScore2 is a main program that only runs score2() with its parameter and reports the result.
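A minimal sketch of the core of such a testScore2 program might look like this. In the real exercise score2 would be declared in score.h and linked in from the provided object code; the score2 body and secret_standin below are hypothetical stand-ins (with an assumed starting value of 100) so that the sketch builds on its own, and test_score2 is an illustrative name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in only: replace with #include "score.h" and link against the
 * provided code in the real exercise. The starting value is an assumption. */
static int secret_standin = 100;

static int score2(int n)
{
    secret_standin -= 2;
    return (secret_standin == n) ? 10 : 0;
}

/* Convert one command-line style argument, call score2 exactly once,
 * print the result and return it. */
static int test_score2(const char *arg)
{
    int n = atoi(arg);
    int s = score2(n);
    printf("score2(%d) = %d\n", n, s);
    return s;
}
```

A main that checks argc == 2 and calls test_score2(argv[1]) completes the program. The original main calls initialize() before scoring; whether that call sets up the global used by score2 is not visible in the question, but a faithful test program should probably call it first as well.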
Of course, because score2() can produce only two different results, as explained in detail in two previous answers, it won't actually make sense to test score2() with more than two argument values. I showed the shell code above because you might want to use such a technique with some of the other score functions.