My first exercise in assembly
I'm working in Visual Studio 2005 with assembly (I'm a newbie) and I want to create a program that calculates a sequence with this rule: An = 2*An-1 + An-2. I really don't know how to work with registers, and I just need one example from you to continue with my exercises.
This is my code:
.386
.MODEL flat,stdcall
.STACK 4096
extern ExitProcess@4:Near
.data
arraysize DWORD 10
setarray DWORD 0 DUP(arraysize)
firstvar DWORD 1
secondvar DWORD 2
.code
_main:
mov eax,[firstvar]
mov [setarray+0],eax
mov eax,[secondvar]
mov [setarray+4],eax
mov ecx, arraysize ;loop definition
mov ax, 8
Lp:
mov eax,[setarray+ax-4]
add eax,[setarray+ax-4]
add eax,[setarray+ax-8]
mov [setarray+ax],eax
add ax,4;
loop Lp
add ax,4;
push 0 ;Black box. Always terminate
call ExitProcess@4 ;program with this sequence
end _main ;End of program. Label is the entry point.
You can't use ax as an index register and eax as a data register at the same time: ax is the low 16 bits of eax, so every load into eax overwrites your index. For 32-bit code, stick to 32-bit registers unless you know what you are doing. You inadvertently used a 16-bit addressing mode, which you probably didn't want.
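mov ecx, [arraysize] ;number of elements (arraysize is a DWORD in memory)
sub ecx, 2 ;the first two are already stored
mov ebx, 8 ;byte offset of the third element
Lp:
mov eax,[setarray+ebx-4]
add eax,[setarray+ebx-4]
add eax,[setarray+ebx-8]
mov [setarray+ebx],eax
add ebx,4
dec ecx
jnz Lp ;dec does not change CF, so test ZF instead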
Never ever use the loop instruction, even if some modern processors can execute it fast (most can't).
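To see why mixing ax and eax goes wrong, here is a small model in C rather than assembly, since it is easier to step through. It is only a sketch: `read_ax`, `write_ax` and `index_after_first_load` are names made up for this illustration, and the loaded value 2 is assumed from the question's initial data.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX, so one 32-bit value models both. */
uint16_t read_ax(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}

/* Writing AX replaces the low half and leaves the upper half untouched. */
uint32_t write_ax(uint32_t eax, uint16_t ax)
{
    return (eax & 0xFFFF0000u) | ax;
}

/* Replays the first steps of the question's loop on the model and
   returns what AX holds afterwards. */
uint16_t index_after_first_load(void)
{
    uint32_t eax = 2;        /* left over from storing secondvar */
    eax = write_ax(eax, 8);  /* mov ax, 8 : the intended byte offset */
    eax = 2;                 /* mov eax, [...] : loads the data value 2 (assumed) */
    return read_ax(eax);     /* the offset 8 is gone, AX now reads 2 */
}
```

After the first load the "index" holds data, so every later address is computed from the wrong value. Keeping the offset in a separate register such as ebx avoids this.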
I'm a beginner in assembler too, but my algorithm is a bit different:
A dword 1026 dup (0) ; declare this in the data segm.
; ...
mov esi, offset A ; point at the results array
mov dword ptr [esi], 1 ; initialize A(0)
mov dword ptr [esi + 4], 2 ; and A(1)
add esi, 8 ; esi -> A(2), the first element to compute
xor ecx, ecx
lp: mov eax, [esi - 4] ; get A(n-1)
add eax, eax ; double it
add eax, [esi - 8] ; add A(n-2) to get A(n)
mov [esi], eax ; and save it
add esi, 4 ; advance to the next element
inc ecx ; next n
cmp ecx, n ; n = number of terms to compute after A(1); be sure n is a dword, use
; cmp ecx, dword ptr n ; if it isn't
jb lp ; loop while ecx < n
;
; now you should have the results in the A array with
; esi pointing just past the last computed element
I haven't assembled it to check that it works, but it should.
An = 2*An-1 + An-2 is almost the same formula as the Fibonacci sequence, so you can find a lot of useful stuff by searching for that. (e.g. this q&a). But instead of just an add, we need 2a + b, and x86 can do that in one LEA instruction.
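For checking whatever the assembly version writes into its array, a reference in C can help. This is a minimal sketch, not part of the original answers: `fill_sequence` is a hypothetical name, and the starting values 1 and 2 are taken from the question's firstvar and secondvar.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..n-1] with A(0) = 1, A(1) = 2, A(k) = 2*A(k-1) + A(k-2).
   Unsigned 32-bit arithmetic wraps on overflow the same way a
   32-bit register does. */
void fill_sequence(uint32_t *out, size_t n)
{
    if (n > 0) out[0] = 1;
    if (n > 1) out[1] = 2;
    for (size_t k = 2; k < n; ++k)
        out[k] = 2u * out[k - 1] + out[k - 2];
}
```

For n = 10 this gives 1, 2, 5, 12, 29, 70, 169, 408, 985, 2378, which is what the 10-element array should contain after the loop.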
You never need to store your loop variables in memory; that's what registers are for. So instead of each iteration needing to pull data back out of memory (~5 cycles latency for a round trip), it can just use registers (0 cycles extra latency).
Your array can go in an uninitialized-data section (.data? in MASM, .bss in most other assemblers) rather than .data, so you aren't storing those zeros in your object file.
arraysize equ 10 ; not DWORD: this is an assemble-time constant, not a value stored in memory
.data? ; MASM's directive for uninitialized data (the .bss equivalent)
seq DWORD arraysize DUP(?) ; MASM syntax is count DUP(initializer); ? means uninitialized
; NASM equivalent: seq RESD arraysize
.code
_main:
mov edx, 1 ; A[0]
mov [seq], edx
mov eax, 2 ; A[1]
mov [seq+4], eax
mov ecx, 8 ; first 8 bytes stored
; assume the arraysize is > 2 and even, so no checks here
seqloop:
lea edx, [eax*2 + edx] ; edx=A[n], eax=A[n-1]
mov [seq + ecx], edx ; or edx=A[n-1], eax=A[n-2]
lea eax, [edx*2 + eax]
mov [seq + ecx + 4], eax
; unrolled by two, so we don't need any MOV or XCHG instructions between registers, or any reloading from memory.
add ecx, 8 ; +8 bytes
cmp ecx, arraysize*4 ; (the *4 happens at assemble time)
jb seqloop
ret
Using the array index for the loop condition means we only used 3 registers total, and still don't need to save/restore any of the usual call-preserved registers ESI, EDI, EBX, or EBP. (And of course the caller's ESP is restored, too).
If you care about performance, the loop is only 6 uops (fused-domain) on Intel SnB-family CPUs. For bigger arraysize, it could run at one result per clock (one iteration per 2 clocks).
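If the register juggling in the unrolled loop is hard to follow, the same data flow can be written out in C. This is only a model for reading along, with a hypothetical function name `fill_sequence_unrolled`; like the assembly, it assumes the size is even and greater than 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the unrolled loop: a plays the role of EDX, b of EAX.
   Assumes n is even and greater than 2, as the assembly does. */
void fill_sequence_unrolled(uint32_t *seq, size_t n)
{
    uint32_t a = 1;          /* mov edx, 1 : A[0] */
    uint32_t b = 2;          /* mov eax, 2 : A[1] */
    seq[0] = a;
    seq[1] = b;
    size_t i = 2;            /* element index; the assembly counts bytes */
    do {
        a = 2u * b + a;      /* lea edx, [eax*2 + edx] */
        seq[i] = a;
        b = 2u * a + b;      /* lea eax, [edx*2 + eax] */
        seq[i + 1] = b;
        i += 2;
    } while (i < n);
}
```

The two variables swap roles on every half-iteration (newest value, then second newest), which is why no register-to-register copy is needed.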