My first exercise in assembly

2023-01-26 02:28

I'm working in Visual Studio 2005 with assembly (I'm a newbie) and I want to write a program that calculates a progression defined by the rule A_n = 2*A_n-1 + A_n-2. I really don't know how to work with registers, and I just need one example from you so I can continue with my exercises.

This is my code:

.386
.MODEL flat,stdcall

.STACK 4096

extern ExitProcess@4:Near

.data                               
arraysize DWORD 10

setarray  DWORD 0 DUP(arraysize)
firstvar  DWORD 1
secondvar DWORD 2

.code                               
_main:                              
mov eax,[firstvar]
mov [setarray+0],eax        
mov eax,[secondvar]
mov [setarray+4],eax

mov ecx, arraysize              ;loop definition
mov ax, 8

Lp:
mov eax,[setarray+ax-4]
add eax,[setarray+ax-4]
add eax,[setarray+ax-8]
mov [setarray+ax],eax

add ax,4;
loop Lp

add ax,4;

    push    0                   ;Black box. Always terminate
    call    ExitProcess@4       ;program with this sequence

    end   _main              ;End of program. Label is the entry point.

You can't use ax as an index register and eax as a data register at the same time, because ax is the low half of eax. For 32-bit code, stick to 32-bit registers unless you know what you are doing. You inadvertently used a 16-bit addressing mode, which you probably didn't want.

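mov ecx, [arraysize]
sub ecx, 2                        ;loop count: the first two elements are already stored
mov ebx, 8                        ;byte offset of the first element to compute

Lp:
mov eax,[setarray+ebx-4]
add eax,[setarray+ebx-4]
add eax,[setarray+ebx-8]
mov [setarray+ebx],eax

add ebx,4
dec ecx
jnz Lp                            ;dec sets ZF but leaves CF unchanged, so test ZF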

Never use the loop instruction: some modern processors can execute it fast, but most can't.

I'm a beginner in assembler too, but my algorithm is a bit different:


    A   dword   1026 dup (0)          ; declare this in the data segm.

;       ...

    mov     esi, offset A               ; point at the results array
    mov     dword ptr [esi], 1          ; initialize A(0)
    mov     dword ptr [esi + 4], 2      ;  and A(1)
    add     esi, 8                      ; esi -> A(2), the first element to compute
    xor     ecx, ecx                    ; ecx counts the elements computed so far


lp:     mov eax, [esi - 4]          ; get A(n-1)
        add eax, eax                ; double it
        add eax, [esi - 8]          ; computes A(n)
        mov [esi], eax              ; and save it
        add esi, 4                  ; advance to the next element
        inc ecx                     ; next n
        cmp ecx, n                  ; be sure n is a dword, use
;       cmp ecx, dword ptr n        ; if it isn't
        jb      lp                  ; loop while ecx < n
;
;       now you should have the results in the A array with
;       esi pointing just past the last element computed

I haven't assembled it to check that it works, but it should.

A_n = 2*A_n-1 + A_n-2 is almost the same formula as the Fibonacci sequence, so you can find a lot of useful stuff by searching for that. (e.g. this q&a). But instead of just an add, we need 2a + b, and x86 can do that in one LEA instruction.
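Assembly results are awkward to inspect, so it can help to have a reference to compare the array against in a debugger. Here is a minimal C sketch of the same recurrence, assuming the starting values 1 and 2 from the question. The names fill_sequence, out and count are made up for illustration and appear nowhere in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..count-1] with A(0) = 1, A(1) = 2, A(n) = 2*A(n-1) + A(n-2).
   Hypothetical helper, used only to cross-check the assembly. */
void fill_sequence(uint32_t *out, size_t count)
{
    assert(count >= 2);              /* the first two terms are stored unconditionally */
    out[0] = 1;
    out[1] = 2;
    for (size_t n = 2; n < count; n++) {
        /* 2*a + b: the value a single LEA with a scaled index produces */
        out[n] = 2 * out[n - 1] + out[n - 2];
    }
}
```

For a 10-element array this produces 1, 2, 5, 12, 29, 70, 169, 408, 985, 2378, which is what the dwords of the array should contain when the assembly loop finishes.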

You never need to store your loop variables in memory, that's what registers are for. So instead of each iteration needing to pull data back out of memory (~5 cycles latency for a round trip), it can just use registers (0 cycles extra latency).

Your array can go in the BSS (the .data? section in MASM) rather than .data, so you aren't storing those zeros in your object file.

arraysize equ 10     ; not DWORD: this is an assemble-time constant, not a value stored in memory

.data?
seq  DWORD arraysize DUP(?)   ; MASM syntax is count DUP(value); ? leaves the storage uninitialized
; NASM equivalent, in section .bss:  seq RESD arraysize

.code
_main:

    mov  edx, 1         ; A[0]
    mov  [seq], edx
    mov  eax, 2         ; A[1]
    mov  [seq+4], eax

    mov  ecx, 8         ; first 8 bytes stored
    ; assume the arraysize is > 2 and even, so no checks here
seqloop:
    lea  edx, [eax*2 + edx]  ;    edx=A[n],   eax=A[n-1]
    mov  [seq + ecx], edx    ; or edx=A[n-1], eax=A[n-2]
    lea  eax, [edx*2 + eax]  
    mov  [seq + ecx + 4], eax
    ; unrolled by two, so we don't need any MOV or XCHG instructions between registers, or any reloading from memory.

    add  ecx, 8             ; +8 bytes
    cmp  ecx, arraysize*4   ; (the *4 happens at assemble time)
    jb   seqloop

    ret

Using the array index for the loop condition means we only used 3 registers total, and still don't need to save/restore any of the usual call-preserved registers ESI, EDI, EBX, or EBP. (And of course the caller's ESP is restored, too).
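To see why unrolling by two removes the register-to-register moves, here is a rough C model of that loop. It's only a sketch: the variables edx and eax are named after the registers they stand in for, and fill_sequence_unrolled is a made-up name. Like the assembly, it assumes the element count is even and greater than 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C model of the unrolled loop: the two "registers" swap roles every half
   iteration, so no value ever has to be copied from one to the other. */
void fill_sequence_unrolled(uint32_t *seq, size_t count)
{
    assert(count > 2 && count % 2 == 0);   /* same assumption as the assembly */
    uint32_t edx = 1;                      /* A[0] */
    uint32_t eax = 2;                      /* A[1] */
    seq[0] = edx;
    seq[1] = eax;
    for (size_t i = 2; i < count; i += 2) {
        edx = eax * 2 + edx;               /* lea edx, [eax*2 + edx]: edx = A[i]   */
        seq[i] = edx;
        eax = edx * 2 + eax;               /* lea eax, [edx*2 + eax]: eax = A[i+1] */
        seq[i + 1] = eax;
    }
}
```

After the first statement in the loop body, edx holds the newest term and eax the one before it. After the second, the roles are reversed, which is exactly the state the next iteration expects.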

If you care about performance, the loop is only 6 uops (fused-domain) on Intel SnB-family CPUs. For bigger arraysize, it could run at one result per clock (one iteration per 2 clocks).
