More efficient way to output an integer in pure assembly
I'm looking to output an integer using pure assembly. I'm using NASM on a 64-bit Linux machine. At the moment I'm looking for a way to output integers to debug a compiler, but I want to use the same code for writing an OS, which is also the reason I don't simply use printf(). After much searching and frustration I have come up with this code:
SECTION .data
var: db "    ",10,0
SECTION .text
global main
global _printc
global _printi
main:
    mov rax, 90
    push rax
    call _printi
    xor rbx, rbx
    mov rax, 1
    int 0x80
_printi:
    pushf
    push rax
    push rbx
    push rcx
    push rdx
    mov rax, [rsp+48]
    mov rcx, 4
.start:
    dec rcx
    xor rdx, rdx
    mov rbx, 10
    div rbx
    add rdx, 48
    mov [var+rcx], dl
    cmp rax, 0
    jne .start
    mov rax, [var]
    push rax
    call _printc
    pop rax
    pop rdx
    pop rcx
    pop rbx
    pop rax
    popf
    ret
_printc:
    push rax
    push rbx
    push rcx
    push rdx
    mov rax, [rsp+40]
    mov [var], rax
    mov rax, 4
    mov rbx, 1
    mov rcx, var
    mov rdx, 4
    int 0x80
    pop rdx
    pop rcx
    pop rbx
    pop rax
    ret
Note that I'll be replacing 0x80 calls with BIOS calls when porting to OS development.
My question is how to optimize, or even prettify, this code further. My first thought would be to replace pushing all the registers individually, but there isn't any 64-bit pusha instruction...
Here are some possible changes to the routine:
_printi:
    pushf
    push rax
    push rbx
    push rcx
    push rdx
    mov rax, [rsp+48]
    mov rcx, 4
    mov rbx, 10         ; moved outside the loop
.start:
    dec rcx
    xor rdx, rdx
    div rbx
    add rdx, 48
    mov [var+rcx], dl
    cmp rax, 0
    jne .start
    ; mov rax, [var]    -- not used
    ; push rax          -- not used
    ; note: with no argument pushed, _printc must also drop its
    ; "mov rax, [rsp+40]" / "mov [var], rax" pair, or it overwrites var
    call _printc
    ; pop rax           -- not used
    pop rdx
    pop rcx
    pop rbx
    pop rax
    popf
    ret
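On the missing 64-bit pusha: one common workaround is a pair of variadic NASM macros that expand to the individual push and pop instructions. The sketch below follows the multi-line macro pattern shown in the NASM manual; the names multipush and multipop are illustrative, and the flags still need their own pushf/popf.

```nasm
; Sketch only: variadic push/pop helpers built on %rep and %rotate.
; multipush pushes its arguments left to right.
%macro multipush 1-*
  %rep %0
    push %1
    %rotate 1
  %endrep
%endmacro

; multipop pops its arguments right to left, so the same register
; list can be passed to both macros.
%macro multipop 1-*
  %rep %0
    %rotate -1
    pop %1
  %endrep
%endmacro

; Usage inside a routine:
;     multipush rax, rbx, rcx, rdx
;     ...
;     multipop  rax, rbx, rcx, rdx
```

This only tidies the source; the assembled code is the same sequence of push and pop instructions as before.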
I also noticed some limitations in the algorithm. The buffer holds only four digits, so for any number larger than 9999 the loop keeps decrementing rcx past zero and writes digits before the start of var, overwriting other data. The routine is also not fully reusable: the buffer is never cleared between calls, so if you print 123 and then 9, the second call prints 129.
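One way to address both limitations is to build the digits backwards from the end of a buffer large enough for any 64-bit value, and to write only the bytes actually produced. The following is a sketch under stated assumptions, not a drop-in replacement: the name print_u64 is hypothetical, the value is passed in rdi rather than on the stack, and it uses the native 64-bit syscall interface (write = 1, exit = 60) instead of int 0x80, because the int 0x80 path truncates pointers to 32 bits and so cannot be given a stack address.

```nasm
; Sketch: print an unsigned 64-bit integer followed by a newline.
; Assumes Linux x86-64. Build (assumed file name print.asm):
;   nasm -f elf64 print.asm -o print.o && ld print.o -o print
SECTION .text
global _start
global print_u64                ; hypothetical name

; In: rdi = value. Clobbers rax, rcx, rdx, rsi, rdi, r11.
print_u64:
    sub     rsp, 24             ; 20 digits at most, plus newline
    mov     rax, rdi            ; dividend
    lea     rsi, [rsp+23]       ; rsi -> last byte of the buffer
    mov     byte [rsi], 10      ; trailing newline
    mov     ecx, 10             ; divisor, loaded once
.next_digit:
    xor     edx, edx            ; high half of the dividend = 0
    div     rcx                 ; rax = quotient, rdx = remainder
    add     dl, '0'             ; remainder -> ASCII digit
    dec     rsi
    mov     [rsi], dl           ; store digits right to left
    test    rax, rax
    jnz     .next_digit
    lea     rdx, [rsp+24]
    sub     rdx, rsi            ; length = buffer end - first digit
    mov     eax, 1              ; sys_write
    mov     edi, 1              ; fd 1 = stdout
    syscall
    add     rsp, 24
    ret

_start:
    mov     rdi, 12345          ; more than four digits
    call    print_u64
    mov     rdi, 9              ; no digits left over from the last call
    call    print_u64
    mov     eax, 60             ; sys_exit
    xor     edi, edi
    syscall
```

Because the length is computed from where the first digit landed, nothing from an earlier call is ever printed, and the 24-byte buffer cannot overflow for any 64-bit input. When porting to an OS, only the write step would need replacing.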