Smallest method of turning a string into an integer(and vice-versa)
I am looking for an extremely small way of turning a string like "123"
into an integer like 123
and vice-versa.
I will be working in a freestanding environment. This is NOT a premature optimization. I am creating code that must fit in 512 bytes, so every byte does actually count. I will take both 开发者_如何学Cx86 assembly(16 bit) and C code though(as that is pretty easy to convert)
It does not need to do any sanity checks or anything..
I thought I had seen a very small C implementation implemented recursively, but I can't seem to find anything for size optimization..
So can anyone find me(or create) a very small atoi/itoa implementation? (it only needs to work with base 10 though)
Edit: (the answer) (edited again because the first code was actually wrong) in case someone else comes upon this, this is the code I ended up creating. It could fit in 21 bytes!
;ds:bx is the input string. ax is the returned integer
_strtoint:
xor ax,ax
.loop1:
imul ax, 10 ;ax serves as our temp var
mov cl,[bx]
mov ch,0
add ax,cx
sub ax,'0'
inc bx
cmp byte [bx],0
jnz .loop1
ret
Ok, last edit I swear! Version weighing in at 42 bytes with negative number support.. so if anyone wants to use these they can..
;ds:bx is the input string. ax is the returned integer
_strtoint:
cmp byte [bx],'-'
je .negate
;rewrite to negate DX(just throw it away)
mov byte [.rewrite+1],0xDA
jmp .continue
.negate:
mov byte [.rewrite+1],0xD8
inc bx
.continue
xor ax,ax
.loop1:
imul ax, 10 ;ax serves as our temp var
mov dl,[bx]
mov dh,0
add ax,dx
sub ax,'0'
inc bx
cmp byte [bx],0
jnz .loop1
;popa
.rewrite:
neg ax ;this instruction gets rewritten to conditionally negate ax or dx
ret
With no error checking, 'cause that's for wussies who have more than 512B to play with:
#include <ctype.h>
// alternative:
// #define isdigit(C) ((C) >= '0' && (C) <= '9')
unsigned long myatol(const char *s) {
unsigned long n = 0;
while (isdigit(*s)) n = 10 * n + *s++ - '0';
return n;
}
gcc -O2
compiles this into 47 bytes, but the external reference to __ctype_b_loc
is probably more than you can afford...
I don't have an assembler on my laptop to check the size, but offhand, it seems like this should be shorter:
; input: zero-terminated string in DS:SI
; result: AX
atoi proc
xor cx, cx
mov ax, '0'
@@:
imul cx, 10
sub al, '0'
add cx, ax
lodsb
jnz @b
xchg ax, cx
ret
atoi endp
Write it yourself. Note that subtracting '0' from a digit gets the power-of-ten. So, you loop down the digits, and every time you multiply the value so far by 10, subtract '0' from the current character, and add it. Codable in assembly in no time flat.
atoi(p) register char *p; { register int n; register int f;
n = 0;
f = 0;
for(;;p++) {
switch(*p) {
case ' ':
case '\t':
continue;
case '-':
f++;
case '+':
p++;
}
break;
}
while(*p >= '0' && *p <= '9')
n = n*10 + *p++ - '0';
return(f? -n: n);
}
And here is another one without any checking. It assumes a null terminated string. As a bonus, it checks for a negative sign. This takes 593 bytes with a Microsoft compiler (cl /O1).
int myatoi( char* a )
{
int res = 0;
int neg = 0;
if ( *a == '-' )
{
neg = 1;
a++;
}
while ( *a )
{
res = res * 10 + ( *a - '0' );
a++;
}
if ( neg )
res *= -1;
return res;
}
Are any of the sizes smaller if you use -Os (optimize for space) instead of -O2 ?
You could try packing the string into BCD(0x1234) and then using x87 fbld and fist instructions for a 1980s solution but I am not sure that will be smaller at all as I don't remember there being any packing instruction.
How in the world are you people getting the executables so small?! This code generates a 316 byte .o file when compiled with gcc -Os -m32 -c -o atoi.o atoi.c
and a 8488 byte executable when compiled and linked (with an empty int main(){}
added) with gcc -Os -m32 -o atoi atoi.c
. This is on Mac OS X Snow Leopard...
int myatoi(char *s)
{
short retval=0;
for(;*s!=0;s++) retval=retval*10+(*s-'0');
return retval;
}
精彩评论