开发者

sum of bytes with unsigned long overflow in python

how to translate this piece of C code into Python >=2.6 ?

unsigned long memSum(unsigned char *p, unsigned long len)
{
   unsigned long i, sum=0;

   for(i=0; 开发者_开发问答i<len; i++) 
      sum = sum + *p++;

   return sum;
}  

of course

f=open("file_to_sum",'rb')
m = f.read()
f.close()
sum( array.array('B', m) )

does not work


If you need to wrap around on overflow, simply take your sum modulo MAX_LONG at the end.


A direct, Pythonic translation:

def memSum(data):
    return sum(ord(c) for c in data) & 0xFFFFFFFF


As I mentioned in my comment, you need to convert the string into a list of ints. This probably what you want:

f=open("infile.txt",'rb')
m=f.read()
f.close()
m=map(ord,list(m))
print sum(m)

The ord function returns the ascii int corresponding to a character.


I've tried your code, and seems to be working (it opens the data in binary, convert it to a list of unsigned char and adds all).

What is your problem? Could be an overflow problem? Maybe there is a problem with the length? How are you saving the file? Sorry, but with this information we can only guess!

The code is not equivalent, as the Python code seems to deal with files and the C code seems to deal with a memory array.


You really need to give an example input and what you expect as output. Any code that uses a recent version of Python, extracts an integer in range(256) from each byte, sums those integers, and finally does total &= 0xFFFFFFFF should do the job (assuming that your unsigned long is 32 bits wide).

Note that the last step ( the &=) is pointless if your file is less than about 16MB in size ... it won't overflow; 16843009 * 255 <= 0xFFFFFFFF < (16843009 + 1) * 255

That means that if your test file is smaller than 16843010 bytes, you must have a problem in your C code or your Python code or both.

You said that "of course" this code:

f=open("file_to_sum",'rb')
m = f.read()
f.close()
sum( array.array('B', m) )

"doesn't work". Does it work if you replace the last line by

print sum( array.array('B', m) )

?

If none of the above is of any help, and you want sensible answers instead of guesses, provide example input, expected output, C code, C output, Python code, Python output. Both the C code and the Python code should be standalone-runnable, and should include printing the size of the byte array being summed.


first, thank you all for your help

I have written successfully the C implementation and tested over 20 different files the checksum function. And then the 2 complement of the computed checksum is compared to a checksum a the file. It works perfectly in C as mentionned: with unsigned char and unsigned long.

the right value of the sum is 3da4be70 (from C implementation) from your suggestions:

print '%x' % ( sum( array.array('B', m[ o2+4:o2+len2 ]) ) )

and

n = map(ord, list(m[ o2+4: o2+len2 ]))
print '%x' % (sum(n))

give 504a022a

I suppose it is because bytes here in python as interpreted as signed instead of unsigned (like in C)...

update:

even

for i in range(o2+4, o2+len2):
 cchk = ( (cchk + unpack('B', m[i])[0]) & 0xffffffff)
print '%x' % (cchk)

does not work and gives again 504a022a


SOLVED : MY FAULT, SORRY

#include<stdio.h>

unsigned long memSum(unsigned char *p, unsigned long len)
{
   unsigned long i, sum=0;

   for(i=0; i<len; i++) 
      sum = sum + *p++;

   return sum;
}  

#define LEN2SUM (0xa13b10-4)

int main(int argc, char *argv[] )
{

  FILE *f;
  unsigned char *buf;
  unsigned long sum;

  f=fopen("test2.dat", "rb");
  fseek(f, 0x7c+4, SEEK_SET); 

  buf = (unsigned char*)malloc(LEN2SUM);
  fread(buf, sizeof(char), LEN2SUM, f);
  sum = memSum( buf, LEN2SUM);
  printf("0x%08x\n", sum );

  free(buf);  
  fclose(f);


}

and

f = open('test2.dat','rb')
f.seek(0x7c+4)

m = f.read(0xa13b10-4)
print '%x' % ( ( sum(ord(c) for c in m) & 0xFFFFFFFF ) )

give the same answer, the good one

the difference is that in C, i checksum a given memory area which contains decrypted data, where decryption has been done 'in place'

in my python implementation, decryption is done in another buffer, and I still checksum the encrypted area.

since my a beginner in python, I was focused on this point : bad track. i'm kicking my ass twenty times.....

sorry for the stupid question and thanks again your kind help !!!

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜