Dividing a file up with integer math
I'm reading a file from n servers, and I want each to download 1/nth of the file. I thought some quick integer math would work, but it doesn't seem to always work:
threads = n
thread_id = 0:n-1
filesize (in bytes) = x
starting position = thread_id*(filesize/threads)
bytes to read = (filesize/threads)
开发者_如何转开发
Sometimes for just the right numbers, like a 26 byte file divided up by 9 threads (I know that's ridiculous but just for example), it doesn't work out in my favor. There must be a better way. Any ideas?
It seems to me the only thing missing is the last thread (thread n-1
) must read to the end of the file to grab the 'modulus' bytes - the bytes that were left over by dividing by threads
. Basically:
bytes_to_read = (thread_id == n - 1) ? filesize / threads + filesize % threads
: filesize / threads
Alternately you could split up this extra work over the first filesize % threads
threads, by adding 1 byte per thread to the bytes_to_read - of course you'll have to adjust the starting positions.
you have to do something like:
starting position = thread_id * floor(filesize / threads)
bytes to read = floor(filesize / threads) if thread_id != threads-1
bytes to read = filesize - (threads-1)*floor(filesize / threads) if thread_id = threads - 1
To read each byte exactly once, compute the start and end position consistently, then subtract to get the number of bytes:
start_position = thread_id * file_size / n
end_position = (thread_id + 1) * file_size / n
bytes_to_read = end_position - start_position
Note that the position expression is carefully chosen to give you end_position == file_size
when thread_id == n-1
. If you do something else, like thread_id * (file_size/n)
, you will need to treat this as a special case, like @wuputah says.
精彩评论