How bad is it to keep calling malloc() and free()?
I'm sending a text file. The client and server break the text into packets of 512 bytes each, but some packets contain less text than the maximum size, so on the server side I call malloc() for every packet I receive in order to rebuild the string. Is this bad practice? Is it better to keep one working buffer sized for the maximum length and keep iterating, copying into it and overwriting its contents?
Okay @n.m., here is the code. This if is inside a for(;;) loop woken up by select():
if (nbytes == 2) {
    packet_size = unpack_short(short_buf);
    printf("packet size is %d\n", packet_size);
    receive_packet(i, packet_size, &buffer);
    printf("packet=%s\n", buffer);
    free(buffer);
}
// and here is the receive_packet() function
int receive_packet(int fd, int p_len, char **string)
{
    *string = malloc(p_len - 2);        // 2 bytes were used for the length
    char *start = *string;
    int temp;
    int total = 0;
    int remaining = p_len - 2;

    while (remaining > 0) {
        temp = recv(fd, *string, remaining, 0);
        if (temp <= 0)                  // error or connection closed
            break;
        total += temp;
        remaining = (p_len - 2) - total;
        (*string) += temp;
    }
    *string = start;                    // rewind to the beginning of the buffer
    return 0;
}
In your example, your function already contains a syscall, so the relative cost of malloc/free will be virtually unmeasurable. On my system, a malloc/free "round trip" averages about 300 cycles, and the cheapest syscalls (getting the current time, pid, etc.) cost at least 2500 cycles. Expect recv to easily cost 10 times that much, in which case the cost of allocating and freeing memory will be at most about 1% of the total cost of the operation.
Of course exact timings will vary, but the rough orders of magnitude should be fairly invariant across systems. I would not even begin to consider removing malloc/free as an optimization except in functions that are purely user-space. Where going without dynamic allocation is probably more valuable is in operations that should not have failure cases: there the benefit is that you simplify and harden your code by not having to worry about what to do when malloc fails.
There is overhead associated with calling malloc and free. A block has to be allocated from the heap and marked as in use, and when you free it the reverse happens. Not knowing which OS or compiler you are using, this could happen in the C library or at the OS memory-management level. Since you are doing a lot of mallocs and frees, you could end up severely fragmenting your heap, to the point where you may not have enough contiguous free memory to satisfy a malloc elsewhere. If you can allocate just one buffer and keep reusing it, that is generally going to be faster and carries less danger of heap fragmentation.
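A minimal sketch of that single-buffer idea applied to the question's receive_packet(); the MAX_PACKET constant, the error handling, and the null-termination are assumptions added for illustration, not part of the original code:
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_PACKET 512                      /* maximum packet size from the question */

static char packet_buf[MAX_PACKET + 1];     /* reused for every packet, +1 for '\0'  */

int receive_packet_reuse(int fd, int p_len)
{
    int remaining = p_len - 2;              /* 2 bytes were used for the length field */
    int total = 0;

    while (remaining > 0) {
        ssize_t n = recv(fd, packet_buf + total, remaining, 0);
        if (n <= 0)                         /* error or connection closed */
            return -1;
        total += (int)n;
        remaining -= (int)n;
    }
    packet_buf[total] = '\0';               /* so the caller can print it with %s */
    return total;
}
No malloc, no free, and no failure path for an allocation that cannot fail; the trade-off is that the buffer cannot outlive the next packet.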
I have found that malloc, realloc and free are pretty expensive. If you can avoid malloc it is better to reuse the memory that you've already got.
Edit:
It looks like I am wrong about how expensive malloc is. Some timing tests with the GNU C Library version 2.14 on Linux show that for a test that loops 100,000 times and allocates and frees 512 slots with random sizes from 1 to 163840 bytes:
tsc average loop = 408
tsc of longest loop = 294350
So wasting 408 cycles doing malloc or new in a tight inner loop would be a silly thing to do. Other than that, don't bother worrying about it.
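For reference, a minimal sketch of a timing loop along those lines, assuming one randomly chosen slot is freed and reallocated per iteration (an interpretation of the description above, not the original test code) and an x86 compiler that provides __rdtsc() via <x86intrin.h>:
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>

#define SLOTS 512
#define LOOPS 100000

int main(void)
{
    char *slot[SLOTS] = {0};
    unsigned long long total = 0, longest = 0;

    for (int i = 0; i < LOOPS; i++) {
        unsigned long long start = __rdtsc();
        int s = rand() % SLOTS;                 /* pick a slot at random          */
        free(slot[s]);                          /* free(NULL) is a harmless no-op */
        slot[s] = malloc(rand() % 163840 + 1);  /* random size, 1..163840 bytes   */
        unsigned long long cycles = __rdtsc() - start;
        total += cycles;
        if (cycles > longest)
            longest = cycles;
    }
    printf("tsc average loop = %llu\n", total / LOOPS);
    printf("tsc of longest loop = %llu\n", longest);

    for (int s = 0; s < SLOTS; s++)             /* clean up */
        free(slot[s]);
    return 0;
}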
Malloc is generally fairly inexpensive. It is only expensive if it generates a syscall to get more heap space; for instance, on UNIX-like systems it will eventually generate an sbrk call, which is expensive. If you repeatedly malloc and free the same size of memory, it will do it crazy fast. For instance, consider the following little test program:
#include <stdlib.h>

int main()
{
    int *ptr;
    for (int i = 0; i < 1e6; i++) {
        ptr = malloc(1024 * sizeof(int));
        free(ptr);
    }
}
It allocates space for 1024 integers, frees it, and does this one million times. Running this on my rather modest little Chromebook-turned-Linux machine, I get timings that look like this:
time ./test
real 0m0.125s
user 0m0.122s
sys 0m0.003s
Now, if I comment out the malloc and free part of the loop, I get these timings:
time ./test
real 0m0.009s
user 0m0.005s
sys 0m0.005s
So you see that malloc and free do have overhead, though I don't think that being just over ten times the cost of an empty loop is terribly much overhead.
It is especially fast if it can just keep reusing the same chunk of heap over and over again (as is the case here). Of course, if I kept repeatedly allocating and growing the program, it would take more time because that would result in a few syscalls.
Of course, your mileage may vary depending on OS, compiler, and stdlib implementation.
Calling malloc/free many times can actually increase the memory used by your process (without any leaks involved) if the sizes passed to malloc vary, as shown by this question:
C program memory usage - more memory reported than allocated
So the single buffer approach is probably best.
Only testing can tell. When programming in C I err on the side of avoiding malloc, though, since memory leaks can be quite hard to fix if you create one by accident.
Measure the performance of the two solutions, either by profiling or by measuring throughput; it's impossible to say anything for certain otherwise.
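If it helps, here is a minimal sketch of that kind of measurement using clock_gettime(); run_malloc_version() and run_single_buffer_version() are hypothetical empty stubs standing in for the two implementations being compared:
#include <stdio.h>
#include <time.h>

static void run_malloc_version(void)        { /* ... malloc/free per packet ... */ }
static void run_single_buffer_version(void) { /* ... one reused buffer ...      */ }

/* Run the workload once and return the elapsed wall-clock time in seconds. */
static double time_it(void (*fn)(void))
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    printf("malloc/free per packet: %f s\n", time_it(run_malloc_version));
    printf("single reused buffer:   %f s\n", time_it(run_single_buffer_version));
    return 0;
}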