CPU cost order of magnitude for some basic operations
After answering that SO question and being downvoted, I'd like to check something with you.
To have a rough idea of the cost of the code I write, I tend to scale operations this way:
- Heap allocation is around 1000 times slower than stack allocation.
- IO to the screen/console is around 1000 times slower than heap allocation.
- IO to the hard drive is around 1000 times slower than graphical IO to the screen.
Do you think these are correct assumptions / orders of magnitude / estimates?
(And of course, nothing beats actually profiling the application :-))
EDIT: As a first conclusion from your answers and comments, one may say that my figure of 1000 is a large overestimate.
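For anyone who wants to check the first ratio on their own machine, here is the kind of rough micro-timing loop I have in mind (a crude sketch only: the 1024-byte block size and the iteration count are arbitrary choices, and a clever compiler may well optimise parts of it away):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Crude comparison of malloc/free against a local buffer.
   Results vary a lot with the allocator, the OS and optimisation flags. */
int main (void) {
    enum { N = 10000000, SIZE = 1024 };
    volatile char sink = 0;
    clock_t t0, t1, t2;
    int i;

    t0 = clock();
    for (i = 0; i < N; i++) {
        char *p = malloc (SIZE);
        if (p == NULL)
            return 1;
        p[0] = 1;                       /* touch the block so it is actually used */
        sink = (char)(sink + p[0]);
        free (p);
    }

    t1 = clock();
    for (i = 0; i < N; i++) {
        char buf[SIZE];
        buf[0] = 1;
        sink = (char)(sink + buf[0]);
    }

    t2 = clock();
    printf ("heap : %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf ("stack: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    return 0;
}

Dividing the two times gives the heap/stack ratio for that particular allocator and block size, nothing more.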
If you're going to make massive generalisations like that, you might want to think about having hard data to back them up.
I don't doubt that you're right about the relative efficiencies on most architectures (I say most simply because there may be some weird architectures out there that I don't know about), but the 1000x ratios are suspect without proof.
And actually, I'm not that certain about the relative efficiency of screen and disk I/O since it can be affected by buffering. I've often found that a program outputting thousands of lines to the screen runs faster when directing the output to a disk file.
For example, the following program:
#include <stdio.h>

int main (void) {
    int i;

    /* print 100,000 lines so the output cost dominates */
    for (i = 100000; i > 0; i--)
        printf ("hello\n");
    return 0;
}
runs as:
pax$ time myprog
hello
hello
:
hello
real 0m12.861s
user 0m1.762s
sys 0m2.002s
pax$ time ./myprog >/tmp/qq
real 0m0.191s
user 0m0.160s
sys 0m0.050s
In other words, screen I/O in that environment (Cygwin under XP) takes 67 times as long in wall-clock time and about 17 times as long in CPU time (presumably because of all the window updates).
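If you want to separate the cost of stdio buffering from the cost of actually painting the terminal, a variation like this (just a sketch, I haven't timed it here) forces full buffering on stdout so output is flushed in big chunks instead of line by line:

#include <stdio.h>

int main (void) {
    static char buf[1 << 16];
    int i;

    /* stdout is normally line buffered on a terminal; force full buffering
       so output is flushed in large chunks instead of after every line */
    setvbuf (stdout, buf, _IOFBF, sizeof buf);

    for (i = 100000; i > 0; i--)
        printf ("hello\n");
    return 0;
}

If the gap shrinks a lot with full buffering, most of the difference was the per-line flushing; whatever remains is the terminal itself.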
Here's another quick and interesting, if not scientifically reliable or well-thought-out, test:
char *memory;

NSLog (@"Start heap allocs");
for (int allocations = 0; allocations < 100000000; allocations++)
{
    memory = malloc (1024);
    memory[0] = 1;
    memory[1023] = memory[0] + 1;
    free (memory);
}
NSLog (@"End heap allocs");

NSLog (@"Start stack allocs");
for (int allocations = 0; allocations < 100000000; allocations++)
{
    char memory2 [1024];
    memory2[0] = 1;
    memory2[1023] = memory2[0] + 1;
}
NSLog (@"End stack allocs");
and the output:
2011-02-12 11:46:54.078 Veg Met Chilli[4589:207] Start heap allocs
2011-02-12 11:47:06.759 Veg Met Chilli[4589:207] End heap allocs
2011-02-12 11:47:06.759 Veg Met Chilli[4589:207] Start stack allocs
2011-02-12 11:47:07.057 Veg Met Chilli[4589:207] End stack allocs
Do the maths yourself, but that makes heap allocs about 42 times longer. I must stress DO NOT QUOTE ME on that; there are bound to be flaws in it, most notably the relative time it takes to actually assign values into the data.
EDIT: New test data.
So now I'm simply calling a method for each heap and stack alloc, rather than doing the allocations inline in the loop. Results:
2011-02-12 12:13:42.644 Veg Met Chilli[4678:207] Start heap allocs
2011-02-12 12:13:56.518 Veg Met Chilli[4678:207] End heap allocs
2011-02-12 12:13:56.519 Veg Met Chilli[4678:207] Start stack allocs
2011-02-12 12:13:57.842 Veg Met Chilli[4678:207] End stack allocs
This makes heap allocs only about 10 times as long as stack allocs. To make the results more accurate, I should also have a control method which does no memory allocation (but at least does something so as not to get optimised out), and take away that time. I'll do that next...
EDIT: Right... Now the code looks like this:
int control = 0;

NSLog (@"Start heap allocs");
for (int allocations = 0; allocations < 100000000; allocations++)
{
    control += [self HeapAlloc];
}
NSLog (@"End heap allocs");

NSLog (@"Start stack allocs");
for (int allocations = 0; allocations < 100000000; allocations++)
{
    control += [self StackAlloc];
}
NSLog (@"End stack allocs");

NSLog (@"Start no allocs");
for (int allocations = 0; allocations < 100000000; allocations++)
{
    control += [self NoAlloc];
}
NSLog (@"End no allocs");

NSLog (@"%d", control);

-(int) HeapAlloc
{
    int controlCalculation = rand();

    char *memory = malloc (1024);
    memory[0] = 1;
    memory[1023] = memory[0] + 1;
    free (memory);

    return controlCalculation;
}

-(int) StackAlloc
{
    int controlCalculation = rand();

    char memory [1024];
    memory[0] = 1;
    memory[1023] = memory[0] + 1;

    return controlCalculation;
}

-(int) NoAlloc
{
    int controlCalculation = rand();

    return controlCalculation;
}
and the results are:
2011-02-12 12:31:32.676 Veg Met Chilli[4816:207] Start heap allocs
2011-02-12 12:31:47.306 Veg Met Chilli[4816:207] End heap allocs
2011-02-12 12:31:47.306 Veg Met Chilli[4816:207] Start stack allocs
2011-02-12 12:31:49.458 Veg Met Chilli[4816:207] End stack allocs
2011-02-12 12:31:49.459 Veg Met Chilli[4816:207] Start no allocs
2011-02-12 12:31:51.325 Veg Met Chilli[4816:207] End no allocs
So the control time is 1.866 seconds. Taking that away from the alloc times gives: stack 0.286 seconds, heap 12.764 seconds.
So heap allocations take about 45 times as long as stack allocations.
Thank you and good night! :)
The 1st point depends on a lot of things, really. If you run out of memory, then allocating something on the heap may take literally minutes. The stack, on the other hand, may already be allocated at that point.
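To give a feel for how variable that first point is, here's a rough sketch (the 256 MB size is arbitrary): the expensive part is usually not the malloc bookkeeping itself but the OS committing, or under memory pressure swapping in, the pages the first time they are touched.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main (void) {
    const size_t size = 256UL * 1024 * 1024;    /* 256 MB, arbitrary */
    unsigned long sum = 0;
    clock_t t0, t1, t2, t3;
    size_t i;
    char *p;

    t0 = clock();
    p = malloc (size);                          /* often just reserves address space */
    t1 = clock();
    if (p == NULL) {
        perror ("malloc");
        return 1;
    }

    memset (p, 1, size);                        /* first touch: the OS commits (or swaps in) every page */
    t2 = clock();

    for (i = 0; i < size; i += 4096)            /* second pass: the pages are already resident */
        sum += (unsigned char) p[i];
    t3 = clock();

    printf ("malloc call:  %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf ("first touch:  %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    printf ("second pass:  %.3f s\n", (double)(t3 - t2) / CLOCKS_PER_SEC);
    printf ("checksum %lu\n", sum);             /* keep the reads live */

    free (p);
    return 0;
}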
The 2nd point depends on the terminal being used. Outputting to a DOS screen is one thing, outputting to a Windows console window is another, and xterm is something entirely different from them too.
As for the 3rd point, I'd rather say it's the other way around for modern hard disks. They can easily handle megabytes per second; how could you output that much to any terminal in so short a time? For small amounts of data you may be right, though, as hard drive I/O has some setup cost (a seek, for instance) before the data starts flowing.
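A rough way to put a number on that (a sketch only: the scratch file name and the 256 MB size are my own arbitrary choices, and the OS page cache will flatter the result) is to write a few hundred megabytes and run it under time, like the hello test above:

#include <stdio.h>
#include <string.h>

int main (void) {
    enum { CHUNK = 1 << 20, CHUNKS = 256 };     /* 1 MB chunks, 256 MB total: both arbitrary */
    static char buf[CHUNK];
    int i;

    FILE *f = fopen ("throughput.tmp", "wb");   /* hypothetical scratch file in the current directory */
    if (f == NULL) {
        perror ("fopen");
        return 1;
    }

    memset (buf, 'x', sizeof buf);
    for (i = 0; i < CHUNKS; i++)
        fwrite (buf, 1, sizeof buf, f);

    fclose (f);                                 /* flushes stdio; the OS page cache may still hold the data */
    return 0;
}

Divide the megabytes written by the elapsed seconds for a ballpark MB/s figure, and compare that with how long the terminal above took to display far less data.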