Nested Loops in Big-O Notation?
Maybe I'm mistaken in my understanding of Big-O notation (it has been a while since I've taken a course on algorithms) but the following has never made too much sense to me:
This would be considered O(n^2):
for (int i = 0; i < num_1; i++)
{
    for (int j = 0; j < num_2; j++)
    {
        cout << i << " " << j << endl;
    }
}
This would be considered O(n):
for (int z = 0; z < num_3; z++) { cout << z << endl; }
My issue is when it comes to practical terms. Let's assume that num_1 = 10, num_2 = 20, and num_3 = 1000. In this case the first example, an O(n^2), would run considerably fewer iterations of its body (200) than the O(n) second example (1000).
In more general terms: when num_3 > num_1 * num_2, the O(n^2) snippet does less work than the O(n) snippet. In real-world applications, these two snippets may be doing two very separate tasks where the functional bounds on num_1, num_2, and num_3 are considerably different. The nested num_1 and num_2 may be looping over values between 0 and 255, but num_3 may frequently take values above a million.
Why should/would a coder trust an algorithm or snippet based on its Big-O notation when it doesn't take into consideration the practical or operational variable boundaries?
Saying that something is in O(n^2) only makes sense if it is clear what n is supposed to be. Usually it refers to the size of the input (or if the input is a number, it just refers to that number), but in your code, it's not clear what the input is.
for (int i = 0; i < num_1; i++)
{
    for (int j = 0; j < num_2; j++)
    {
        cout << i << " " << j << endl;
    }
}
Normally one would say that the above snippet's running time is in O(num_1 * num_2). If num_1 and num_2 are both constants, this means it is in O(1). If both num_1 and num_2 are linearly proportional to the size of your program's input (n), it is indeed O(n^2). If both num_1 and num_2 are proportional to the square of the size of the input, it is in O(n^4).
Bottom line: it depends entirely on what num_1 and num_2 are, and on how, and with which factors, they grow.
for (int z = 0; z < num_3; z++) { cout << z << endl; }
Now this code is in O(num_3). To say what this is in terms of n would again require us to know how num_3 is related to n.
If all of num_1, num_2 and num_3 are linearly proportional to n, then you can indeed say that the first snippet runs in O(n^2) time and the second in O(n). However, in that case it is not possible for num_3 to be greater than num_1 * num_2 for sufficiently large n.
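To make that last point concrete, here is a minimal sketch. The proportionality constants (num_1 = n, num_2 = 2n, num_3 = 100n) are assumptions picked purely for illustration and do not come from the question, and the function names are hypothetical. It counts how often each loop body runs and looks for the first n at which the nested loop does at least as much work as the single loop.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many times the body of the nested snippet runs.
std::uint64_t nested_iterations(std::uint64_t num_1, std::uint64_t num_2) {
    std::uint64_t count = 0;
    for (std::uint64_t i = 0; i < num_1; i++) {
        for (std::uint64_t j = 0; j < num_2; j++) {
            count++;
        }
    }
    return count;
}

// Counts how many times the body of the single-loop snippet runs.
std::uint64_t single_iterations(std::uint64_t num_3) {
    std::uint64_t count = 0;
    for (std::uint64_t z = 0; z < num_3; z++) {
        count++;
    }
    return count;
}

// Returns the smallest n in [1, n_max] at which the nested loop does at
// least as much work as the single loop, or 0 if there is none in range.
// Assumed (illustrative) relationships: num_1 = n, num_2 = 2n, num_3 = 100n.
std::uint64_t crossover(std::uint64_t n_max) {
    assert(n_max > 0);
    for (std::uint64_t n = 1; n <= n_max; n++) {
        if (nested_iterations(n, 2 * n) >= single_iterations(100 * n)) {
            return n;
        }
    }
    return 0;
}
```

With these assumed constants, crossover(1000) returns 50: below that the "O(n^2)" snippet does less work, and from there on it never does less again. Different constants move the crossover point but cannot remove it.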
Big O describes how an algorithm's running time grows, not the speed of a particular piece of code.
When you have a generic algorithm you don't know what the constraints on the variables are.
The big-Oh notation is a way to express computational complexity as a rate-of-growth function. It absolutely has to be understood that it is an approximation, and one that really only bears out for large values of the variables involved (e.g. N).
You are absolutely correct that individual variable values, constants, and such, have a big impact.
However, for similar (and large) sized variable values, the big-Oh expression for a set of algorithms will give an indication of their relative performance. More typically, though, it is a convenient, implementation-independent way to express the best, average, and worst case complexities for algorithms.
At the end of the day though, once a short list of candidate algorithms has been selected (probably based on the big-Oh notation and other characteristics, e.g. space requirements), timing an implementation with a representative dataset is the way to go.
Big O notation only says how an algorithm's running time relates to the magnitude of its data, and how it will "scale" when you get more data. An O(n) algorithm can be slower than an O(n^2) algorithm if it is given more data (as you've shown with your example). But if you feed twice as much data to an O(n) algorithm you should expect roughly twice the running time; with O(n^2) you should expect roughly four times.
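As a rough illustration of that scaling behaviour, the sketch below counts abstract "steps" rather than measuring time. The function names are hypothetical, and real running times will only approximate these ratios.

```cpp
#include <cassert>
#include <cstdint>

// One step per element: a stand-in for an O(n) algorithm.
std::uint64_t linear_steps(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (std::uint64_t i = 0; i < n; i++) {
        steps++;
    }
    return steps;
}

// One step per pair of elements: a stand-in for an O(n^2) algorithm.
std::uint64_t quadratic_steps(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (std::uint64_t i = 0; i < n; i++) {
        for (std::uint64_t j = 0; j < n; j++) {
            steps++;
        }
    }
    return steps;
}

// Ratio of the step count at input size 2n to the step count at size n.
double doubling_factor(std::uint64_t (*steps)(std::uint64_t), std::uint64_t n) {
    assert(n > 0);  // avoid dividing by a zero step count
    return static_cast<double>(steps(2 * n)) / static_cast<double>(steps(n));
}
```

Here doubling_factor(linear_steps, 1000) gives 2.0 and doubling_factor(quadratic_steps, 1000) gives 4.0, regardless of which n you start from.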
You can think of Big O for those examples in terms of what happens as N approaches infinity.
So you're right that in your scenario num_3 > num_1 * num_2, but as those three numbers grow together and get larger and larger, that will no longer hold true.
If algorithm1 is O(N) and algorithm2 is O(N^2), it does NOT mean that algorithm1 is ALWAYS better than algorithm2. It just means that there is some threshold value for N (usually called N0) beyond which algorithm1 will perform better than algorithm2.
A random example is that insertion sort is O(N^2) whereas MergeSort is O(N*log(N)), but for very small values of N, insertion sort can actually end up being faster. Once N gets big enough though, MergeSort is always faster. The Java version of the Arrays.sort function actually has an if statement that uses insertion sort for very small values of N and a modified quick sort or merge sort for anything bigger than a certain size (the magic number is about N=7).
The Java code (in Java 6) for Arrays.sort for an int array looks like this:
private static void sort1(int x[], int off, int len) {
// Insertion sort on smallest arrays
if (len < 7) {
//insertion sort
}
//modified quick sort
}
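The same idea can be sketched in C++, the language of the question. This is not the Java library's code: it swaps in a plain top-down merge sort for the "modified quick sort", and the names hybrid_sort, insertion_sort and kCutoff are hypothetical. The cutoff of 7 just mirrors the len < 7 test above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cutoff, mirroring the "len < 7" check in the Java snippet.
const std::size_t kCutoff = 7;

// Sorts the half-open range v[lo, hi) with insertion sort: O(N^2) in the
// worst case, but with very little overhead on tiny ranges.
void insertion_sort(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo + 1; i < hi; i++) {
        int key = v[i];
        std::size_t j = i;
        while (j > lo && v[j - 1] > key) {
            v[j] = v[j - 1];  // shift larger elements one slot to the right
            j--;
        }
        v[j] = key;
    }
}

// Sorts v[lo, hi): insertion sort below the cutoff, merge sort otherwise.
void hybrid_sort(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= v.size());
    if (hi - lo < kCutoff) {
        insertion_sort(v, lo, hi);
        return;
    }
    std::size_t mid = lo + (hi - lo) / 2;
    hybrid_sort(v, lo, mid);
    hybrid_sort(v, mid, hi);
    // Merge the two sorted halves in place.
    std::inplace_merge(v.begin() + lo, v.begin() + mid, v.begin() + hi);
}
```

You would call it as hybrid_sort(v, 0, v.size()). Where exactly the cutoff should sit depends on the hardware and the data, so treat 7 as a starting point to measure against rather than a rule.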
At the end of the day, Big O notation is a triage mechanism that helps you quickly analyze & compare algorithms in a way that is independent of computer hardware and doesn't require you to write, test and time the execution of your various algorithms. It's a simplified notation, so it's never going to be exact and as the example I just gave shows, it is very dependent on the size and range of your data.
A major caveat to Big O notation for algorithms is that you can often make improvements to an algorithm if you can make assumptions about your data.
Big-O gives you an upper bound on the growth rate, and is most often quoted for the worst case. Constants are ignored because as n grows they become more and more insignificant (e.g. instead of saying O(3+2n) you would just say O(n)).
Big-Omega gives a lower bound on the growth rate, often associated with the best case, and depending on what you know about how your algorithm will be used it may be more appropriate for you to use in some situations.
If Big-O and Big-Omega for a given algorithm are the same, then it is called exact order and you can write that as Big-Theta.
Edit: To clarify, worst-case analysis is often preferable because you want to be able to tell a client "it will always perform this well or better" instead of "if your data happens to be perfect it will perform great!"