Upper bound vs lower bound for worst case running time of an algorithm

I am learning about analysis of algorithms. I understand the concept of the worst case running time of an algorithm.

However, what are upper and lower bounds on the worst case running time of an algorithm?

What can be an example where an upper bound for the worst case running time of an algorithm is different from the lower bound for the worst case running time of the same algorithm?


For a function f(n), g(n) is an upper bound (big O) if, for big enough n, f(n) <= c*g(n) for some constant c [g dominates f].
g(n) is a lower bound (big Omega) if, for big enough n, f(n) >= c*g(n) for some constant c [f dominates g].

If g(n) is both an upper bound and a lower bound of f(n) [with different c's], we say g(n) is a tight bound for f(n) [big Theta].
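To make these definitions concrete, here is a minimal sketch in Python (the functions and constants are illustrative choices of mine, not the only ones that work) that spot-checks f(n) <= c*g(n) past a cutoff n0:

    # Spot-check the definitions numerically (constants chosen for illustration).
    def f(n):
        return 3 * n + 10   # the function being bounded

    def g(n):
        return n            # the candidate bound

    c, n0 = 4, 10           # claim: f(n) <= 4*g(n) for every n >= 10
    assert all(f(n) <= c * g(n) for n in range(n0, 10000))  # g is an upper bound
    assert all(f(n) >= 1 * g(n) for n in range(n0, 10000))  # g is a lower bound
    # Both hold (with different c's), so g(n) = n is a tight bound: f is big Theta of n.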

An example of using an upper bound instead of a tight one: sometimes it is hard to find a tight bound, as with the naive recursive Fibonacci algorithm, so we settle for an easy upper bound of O(2^n). More information can be found in the answers to this post.
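As a sketch, here is that naive recursion in Python; each call spawns at most two more, which is exactly where the easy O(2^n) bound comes from:

    def fib(n):
        # Each call makes at most two recursive calls, so the call tree has
        # at most ~2^n nodes: an easy O(2^n) upper bound. The tight bound,
        # Theta(phi^n) with phi ~ 1.618, takes more work to derive.
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)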

How does this relate to worst/best/average cases? (as requested in the comments):

Worst case/average case (or any other case) affects what the complexity function is, but big-O, big-Omega, and big-Theta can be applied to each of these cases.

For example, a hash table insert is Θ(1) in the average case and Θ(n) in the worst case. It is also O(n) in the average case (the bound is not tight) and Ω(1) in the worst case.
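To see why the two cases differ, consider a toy separate-chaining table (a sketch of mine, not production code). If the hash spreads keys evenly, chains stay short; if every key lands in one bucket, insertion degenerates into a linear scan:

    class ToyHashTable:
        def __init__(self, buckets=8):
            self.buckets = [[] for _ in range(buckets)]

        def insert(self, key, value):
            chain = self.buckets[hash(key) % len(self.buckets)]
            for i, (k, _) in enumerate(chain):   # scan the chain for the key
                if k == key:
                    chain[i] = (key, value)      # overwrite existing entry
                    return
            chain.append((key, value))

    # Average case: keys spread evenly, chains stay short -> Θ(1) insert.
    # Worst case: all n keys hash to the same bucket -> each insert scans a
    # chain of length up to n -> Θ(n).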


First, let's talk about cases. A case of input for an algorithm is associated with an instance of a problem. For the sorting problem (where we want to find a permutation of a set in a specific order), I can look at an instance like the set of numbers {1, 5, 4, 2, 6}. This set of numbers would be the input to a sorting algorithm that purports to solve the sorting problem, like Selection Sort, or one of the other sorting algorithms out there. 

The same sets of inputs can be given to any algorithm that wants to solve a problem. It doesn't matter what sorting algorithm I use; the set of inputs is always the same (because, by definition, they're all instances of the same problem). However, a given case can be better or worse for a given algorithm. Some algorithms always perform the same no matter what the inputs are, but some algorithms might do worse on some inputs. Either way, every algorithm has some best case and some worst case; we also sometimes talk about the average case (by taking the average over all cases) or the expected case (when we have some reason to expect that one case will be more common than others).

Algorithm Case Examples

Every algorithm for the problem of "find the minimum of an unsorted list" works the same for every possible input. No matter how clever an algorithm you write, you have to check every element. It doesn't matter if you have a list of zeros, or a list of random numbers, or a list where the first element is the minimum: you don't know until you get to the end. Every case is the same for such an algorithm, so the best case is the worst case, and also the average case and the expected case. If the list were sorted, we could do better, but that's a different problem.
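A minimal linear scan in Python makes this visible: there is no early exit, so every input of size n costs exactly n - 1 comparisons:

    def find_min(items):
        smallest = items[0]
        for x in items[1:]:      # must look at every element: an unseen one
            if x < smallest:     # could always be smaller, so no early exit
                smallest = x
        return smallest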

The problem of "find a given element in a list" is different. Assuming you use an algorithm that does a linear walk through the list, it might turn out that the given element is the first element of the list, and you're done immediately. However, it might also be the last element of the list, in which case you have to walk the whole thing before you find it. So here you have a best case and a worst case.
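A sketch of that linear walk; where the target sits decides which case you get:

    def linear_search(items, target):
        for i, x in enumerate(items):
            if x == target:
                return i    # best case: target is items[0], one comparison
        return -1           # worst case: target is last or absent, n comparisons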

Algorithms as Functions of Input Size

When we want to analyze an algorithm, we algorists think about every possible case we could throw at it. Usually, the two most interesting cases are the best case and the worst case. If you think of an algorithm's runtime as a function of its input, the best case is the input that minimizes the function and the worst case is the input that maximizes the function. I'm using "function" in the algebraic sense here: a set of x/y pairs (input/output pairs, or in this case "input size/number of execution steps") that draws a line.

Because an algorithm's runtime is a function of its input, we have a different best case (and worst case) for each possible input size. So sometimes we treat the best case as a single input, but it's really a set of inputs (one for each input size). The best case and worst case are very concrete things with respect to a given algorithm.

Bounds

Now what about bounds? Bounds are functions that we use to compare against a given algorithm's function. There are an infinite number of boundary functions we could consider. How many possible kinds of lines can you draw on a graph? That's how many boundary functions there are. Most algorists are usually only interested in a few specific functions: things like the constant function, the linear function, the logarithmic function, the exponential function, etc.

An upper bound is a function that sits on top of another function. A lower bound is a function that sits under the other function. When we talk about Big O and Big Omega, we don't care if the bounds are ALWAYS above or below the other function, just that after a certain point they always are (because sometimes algorithms get weird for small input sizes). 

There are an infinite number of possible upper bounds for any given function, and an infinite number of possible lower bounds for any given function. But this is one of those weird times when we're talking about different sizes of infinities. To be an upper bound, the function must not be below the other function, so we rule out the infinite number of functions below the other function (so it's smaller than the set of all possible functions).

Of course, just because there are infinite upper bounds doesn't mean they're all useful. The constant function f(n) = ∞ is an upper bound for every function, but that's like saying "I have less than an infinite number of dollars" - not particularly useful for figuring out if I'm penniless or a millionaire. So we are often interested in an upper bound that is "tight" (also known as a "least upper bound" or "supremum"), for which there is no better upper bound.

Best/Worst Case + Lower/Upper Bound

We have best/worst cases that represent the upper and lower extremes of an algorithm's runtime function. We have upper and lower bounds that represent other functions that can sit on top of or below (respectively) any given function. The two ideas can be combined to articulate key facts about algorithms.

Worst Case Lower Bound: A function that is a boundary below the algorithm's runtime function, when that algorithm is given the inputs that maximize the algorithm's run time.

Worst Case Upper Bound: A function that is a boundary above the algorithm's runtime function, when that algorithm is given the inputs that maximize the algorithm's run time.

Best Case Lower Bound: A function that is a boundary below the algorithm's runtime function, when that algorithm is given the inputs that minimize the algorithm's run time.

Best Case Upper Bound: A function that is a boundary above the algorithm's runtime function, when that algorithm is given the inputs that minimize the algorithm's run time.

Examples of Case Bounds

Let's give concrete examples of when we might care about each of these:

Worst Case Lower Bound: The classic example here is comparison-based sorting, which is famously known to be Ω(n log(n)) in the worst case. No matter what algorithm you devise, I can pick a set of worst-case inputs for which the tightest lower bound function is log-linear. You cannot make an algorithm that beats that bound for the worst case, and you shouldn't bother trying. It's the basement of sorting. Of course, there are many lower bounds for the worst case: constant, linear, and sublinear functions are all lower bounds. But they are not useful lower bounds, because the log-linear lower bound is the tightest one.
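For the curious, the standard counting argument behind that floor, sketched as a worked inequality (h is the height of the algorithm's comparison decision tree, whose leaves must distinguish all n! input orderings):

    2^h \geq n! \implies h \geq \log_2(n!) = \Theta(n \log n)

(the last step follows from Stirling's approximation).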

Best Case Lower Bound: Insertion Sort works by walking through the list and inserting any out-of-order element it comes across in the right place. If the list is sorted, it only needs to walk through the list once without doing any inserts. This means that the tightest lower bound of the best case is Ω(n). You cannot do better than that without sacrificing correctness, because you still need to walk through the list (linear time). However, the lower bound for the best case is better than the lower bound for the worst case! (The sketch after the next paragraph demonstrates both cases.)

Worst Case Upper Bound: We are often interested in finding a tight upper bound on the worst case, because then we know how poorly our algorithm can run in the worst of times. Insertion Sort's worst case is a list that is completely out of order (i.e. completely reversed from its correct order). Every time it sees a new item, it has to move it to the start of the sorted portion, shifting all previously placed items forward by one (a linear-time operation done a linear number of times, which leads to quadratic behavior). However, we still know that this insertion behavior will be O(n^2) in the worst case, acting as a tight upper bound for the worst case. It's not great, but it's better than an upper bound of, say, exponential or factorial! Of course, those are valid upper bounds for the worst case, but again, that's not as useful as knowing that quadratic is a tight upper bound.
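A minimal insertion sort in Python, instrumented with a shift counter of my own for illustration, shows both ends at once: a sorted input does a single linear walk with zero shifts, while a reversed input does the full quadratic number of shifts:

    def insertion_sort(items):
        shifts = 0
        for i in range(1, len(items)):
            x, j = items[i], i
            while j > 0 and items[j - 1] > x:   # shift larger items right
                items[j] = items[j - 1]
                j -= 1
                shifts += 1
            items[j] = x
        return shifts

    print(insertion_sort(list(range(1000))))         # sorted: 0 shifts (best case, linear walk)
    print(insertion_sort(list(range(1000, 0, -1))))  # reversed: 499500 = n(n-1)/2 shifts (worst case, quadratic)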

Best Case Upper Bound: What's the worst our algorithm can do in the best of times? In the earlier example of finding an element in a list, where the first element was the one we wanted, the upper bound is O(1). In the worst case it was linear, but in the best case, the worst that can happen is that it's still constant. This particular idea isn't usually as important as the Worst Case Upper Bound, in my opinion, because we're usually more concerned with dealing with the worst case, not the best case.

Some of these examples are actually Θ, not just O and Ω. In other cases, I could have picked lower or upper bound functions that weren't tight, but were still approximate enough to be useful (remember, if we're not being tight, I have an infinite well to draw from!). Note that it can be difficult to find compelling examples of different case/bound combinations, because the combinations have different utility.

Misconceptions and Terminology

Frequently, you'll see people with misconceptions about these definitions. In fact, many perfectly good computer scientists will use these terms loosely and interchangeably. However, the ideas of cases and bounds ARE distinct, and you would do well to make sure you understand them. Does this mean the difference will come up in your day-to-day? No. But when you are choosing between a few different algorithms, you want to read the fine print about the cases and bounds. Someone telling you that their algorithm has a Best Case Upper Bound of O(1) is probably trying to pull the wool over your eyes - make sure you ask them what the Worst Case Upper Bound is!


Let me illustrate this with an example:

The worst-case running time for quicksort is Theta(n^2). So a valid lower bound would be Omega(n) and an upper bound would be O(n^3). This says that in the worst case scenario, quicksort will take at least linear time and at most cubic time.

Now that isn't a very precise statement, but for more complex algorithms, such statements are the best that we can do.
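As a sketch of where that Theta(n^2) worst case comes from, here is a naive quicksort in Python with a first-element pivot (one common textbook choice); an already-sorted input makes every partition maximally lopsided:

    def quicksort(items):
        if len(items) <= 1:
            return items
        pivot, rest = items[0], items[1:]         # naive pivot choice
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)

    # On sorted input, left is always empty, so the recursion peels off one
    # element per level: n levels doing n + (n-1) + ... + 1 work = Theta(n^2).
    # On typical inputs the splits are balanced and the cost is Theta(n log n).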
