开发者

What Mathematical Technique is used to compare algorithm complexity?

I started reading 'Introduction to Algorithms' today, but I've gotten a bit confused over one of the exercises.

Exercise 1.2.2 asks the reader

Suppose we are comparing implementations of Merge sort and Insertion Sort on the same machine. For inputs of size n, insertion sort runs in 8n^2 steps, while merge sort runs in 64n log n steps.

For which values of n does insertion sort beat merge sort?

I first tried opening up Wolfram Alpha and using it to draw graphs of the equations, but I couldn't accurately compare the two graphs.

I then tried choosing a random value for n (200), working out the equations on paper, and then modifying the value of n based on my results.

But that took too long.

What is the correct way t开发者_开发百科o solve this exercise?


See here: 8n2 = 64n log2 n. Just put both things in a single equation.

I.e., roughly n = 43 is the limit of insertion sort's usefulness here.

Usually you would solve this by solving above equation f(n) = g(n) by solving f(n) − g(n) = 0, however, the analytical result in this case isn't pretty as you're mixing a polynomial with a logarithm function. I'd just try out a few values and see where the result flips from positive to negative. Once you have one positive and one negative point you can use bisection to narrow it down.

The brute-force way would be to simply try out all n up to a certain point. You already know that O(n2) algorithms aren't suitable for large datasets, so the n has to be quite small. For my testing it looked like this:

PS Home:\> function lb($n){[math]::Log($n)/[math]::Log(2)}  # binary logarithm
PS Home:\> 1..80 | %{,($_,(8*$_*$_),(64*$_*(lb $_)))} | %{"{0}: delta={3}, I={1}, M={2}" -f $_[0],$_[1],$_[2],($_[2]-$_[1])}
...
38: delta=1210,9597126948, I=11552, M=12762,9597126948
39: delta=1024,36393828017, I=12168, M=13192,3639382802
40: delta=824,135922911648, I=12800, M=13624,1359229116
41: delta=610,216460117852, I=13448, M=14058,2164601179
42: delta=382,549232429308, I=14112, M=14494,5492324293
43: delta=141,080604940173, I=14792, M=14933,0806049402
44: delta=−114,240561917371, I=15488, M=15373,7594380826
45: delta=−383,463082570537, I=16200, M=15816,5369174295
46: delta=−666,633601368154, I=16928, M=16261,3663986318
47: delta=−963,796734153668, I=17672, M=16708,2032658463
48: delta=−1274,99519778461, I=18432, M=17157,0048022154
...

(Excuse the horrible code; this was just a very quick doodle.)


Is there some value of n where the two are equal?

What happens above that point? Below?


For n · 43, 8n2 · 64n log n and insertion sort beats merge sort Although merge sort runs in £(n log n) worst-case time and insertion sort runs in £(n2) worst-case time, the constant factors in insertion sort make it faster for small n. Therefore, it makes sense to use insertion sort within merge sort when subproblems become sufficiently small. Consider a modification of merge sort in which subarrays of size k or less (for some k) are not divided further, but sorted explicitly with Insertion sort.


Solving the equation 8n² < 64n lg n, we arrive in the equation 2^(n/8) - n < 0.

Once I was not in a test, just doing some exercises, i used a tool to see the graph of f(n) = 2^(n/8) - n.

As questioned here, there is no natural number of n where insertion sort have the same time as the merge sort, and, for f(n) < 0, we got:

2 <= n <= 43.

I hope it to help someone :)


Since this not an algebraic inequality, there isn't an analytical way to find a solution for n in

(1)   8n² < 64n log_2 (n)

so you need to approximate it. There are many numerical methods to do it, so we will do it using a quick Python script.

First, note that Python's math.log function is the logarithm to base e, so first you need to translate your original logarithm to base 2 into a logarithm to base e using the identity

(2)   log_2 (n) = log_e (n) / log_e (2)

(Proof.)

Using (2) in (1) and simplifying yields

(3)   n / log_e (n) <= 8 / log_e (2) 

Finally, evaluating n on (3) in the script

from math import log as ln

n = 2 # Since f(n) = n/ln (n) is not defined for n < 2.

while (n/ln (n) <= 8/ln (2)):
    n += 1

print (n)

outputs n = 44, which means that 44 is the smallest value that does not satisfy (1). Therefore, n <= 43 is the condition that satisfies (1), it is to say, insertion sorts beats merge sort while the array to be sorted has, at most, 43 elements.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜