Partition an array in order
This is an interview ques开发者_开发百科tion that a friend of mine got and I'm unable to come up with how to solve it.
Question:
You are given a array of n buttons that are either red or blue. There are k containers present. The value of a container is given by the product of red buttons and blue buttons present in it. The problem is to put the buttons into the containers such that the sum of all values of the containers is minimal. Additionally, all containers must contain the buttons and they must be put in order they are given. For example, the very first button can only go to the first container, the second one can go to either the first or the second but not the third (otherwise the second container won't have any buttons). k will be less than or equal to n.
I think there must be a dynamic programming solution for this.
How do you solve this ? So far, I've only got the trivial cases where
- if (n==k), the answer would be zero because you could just put one in each container making the value of each container zero, therefore the sum would be zero.
- if (k==1), you just dump all of them and calculate the product.
- if only one color is present, the answer would be zero.
Edit:
I'll give an example.
n = 4 and k = 2
Input: R B R R
The first container gets the first two (R and B) making its value 1 (1R X 1B) The second container gets the remaining (R and R) making its value 0 (2R x 0B) The answer is 1 + 0 = 1
if k=3, the first container would have only the first button (R) the second container would have only the second one (B) the third one would have the last two buttons (R and R) Each of the containers would have value 0 and hence sum and answer would be 0.
Hope this clears up the doubts.
Possible DP solution:
Let dp[i, j] = minimum number possible if we put the first i numbers into j containers
.
dp[i, j] = min{dp[p, j - 1] + numRed[p+1, i]*numBlues[p+1, i]}, p = 1 to i - 1
Answer will be in dp[n, k]
.
int blue = 0, red = 0;
for (int i = 1; i <= n; ++i)
{
if (buttons[i] == 1)
++red;
else
++blue;
dp[i][1] = red * blue;
}
for (int i = 2; i <= n; ++i)
for (int j = 2; j <= k; ++j)
{
dp[i][j] = inf;
for (int p = 1; p <= i; ++p)
dp[i][j] = min(dp[p][j - 1] + getProd(p + 1, i), dp[i][j]);
}
return dp[n][k];
Complexity will be O(n^3*k)
, but it's possible to reduce to O(n^2*k)
by making getProd
run in O(1)
with the help of certain precomputations (hint: use dp[i][1]
). I'll post it tomorrow if no one figures out this is actually wrong until then.
It might also be possible to reduce to O(n*k)
, but that will probably require a different approach...
If I understand the question correctly, as long as every container has at least one button in it, you can choose any container to put the remaining buttons in. Given that, put one button in every container, making sure that there is at least one container with a red button and at least one with a blue button. Then with the remaining buttons, put all the red buttons in a container with a red button and put all the blue buttons in a container with blue buttons in it. This will make it so every container has at least one button and every container has only one color of buttons. Then every container's score is 0. Thus the sum is 0 and you have minimized the combined score.
Warning: Proven to be non-optimal
How about a greedy algorithm to get people talking? I'm not going to try to prove it's optimal at this point, but it's a way of approaching the problem.
In this solution, we use the G to denote the number of contiguous regions of one colour in the sequence of buttons. Say we had (I'm using x for red and o for blue since R and B look too similar):
x x x o x o o o x x o
This would give G = 6. Let's split this into groups (red/blue) where, to start with, each group gets an entire region of a consistent colour:
3/0 0/1 1/0 0/3 2/0 0/1 //total value: 0
When G <= k, you have a minimum of zero since each grouping can go into its own container. Now assume G > k. Our greedy algorithm will be, while there are more groups than containers, collapse two adjacent groups into one that result in the least container value delta (valueOf(merged(a, b)) - valueOf(a) - valueOf(b)
). Say k = 5 with our example above. Our choices are:
Collapse 1,2: delta = (3 - 0 - 0) = 3
2,3: delta = 1
3,4: delta = 3
4,5: delta = 6
5,6: delta = 2
So we collapse 2 and 3:
3/0 1/1 0/3 2/0 0/1 //total value: 1
And k = 4:
Collapse 1,2: delta = (4 - 0 - 1) = 3
2,3: delta = (4 - 1 - 0) = 3
3,4: delta = (6 - 0 - 0) = 6
4,5: delta = 2
3/0 1/1 0/3 2/1 //total value: 3
k = 3
4/1 0/3 2/1 //total value: 6
k = 2
4/1 2/4 //total value: 12
k = 1
6/5 //total value: 30
It seems optimal for this case, but I was just intending to get people talking about a solution. Note that the starting assignments of buttons to containers was a shortcut: you could instead start with each button in the sequence in its own bucket and then reduce, but you would always arrive to the point where each container has the maximum number of buttons of one colour.
Counterexample: Thanks to Jules Olléon for providing a counter-example that I was too lazy to think of:
o o o x x o x o o x x x
If k = 2, the optimal mapping is
2/4 4/2 //total value: 16
Let's see how the greedy algorithm approaches it:
0/3 2/0 0/1 1/0 0/2 3/0 //total value: 0
0/3 2/0 1/1 0/2 3/0 //total value: 1
0/3 3/1 0/2 3/0 //total value: 3
0/3 3/1 3/2 //total value: 9
3/4 3/2 //total value: 18
I'll leave this answer up since it's accomplished its only purpose of getting people talking about a solution. I wonder if the greedy heuristic could be used in an informed search algorithm such as A* to improve the runtime of an exhaustive search, but that would not achieve polynomial runtime.
I always ask for clarifications of the problem statement in an interview. Imagine that you never put blue an red buttons together. Then the sum is 0, just like n==k. So, for all cases where k > 1, then the minimum is 0.
Here is what I understand so far: The algorithm is to process a sequence of values {R,B}. It may choose to put the value in the current container or the next, if there is a next.
I first would ask a couple of questions to clarify the things I don't know yet:
Is k and n known to the algorithm in advance? I assume so.
Do we know the full sequence of buttons in advance?
If we don't know the sequence in advance, should the average value minimized? Or the maximum (the worst case)?
Idea for a proof for the algortihm by Mark Peters
Edit: Idea for a proof (sorry, couldn't fit it in a comment)
Let L(i) be the length of the ith group. Let d(i) be the diff you get by collapsing container i and i+1 => d(i) = L(i)*L(i+1).
We can define a distribution by the sequence of containers collapsed. As index we use the maximum index of the original containers contained in the collapsed container containing the containers with the smaller indexes.
A given sequence of collapses I = [i(1), .. i(m)] results in a value which has a lower bound equal to the sum of d(i(m)) for all m from 1 to n-k.
We need to proof that there can't be a sequence other then the one created by the algorithm with a smaller diff. So let the sequence above be the one resulting from the algorithm. Let J = [j(1), .. j(m)].
Here it gets skimpy: I think it should be possible to proof that the lower limit of J is larger then the actual value of I because in each step we choose by construction the collapse operation from I so it must be smaller then the matching collapse from the alternate sequence
I think we might assume that the sequences are disjunct, but I'm not completely sure about it.
Here is a brute force algorithm written in Python which seems to work.
from itertools import combinations
def get_best_order(n, k):
slices = combinations(range(1, len(n)), k-1)
container_slices = ([0] + list(s) + [len(n)] for s in slices)
min_value = -1
best = None
def get_value(slices, n):
value = 0
for i in range(1, len(slices)):
start, end = slices[i-1], slices[i]
num_red = len([b for b in n[start:end] if b == 'r'])
value += num_red * (end - start - num_red)
return value
for slices in container_slices:
value = get_value(slices, n)
if value < min_value or min_value == -1:
min_value = value
best = slices
return [n[best[i-1]:best[i]] for i in range(1, len(best))]
n = ['b', 'r', 'b', 'r', 'r', 'r', 'b', 'b', 'r']
k = 4
print(get_best_order(n, k))
# [['b', 'r', 'b'], ['r', 'r', 'r'], ['b', 'b'], ['r']]
Basically the algorithm works like this:
- Generate a list of every possible arrangement (items stay in order, so this is just a number of items per container)
- Calculate the value for that arrangement as described by the OP
- If that value is less than the current best value, save it
- Return the arrangement that has the lowest value
精彩评论