Find smallest subset sum matching another subset sum

2023-03-15 04:10 问答作者：

I have a real-world problem (not homework!) that requires finding the sum of a subset of set A that equals the sum of a subset of some other set B.

A very similar question with a helpful answer is here.

Consider this example:

@a = qw(200 2000 2000 2000 4000);
@b = qw(528 565 800 1435 2000 2000 2872);

Using the code provided in the accepted answer to that question, I get the following output:

sum(200 2000 4000) = sum(528 800 2000 2872)
sum(200 4000 4000) = sum(528 800 2000 2000 2872)
sum(200 4000) = sum(528 800 2872)
sum(200 2000 2000 2000 4000) = sum(528 800 2000 2000 2000 2872)

For my purposes, I only want the answer(s) that use the fewest elements of the input sets. In this example, I just want

sum(200 4000) = sum(528 800 2872)

because all of the other answers also have 200 and 4000 in the sum. That is, I'm looking for just the "simplest" possible sums (in the sense that they use the fewest elements). Can someone suggest a reasonably efficient way of doing this? (Brute force is okay as my arrays aren't th开发者_运维技巧at large.)

Also, I should note that the second line of the output, sum(200 4000 4000) ..., isn't correct because 4000 only appears once in @a. I'm afraid I don't understand the algorithm well enough to see why this is happening.

Any help with either of these would be much appreciated!

This problem is NP-complete, so unless P=NP you are stuck doing exponential work on the size of the input. Now the neat thing about this type of problem is that there are actually two ways to solve which will put the exponent on different aspects of the problem.

First if your sums don't have too many elements you can just brute force this problem by searching over all combinations. This method is exponential on the number of elements in the sets, and will work reasonably well up to say 20 or so elements per container. After that it is going to get pretty nasty.

A second option is to use dynamic programming. Unlike the previous method, this algorithm is exponential in the number of bits that it takes to write out each of the numbers. What you do is keep track of the set of all possible sums as well as the smallest number of elements required to generate them. This is a simple modification of the solution that you linked to in your answer.

Here is some python code that generates all the possible values they can intersect at:

    def gen_sum(a):
        r = { 0: [] }
        for x in a:
            for y, v in r.items():
                if not (x+y) in r or len(r[x+y]) > len(v) + 1:
                    r[x+y] = v + [x]
        return r

    def gen_sums(a, b):
        asum = gen_sum(a)
        bsum = gen_sum(b)
        return [ (k, v, bsum[k]) for k,v in asum.items() if k in bsum ][1:]

Running it on your sample data, I got:

[(4000, [4000], [2000, 2000]), (6000, [2000, 4000], [565, 1435, 2000, 2000]), (2000, [2000], [2000]), (4200, [200, 4000], [528, 800, 2872]), (10200, [200, 2000, 2000, 2000, 4000], [528, 565, 800, 1435, 2000, 2000, 2872]), (8200, [200, 2000, 2000, 4000], [528, 800, 2000, 2000, 2872]), (6200, [200, 2000, 4000], [528, 800, 2000, 2872])]

EDIT: Fixed a typo, and also just noticed that holy crap tons of people have already answered this really quick.

Here's an updated algorithm which gives all the sums:

my @a = qw(200 2000 2000 2000 4000);
my @b = qw(528 565 800 1435 2000 2000 2872);

my %a = sums( @a );
my %b = sums( @b );

for my $m ( keys %a ) {
    print "sum(@{$a{$m}}) = sum(@{$b{$m}})\n" if exists $b{$m};
}

sub sums {
    my( @a ) = @_;

    return () unless @a;

    my %a;

    while( @a ) {
        $a = shift @a;

        for my $m ( keys %a ) {
            $a{$m+$a} = [@{$a{$m}},$a];
        }

        $a{$a} = [$a];
    }

    return %a;
}

Al you have to do is find the shortest one(s), but others have covered that already :)

HTH,

Paul

You need to modify the recurrence relation, not just the output. Consider {1,2,20,23,42} and {45}. The original algorithm will output {1,2,42},{45} instead of {20,23},{45}. This is because 42 is considered last, and when it sums to 45, it will overwrite the bucket value at 45 previously containing {20,23}

Instead of keeping [set,sum] for each value, you need to keep [ minimum set, sum ] and then take the minimums at the end.

My perl is awful, but something like this

$a{$m+$a} = [min-length(@{$a{$m}},$a{$m+$a}[0]),$a];

where min-length returns the smaller set

I'm not great with perl. But,

for $m ( keys %a ) {
print "sum(@{$a{$m}}) = sum(@{$b{$m}})\n" if exists $b{$m};

}

Modify this line to count the number of elements in set $a and $b for each $m. When you've finished looping through all of them, select the one with the least number of elements.

继续阅读：algorithm perl subset-sum

Find smallest subset sum matching another subset sum

更多精彩内容

精彩评论

最新问答

客厅装修如何打造电视墙?你喜欢什么样的电视墙?？

在月经期同房会怎样？

在国外旅游时，想尝试当地美食但看不懂菜单怎么办，可以用线上翻译器吗？？

治疗输卵管堵塞去什么医院？

哪个专治疗输卵管堵了医院好？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？