开发者

Find shortest combination of words that contain every character from a specified set

I got an array (let's call it a1) of words (like dog, fish, run, programming anything) that's really huge.

I can combine any of the words out of a1 with any other of the words (e.g. you could combine dog and programming into dog-programming), and then again and again, until the strings gets really big.

I also got an array (let's call it a2) of strings (such as de, s, x?, umh, again they could be pretty much anything). It is guaranteed that there are no strings in a2 that cannot be found in any string of a1.

What I'm looking for is the shortest string (shortest in terms of how many combinations it takes to create that string, not how many characters that string contains) that contains all of the strings inside a2. If there are more than one string that all feature that minimum length, I can just pick any, and bail out of the program.

Now, I don't think I can just brute-force this because even if the arrays are relative small, there are virtually endless options of combining the words, but I'd love to be proven wrong!

Is there any good way to get the shortest string that will surely yield the shortest result or do I have to use heuristical algorithms to just search for pretty short strings?

I tried开发者_开发技巧 just picking a string from a1 that covered most strings from a2, then removing those items from a2, and starting again, but it doesn't work! It yields pretty good results, but not the best.


if you combine the words with a dash like in your example, e.g.

dog + programming + sky = dog-programming-sky

AND the words in A2 do not contain dashes, then it's a just SET-COVER in disguise, an NP-complete optimization problem. You can then solve the problem using any solution strategy available for SET-COVER. There is a fast approximation algorithm for SET-COVER, but if you want to have the absolute smallest solution, you'd need to resort to a worst-case exponential algorithm.

If you combine the words without a dash, e.g.

dog + programming + sky = dogprogrammingsky

then the problem is more difficult, because now e.g. "ogpro" is found in the combined word even if it isn't a substring of any of the constituent strings.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜