Does this scenario equate to any well known computer science problems?

2023-04-02 18:41 问答作者：

I'm working on a problem in the field of survey data integration, but it's easier to describe with an artificial analogy. Since the setup is a bit lengthy I'll pose my questions at the outset:

Does this scenario equate to any computer science problems whose solutions I could simply borrow and adapt?
If not, what approaches might you suggest? I'm not asking anyone to solve the problem; rather, just hoping to be pointed in one or more promising directions.

Imagine that ecosystems have creatures, which consist of one or more molecules, which consist of one or more atoms. To be viable creature, its molecules must collectively use exactly one of each type of atom. In the example below, note that every creature uses all four atoms just once. Also worth noting is that the ordering of molecules within creatures and of atoms within molecules is irrelevant.

Atoms in the universe: a b c d.

Ecosystem X
    creature x1
        molecule 1: a b
        molecule 2: c
        molecule 3: d
    creature x2
        molecule 4: a b c
        molecule 5: d
    creature x3
        molecule 6: a b c d

Ecosystem Y
    creature y4
        molecule 7: a b
        molecule 8: c
        molecule 9: d
    creature y5
        molecule 10: a b
        molecule 11: c d开发者_StackOverflow中文版

Given two ecosystems, my task is to produce "pairings". A pairing consists of a set of molecules (or molecule combinations) from one ecosystem that map to equivalent molecules (or molecule combinations) from the other ecosystem. Equivalence is determined by the underlying atoms. Like creatures, pairings are not viable unless each of the two sets of molecules (one from each ecosystem) uses all of the atoms exactly once. Here are some (not all) of the pairings from the example above:

# A direct mapping from the molecules of creature x1 to those of y4.
m1 = m7
m2 = m8
m3 = m9

# As above, but substitute m10 for m7.
m1 = m10
m2 = m8
m3 = m9

# We can combine molecules.
m4 = m7 + m8
m5 = m9

# Another combination.
m1      = m10
m2 + m3 = m11

In the real problem domain, there could anywhere from 2 to 100 atoms in play (with corresponding variety in molecule sizes) and a couple dozen creatures per ecosystem. Also, it's possible for two ecosystems to produce no viable pairings. In that case, my Python application will eventually need to suggest approximate pairings (a list of the molecule combinations that pair up, followed by a listing of the stragglers from each ecosystem).

This smells like some flavor of covering problem.

Index (hash) molecules by their atom subsets, producing mappings like {a, b} -> {m1, m7, m10}
Select an ecosystem and, by enumerating partitions, discover and index atom subsets with their other-ecosystem expansions (such as {a, b, c} -> {{{a, b}, {c}}} for m4 = m7 + m8.)
Discard any atom subsets that don't have an expansion (understanding that m1 = m7 counts as an expansion.)
From the remainder, enumerate partitions of the alphabet (set of all atoms.) From step 3, we know that any discovered partitions will be translatable into potentially many partitions in the other ecosystem, via the mapping already computed.
Select the other ecosystem and repeat steps 2-4.
De-dupe results (possibly by accumulating them in a hash set.)
Expand partitions composed of subsets of atoms back into collections of molecules with the mapping built in step 1.

The one part that seems tricky off-hand is the sub-routine that accepts a collection of subsets and enumerates the constructible partitions of some target set. Depending on the exact semantics, that may in fact be NP-hard.

继续阅读：algorithm combinations data-structures search

Does this scenario equate to any well known computer science problems?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？