Does an algorithm exist which can determine whether one regular language matches any input another regular language matches?

Posted 2023-01-14 11:50

Let's say we have regular expressions:

Hello W.*rld
Hello World
.* World
.* W.*

I would like to minimize the number of regexes required to match arbitrary input.

To do that, I need to find out whether one regular expression matches every input matched by another expression. Is that possible?

Billy3

Any regular expression can be converted to a DFA. You can minimize the DFA, and since the minimal form is unique (up to renaming of states), you can decide whether two expressions are equivalent. Dani Cricco pointed out Hopcroft's O(n log n) minimization algorithm. There is also an algorithm by Hopcroft and Karp that tests the equivalence of two DFAs directly, in almost linear time, without minimizing them first.
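As a rough illustration of the direct equivalence test, here is a minimal sketch of the union-find approach usually attributed to Hopcroft and Karp. The DFA representation (a dict with "start", "accept" and a complete "delta" table) and the example automata are assumptions made for this sketch; building the DFAs from regular expressions is not shown.

```python
def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def equivalent(dfa_a, dfa_b, alphabet):
    """Return True if two complete DFAs accept the same language."""
    machines = (dfa_a, dfa_b)

    def accepting(s):  # s is a pair (machine index, state)
        return s[1] in machines[s[0]]["accept"]

    def step(s, ch):
        return (s[0], machines[s[0]]["delta"][(s[1], ch)])

    p0, q0 = (0, dfa_a["start"]), (1, dfa_b["start"])
    parent = {p0: q0, q0: q0}  # merge the two start states
    stack = [(p0, q0)]
    while stack:
        p, q = stack.pop()
        if accepting(p) != accepting(q):
            return False  # some word reaches p and q and tells them apart
        for ch in alphabet:
            p2, q2 = step(p, ch), step(q, ch)
            parent.setdefault(p2, p2)
            parent.setdefault(q2, q2)
            r1, r2 = find(parent, p2), find(parent, q2)
            if r1 != r2:  # not yet assumed equivalent: merge and explore
                parent[r1] = r2
                stack.append((p2, q2))
    return True


# Hypothetical example DFAs over the alphabet {a, b}.
EVEN_A = {"start": 0, "accept": {0},
          "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}}
# Same language, but counting the a's modulo 4 instead of modulo 2.
EVEN_A_BIG = {"start": 0, "accept": {0, 2},
              "delta": {(i, c): (i + 1) % 4 if c == "a" else i
                        for i in range(4) for c in "ab"}}
ENDS_IN_A = {"start": 0, "accept": {1},
             "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}}
```

Calling equivalent(EVEN_A, EVEN_A_BIG, "ab") should report the two automata as equivalent even though they have different numbers of states, while equivalent(EVEN_A, ENDS_IN_A, "ab") should not.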

For a good survey on the matter and an interesting approach to this, I recommend the paper Testing the Equivalence of Regular Languages, from arXiv.

Later edit: if you are interested in inclusion rather than equivalence for regular expressions, I have come across a paper that might be of interest: Inclusion Problem for Regular Expressions. I have only skimmed through it, but it seems to contain a polynomial-time algorithm for the problem.

Yes.

The problem of equivalence of two regular languages is decidable.

Sketch of an algorithm:

minimize both DFAs
check if they are isomorphic
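The two steps of this sketch could look roughly as follows. Note that this uses Moore-style partition refinement for the minimization, which is simpler than Hopcroft's algorithm but slower in the worst case, and it checks isomorphism by renumbering states in breadth-first order from the start state. The dict-based DFA representation and the example automata are assumptions made for illustration.

```python
from collections import deque


def bfs_order(dfa, alphabet):
    # Number the reachable states in breadth-first order from the start state.
    order, queue = {dfa["start"]: 0}, deque([dfa["start"]])
    while queue:
        s = queue.popleft()
        for ch in alphabet:
            t = dfa["delta"][(s, ch)]
            if t not in order:
                order[t] = len(order)
                queue.append(t)
    return order


def minimize(dfa, alphabet):
    """Moore-style minimization of a complete DFA."""
    states = list(bfs_order(dfa, alphabet))  # unreachable states are dropped
    delta = dfa["delta"]
    # Initial partition: accepting (1) versus non-accepting (0) states.
    block = {s: int(s in dfa["accept"]) for s in states}
    while True:
        # Split blocks by the blocks that their successors fall into.
        sig = {s: (block[s],) + tuple(block[delta[(s, ch)]] for ch in alphabet)
               for s in states}
        ids = {}
        new_block = {s: ids.setdefault(sig[s], len(ids)) for s in states}
        if len(ids) == len(set(block.values())):
            break  # no block was split, so the partition is stable
        block = new_block
    return {"start": block[dfa["start"]],
            "accept": {block[s] for s in states if s in dfa["accept"]},
            "delta": {(block[s], ch): block[delta[(s, ch)]]
                      for s in states for ch in alphabet}}


def canonical(dfa, alphabet):
    # Rename states by BFS order; isomorphic DFAs get identical tables.
    order = bfs_order(dfa, alphabet)
    table = {(order[s], ch): order[dfa["delta"][(s, ch)]]
             for s in order for ch in alphabet}
    return table, {order[s] for s in order if s in dfa["accept"]}


def same_language(dfa_a, dfa_b, alphabet):
    return (canonical(minimize(dfa_a, alphabet), alphabet)
            == canonical(minimize(dfa_b, alphabet), alphabet))


# Hypothetical example DFAs over the alphabet {a, b}.
EVEN_A = {"start": 0, "accept": {0},
          "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}}
EVEN_A_BIG = {"start": 0, "accept": {0, 2},
              "delta": {(i, c): (i + 1) % 4 if c == "a" else i
                        for i in range(4) for c in "ab"}}
ENDS_IN_A = {"start": 0, "accept": {1},
             "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}}
```

Here minimize(EVEN_A_BIG, "ab") collapses the four states into two, after which the renumbered transition tables of the two "even number of a's" automata coincide.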

Sure! A regular expression can be represented as an FSM (finite state machine), and there are infinitely many FSMs that recognize the same language.

Two FSMs are isomorphic if they are identical up to a renaming of their states, and two minimal DFAs that accept the same language are always isomorphic. There are several algorithms for minimizing an FSM. For example, Hopcroft's minimization algorithm minimizes an n-state automaton in O(n log n).

This problem is called "inclusion" or "subsumption" of regular expressions, because what you are asking is whether the set of words matched by one regexp includes (or subsumes) the set of words matched by the other regexp. Equality is a different question: it asks whether two regexps match exactly the same words, i.e. whether they are functionally equivalent. For example, "a*" includes "aa*", but they are not equal.

All known algorithms for regexp inclusion take time exponential in the size of the regexp in the worst case. The standard algorithm is as follows:

Input: r1 and r2
Output: Yes if r1 includes r2

1. Create DFA(r1) and DFA(r2)
2. Create Neg(DFA(r1)) (which matches exactly those words r1 does not match)
3. Create Neg(DFA(r1)) x DFA(r2) (which matches exactly those words matched by both Neg(DFA(r1)) and DFA(r2))
4. Check that the automaton built in step 3 does not match any word

This works, since what you are checking is that there are no words matched by r2 that are not matched by r1.
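A minimal sketch of this check, assuming both DFAs have already been built and are complete over the same alphabet. Instead of materializing Neg(DFA(r1)) and the product automaton, it explores the product on the fly: a pair of states is accepting in the product exactly when the first component is non-accepting in DFA(r1) and the second is accepting in DFA(r2). The dict-based representation and the two example automata are assumptions made for illustration.

```python
from collections import deque


def includes(dfa1, dfa2, alphabet):
    """Return True if every word accepted by dfa2 is also accepted by dfa1."""
    start = (dfa1["start"], dfa2["start"])
    seen, queue = {start}, deque([start])
    while queue:
        s1, s2 = queue.popleft()
        # Accepting state of Neg(dfa1) x dfa2: rejected by dfa1, accepted by dfa2.
        if s1 not in dfa1["accept"] and s2 in dfa2["accept"]:
            return False
        for ch in alphabet:
            nxt = (dfa1["delta"][(s1, ch)], dfa2["delta"][(s2, ch)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True  # the product automaton matches no word


# Hand-built DFAs over the one-letter alphabet {a}.
A_STAR = {"start": 0, "accept": {0}, "delta": {(0, "a"): 0}}   # a*
AA_STAR = {"start": 0, "accept": {1},                          # aa*
           "delta": {(0, "a"): 1, (1, "a"): 1}}
```

With these inputs, includes(A_STAR, AA_STAR, "a") holds, while includes(AA_STAR, A_STAR, "a") does not, because the empty word is matched by a* but not by aa*.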

Tags: computer-science, regex, theory
