Concurrent parsing algorithms

2023-03-22 23:14

Are there existing parser algorithms (similar to LALR, SLR and LL) that can parse a single input, not just multiple inputs, in parallel?

Edit: Sorry, I wasn't really looking for research papers. I was hoping for real-world examples, such as "There are compiler-compilers that generate concurrent parsers" or "This compiler for this language parses it in parallel".

There aren't any well-known ones :-}

Much of the reason is that the problem is usually described as parsing a string by presenting it to the parser one token at a time. That makes the problem sequential by definition, ugh.

One could imagine presenting the whole array of tokens to a parser at once, and having the parser parse substrings at various points across the array, then stitch compatible trees for those substrings together. The stitching process is likely to be complicated, but it might be manageable if driven by an L(AL)R [better, a GLR] parser that swallows nonterminals left-to-right after most of the parse trees for the substrings have been produced; think of this as an "accumulator".
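To make the "parse substrings, then stitch" idea concrete, here is a toy sketch. It is not an L(AL)R or GLR accumulator; it only handles the balanced-parentheses language, where each slice of the token array can be summarized on its own as (unmatched closers, unmatched openers) and the summaries are then swallowed left to right. The names `summarize`, `stitch` and `balanced_parallel` are made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def summarize(tokens):
    """Parse one slice in isolation; return (unmatched ')' count, unmatched '(' count)."""
    closers = openers = 0
    for tok in tokens:
        if tok == "(":
            openers += 1
        elif tok == ")":
            if openers:
                openers -= 1   # matched inside this slice
            else:
                closers += 1   # can only be matched by a slice further left
    return closers, openers


def stitch(left, right):
    """Combine the summaries of two adjacent slices (an associative operation)."""
    l_close, l_open = left
    r_close, r_open = right
    matched = min(l_open, r_close)   # left's open parens meet right's closers
    return l_close + r_close - matched, l_open - matched + r_open


def balanced_parallel(tokens, slices=4):
    """Hypothetical driver: summarize slices independently, then accumulate."""
    size = max(1, -(-len(tokens) // slices))   # ceiling division
    chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(summarize, chunks))
    # The "accumulator": swallow slice results left to right.
    return reduce(stitch, summaries, (0, 0)) == (0, 0)
```

In CPython the threads will not actually make this faster; the point is only that the per-slice work has no dependencies between slices. A real grammar needs far richer summaries than a pair of counters, which is where the stitching gets complicated.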

[Shades, a quick Google search produces a 1990 Japanese paper on doing parallel GLR with what amounts to parallel Prolog]

You now have the problem of producing the array of tokens magically in parallel. Now you need a parallel lexer :-}
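For some languages the lexing side can be sidestepped under a strong assumption: if no token can ever span a line break, the input can be cut at newlines and each line lexed independently. A minimal sketch of that, with a hypothetical toy token grammar (`TOKEN_RE`) and made-up function names; multi-line strings or comments break the assumption and need something smarter.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy grammar: integers, identifiers, single-character operators.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))")


def lex_line(line):
    """Sequentially lex one piece of text into (kind, text) pairs."""
    tokens = []
    line = line.rstrip()
    pos = 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:          # defensive; only whitespace would be left
            break
        number, ident, op = m.groups()
        if number is not None:
            tokens.append(("NUM", number))
        elif ident is not None:
            tokens.append(("ID", ident))
        else:
            tokens.append(("OP", op))
        pos = m.end()
    return tokens


def lex_parallel(source):
    """Lex every line independently, then concatenate in input order."""
    lines = source.splitlines()
    with ThreadPoolExecutor() as pool:
        per_line = pool.map(lex_line, lines)   # map() preserves input order
        return [tok for line_tokens in per_line for tok in line_tokens]
```

For example, `lex_parallel("x = 42\ny = x + 1\n")` yields the same token list as lexing the whole string sequentially with `lex_line`. The result is the "array of tokens" that a parallel parser would then consume.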

EDIT June 2013: I finally remembered McKeeman's 1982 paper on parallel LR parsing.

I'm working on a deterministic context-free parsing algorithm with O(N) work complexity, O(log N) time complexity, fine-grained parallelism proportional to the length of the input string, and behavior equivalent to an LR parser. I'll be submitting it for peer review shortly.

The main idea is to process each character in the input stream independently, assume that it could match any rule, then piece together neighboring groups of characters until they unambiguously match a single rule. Once a rule is matched, it is filtered out by the algorithm. After all rules are matched, tokens are gathered together into a sequence.
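The algorithm itself is not spelled out here, so the following is only an analogy for the "start from single characters, merge neighbors" shape. It uses a standard data-parallel trick for finite automata rather than a context-free parser: every character is mapped to its state-transition function, and neighboring functions are composed pairwise. The DFA (unsigned decimal literals such as `3.14`) and all names are assumptions made for illustration.

```python
# States of a hypothetical DFA for literals like "42" or "3.14".
START, INT, DOT, FRAC, DEAD = range(5)
ACCEPTING = {INT, FRAC}


def step(state, ch):
    """Ordinary sequential DFA transition."""
    if ch in "0123456789":
        return {START: INT, INT: INT, DOT: FRAC, FRAC: FRAC}.get(state, DEAD)
    if ch == ".":
        return DOT if state == INT else DEAD
    return DEAD


def char_map(ch):
    """One character's transition function: a tuple indexed by start state.
    No assumption is made about which state the character is reached in."""
    return tuple(step(s, ch) for s in range(5))


def compose(left, right):
    """Transition function of the concatenation: run left, then right."""
    return tuple(right[left[s]] for s in range(5))


def tree_reduce(maps):
    """Merge neighbors pairwise; return (combined map, number of rounds)."""
    rounds = 0
    while len(maps) > 1:
        merged = [compose(maps[i], maps[i + 1]) for i in range(0, len(maps) - 1, 2)]
        if len(maps) % 2:
            merged.append(maps[-1])   # odd one out waits for the next round
        maps = merged
        rounds += 1
    return maps[0], rounds


def accepts(text):
    if not text:
        return False
    combined, _ = tree_reduce([char_map(c) for c in text])
    return combined[START] in ACCEPTING
```

The merges within one round are independent of each other, so with enough processors the running time is bounded by the number of rounds, roughly log2 N, while the total number of `compose` calls is N - 1. This sketch runs each round sequentially and only counts the rounds.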

There is some complexity involved in handling tokens with wildcards that could partially nest, and a post-processing step is needed to handle these and maintain the worst-case O(log N) complexity. This step probably is not needed in practice.
