Best way to parse a language that's ALMOST Python?

I'm working on a domain-specific language implemented on top of Python. The grammar is so close to Python's that until now we've just been making a few trivial string transformations and then feeding the result into ast. For example, indentation is replaced by #endfor/#endwhile/#endif statements, so we normalize the indentation while it's still a string.
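For concreteness, here is a minimal sketch of what such a normalization pass might look like. It is not our actual code: the function name, the treatment of else/elif, and the rule that a block opens on any line ending in a colon are all assumptions made for illustration. Only the three #end markers come from the description above.

```python
import ast

# Block-end markers named in the question; anything else starting with "#"
# is treated as an ordinary comment.
BLOCK_END = {"#endfor", "#endwhile", "#endif"}
# Assumption: these keywords continue the enclosing block, as in Python.
CONTINUATIONS = {"else", "elif"}


def normalize_indentation(source, indent="    "):
    """Re-indent DSL text using explicit block-end markers (hypothetical helper)."""
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()  # discard whatever indentation the author used
        if not line:
            out.append("")
            continue
        if line in BLOCK_END:
            depth -= 1
            if depth < 0:
                raise SyntaxError("unmatched block end: " + line)
            continue  # the marker itself is dropped from the output
        first_word = line.replace(":", " ").split()[0]
        continuation = first_word in CONTINUATIONS
        # else/elif are emitted one level out, at the depth of their opener
        level = depth - 1 if continuation else depth
        out.append(indent * level + line)
        # Simplification: a block opens only when the line ends with ":"
        if line.endswith(":") and not line.startswith("#") and not continuation:
            depth += 1
    if depth != 0:
        raise SyntaxError("missing block end marker")
    return "\n".join(out) + "\n"


dsl = """\
for i in range(3):
if i % 2:
total += i
else:
total -= i
#endif
#endfor
"""
tree = ast.parse(normalize_indentation(dsl))
```

A pass like this breaks on colons inside strings, trailing comments after a colon, and multi-line expressions, which is part of why I'm asking about a real parser.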

Is there a better way? As far as I can tell, ast is hardcoded to parse the Python grammar, and the only documentation I can find is http://docs.python.org/library/ast.html#module-ast (and the source itself, I suppose).

Does anyone have personal experience with PyParsing, ANTLR, or PLY?

There are vague plans to rewrite the interpreter into something that transforms our language into valid Python and feeds that into the Python interpreter itself, so I'd like something compatible with compile, but this isn't a deal breaker.

Update: It just occurred to me that

from __future__ import print_function, with_statement

changes the way Python parses the following source. However, PEP 236 suggests that this is syntactic window dressing for a compiler feature. Could someone confirm that trying to override/extend __future__ is not the correct solution to my problem?
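A quick experiment seems to support that reading. In the sketch below (written against CPython 3, where the `__future__` module exposes `all_feature_names` and a `compiler_flag` per feature), each future feature turns out to be just a fixed bit passed to compile. The `barry_as_FLUFL` feature is used only because it visibly changes what parses; the `parses` helper is a name made up for this example.

```python
import __future__

# Every future feature is a predefined flag; the list is fixed by the
# interpreter, and nothing here offers a hook for registering new syntax.
features = {
    name: getattr(__future__, name).compiler_flag
    for name in __future__.all_feature_names
}


def parses(source, flags=0):
    """Return True if `source` compiles as an expression under `flags`."""
    try:
        compile(source, "<demo>", "eval", flags=flags, dont_inherit=True)
        return True
    except SyntaxError:
        return False


flufl = __future__.barry_as_FLUFL.compiler_flag
without_flag = parses("1 <> 2")
with_flag = parses("1 <> 2", flufl)
```

So the import appears to toggle behavior that already exists in the compiler rather than extend the grammar.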


PLY works. It's odd because it mimics lex/yacc in a way that's not terribly pythonic.

Both lex and yacc have an implicit interface that makes it possible to run their output as a stand-alone program. PLY carefully preserves this "feature" on both the lex-like and the yacc-like side, including the weird, implicit stand-alone main program.

However, PLY as a lex/yacc-compatible toolset is quite nice. All your lex/yacc skills are preserved.

[Editorial Comment. "Fixing" Python's grammar will probably be a waste of time. Almost everyone can indent correctly without any help. Check C, Java, C++ and even Pascal code, and you'll see that almost everyone can indent really well. Indeed, people go to great lengths to indent Java where it's not needed. If indentation is unimportant in Java, why do people do such a good job of it?]
