Where can I find a formal grammar for the Perl programming language?
I understand that the Perl syntax is ambiguous and that its disambiguation is non-trivial (sometimes involving execution of code during the compile phase). Regardless, does Perl have a formal grammar (albeit ambiguous and/or context-sensitive)?
From perlfaq7
Can I get a BNF/yacc/RE for the Perl language?
There is no BNF, but you can paw your way through the yacc grammar in perly.y in the source distribution if you're particularly brave. The grammar relies on very smart tokenizing code, so be prepared to venture into toke.c as well.
In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors."
For a wonderful set of examples of WHY Perl is nearly impossible to parse because of context influences, see Randal Schwartz's post: On Parsing Perl
In addition, please see the discussion in "Perl 5 Internals (Chapter 5. The Lexer and the Parser)" by Simon Cozens.
Please note that the answer is different for Perl6:
There exists a grammar for Perl6
Rakudo Perl has its own version of the grammar
Other people have posted this link before on similar questions, but I think it is fun and has a great case example: Perl Cannot Be Parsed (A Formal Proof).
From that link:
[Consider] the following devilish snippet of code, concocted by Randal Schwartz, and determine the correct parse for it:
whatever / 25 ; # / ; die "this dies!";
Schwartz's Snippet can parse two different ways: if whatever is nullary (that is, takes no arguments), the first statement is a division in void context, and the rest of the line is a comment. If whatever takes an argument, Schwartz's Snippet parses as a call to the whatever function with the result of a match operator, then a call to the die() function.
This means that, in order to statically parse Perl, it must be possible to determine from a string of Perl 5 code whether it establishes a nullary prototype for the whatever subroutine.
I just post this part to show that it gets really hard really quickly.
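The dependence on the prototype can be modelled with a small toy in Python. This is only a sketch of the idea, not how toke.c works: the function name tokenize_after_bareword and the nullary_subs set are hypothetical stand-ins for the symbol-table lookup the real lexer performs, and the "parse" is reduced to deciding what the first slash means.

```python
import re

def tokenize_after_bareword(source, nullary_subs):
    """Toy model: decide how the '/' after a leading bareword is read.

    nullary_subs is a hypothetical stand-in for what the compiler already
    knows about declared subroutines with an empty prototype.
    """
    m = re.match(r"\s*(\w+)\s*", source)
    name, rest = m.group(1), source[m.end():]
    if name in nullary_subs:
        # An operator is expected next, so '/' is division
        # and '#' starts a comment that runs to the end of the line.
        code = rest.split("#", 1)[0]
        return ("division", name, code.strip())
    # A term (argument) is expected next, so '/' opens a match
    # operator that runs to the next '/'; the '#' is part of the pattern.
    pattern = re.match(r"/(.*?)/", rest)
    return ("call_with_match", name, pattern.group(1), rest[pattern.end():].strip())

snippet = 'whatever / 25 ; # / ; die "this dies!";'

# Same text, two readings, depending only on outside knowledge.
as_nullary = tokenize_after_bareword(snippet, {"whatever"})
as_unary = tokenize_after_bareword(snippet, set())
print(as_nullary)
print(as_unary)
```

In the first reading the die never appears as code; in the second it is a live statement. The text alone does not say which reading is right, which is the point of the quoted argument.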
Alternatively, many code/text editors do a decent (though never great) job of syntax highlighting, so you could start with their syntax definitions to see how they handle it. In fact, you have inspired me: I think I will post a related question asking which editor highlights Perl best.
There is no formal grammar in the sense of "this is the specification of Perl 5" (though the Perl 6 effort is trying to fix that). But there is a formal grammar in the Perl 5 source code. Of course, understanding that code is most likely not a trivial undertaking.
Jeffrey Kegler has also written some good articles about the Perl grammar on his blog. In particular, see this post and this one. The rest of the blog has some quite interesting thoughts on parsing in general as well.