How to create regexp parsing pascal-like function declaration with body?
How to create (and is this possible) regexp parsing pascal-like function declaration with body ? I've created some regexp
function\s+(\w+)(\(((((var\s*)?(\w+)(\s*\,+\s*)?)+?\s*\:\s*(\w+)\s*\;?\s*?)\s*)+\))?\s*\:\s*(\w+)
which can pool only functions prototypes (it works only if there is no comments, so i clear comm开发者_Python百科ents before parsing ) and i have no idea how to change it to make it pool functions with bodies. The problem is there are can be many of "begin - end" blocks, so it is hard to find functions ending
Sorry, but you are using the wrong tool. Programming languages have a context-free structure that regular expressions simply cannot recognize reliably. Properly nested parentheses like { () [] } { }
are an example for such a context-free structure for which you cannot find a regular expression that checks the proper nesting.
To solve the problem, you could use regular expression to break down program code into a stream of tokens and then use a (manually coded) top-down parser to check the structure of this token stream. To learn about this, consult any book about compiler design. Scanning (breaking into tokens) and parsing (checking structure) are always the first chapters. The Wikipedia entry for a top-down parser provides an example.
精彩评论