Python script to match a C function signature spanning multiple lines
I am reading a .c file to find the functions defined in it and count the number of lines in each function. My problem is that I cannot match a function name/signature that spans multiple lines. I have a list of the function names in the .c file, and I match the names in this list against the functions in the .c file for further processing.
For example, my .c file is:
int main(
void
)
Here the signature of main spans three lines,
and I have a list of functions such as:
int main(void);
How can I match "int main(void)" with the multi-line main in the .c file? I want to start counting lines once the function is matched.
I suggest you write a simple parser for the C language.
One of the examples in the ANTLR book does something similar to what you're after.
Pyparsing is a very nice Python library for writing parsers.
Here is a parser for ANSI C: http://code.google.com/p/pycparser/ (written using another Python parser library, Ply).
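If the definition matches the prototype "exactly", you can use a regex:
int\s+main\s*\(\s*void\s*\)
where \s* means zero or more whitespace characters and \s+ means one or more. The trailing ; of the prototype is left out, because a definition in the .c file is followed by { rather than ;.
\s already matches newlines, so the pattern spans multiple lines without any extra flags (re.MULTILINE only changes ^ and $, and re.DOTALL only changes .). You can define it like:
import re
RE_MAIN = re.compile(r'int\s+main\s*\(\s*void\s*\)')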
By "exactly" I mean that it does not match a function written like
int main();
(void omitted)
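Since you already have a list of prototypes, the pattern does not have to be written by hand for each function. A minimal sketch, assuming every entry in the list is a one-line prototype like "int main(void);" (the helper name signature_regex and the sample source are made up for illustration). It drops the trailing ; and allows arbitrary whitespace, including newlines, between tokens:

```python
import re

def signature_regex(prototype):
    """Build a whitespace-tolerant regex from a one-line prototype (hypothetical helper)."""
    # Drop the trailing ';' because a definition is followed by '{' instead.
    text = prototype.strip().rstrip(';')
    # Split into word tokens (identifiers, keywords) and single punctuation characters.
    tokens = re.findall(r'\w+|[^\w\s]', text)
    parts = []
    for prev, tok in zip([None] + tokens, tokens):
        if prev is not None:
            # Two adjacent word tokens need at least one whitespace character between them;
            # anywhere else whitespace is optional.
            both_words = re.match(r'\w', prev) and re.match(r'\w', tok)
            parts.append(r'\s+' if both_words else r'\s*')
        parts.append(re.escape(tok))
    return re.compile(''.join(parts))

# Sample source with the signature spread over three lines, as in the question.
source = "int main(\nvoid\n)\n{\n    return 0;\n}\n"
match = signature_regex("int main(void);").search(source)
# 1-based line number on which the matched signature starts.
start_line = source.count("\n", 0, match.start()) + 1
```

match.end() is then the position from which you can start looking for the function body.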
This way you can find where the function begins. From there, a simple character scanner can count { and } to find the end of the body, taking care to ignore braces inside comments, character constants, and string constants.
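The scanner could look roughly like this. It is only a sketch under some assumptions: the name find_body_end and the sample source are invented for the example, and it does not handle preprocessor tricks such as braces hidden in macros or #if branches, or backslash-continued // comments.

```python
def find_body_end(source, pos):
    """Return the index of the '}' that closes the first '{' at or after pos, or -1."""
    depth = 0
    i = pos
    n = len(source)
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if two == '//':
            # Line comment: skip to the end of the line.
            j = source.find('\n', i)
            i = n if j == -1 else j
        elif two == '/*':
            # Block comment: skip past the closing */.
            j = source.find('*/', i + 2)
            i = n if j == -1 else j + 2
        elif ch in '"\'':
            # String or character constant: skip to the matching quote,
            # stepping over backslash escapes such as \" or \'.
            i += 1
            while i < n and source[i] != ch:
                i += 2 if source[i] == '\\' else 1
            i += 1
        else:
            if ch == '{':
                depth += 1
            elif ch == '}':
                depth -= 1
                if depth == 0:
                    return i
            i += 1
    return -1

# Sample function with braces inside a char constant, a comment, and a string.
sample = "int main(void)\n{\n    char c = '}';  /* } */\n    puts(\"{\"); // {\n    return 0;\n}\n"
end = find_body_end(sample, 0)
# Lines from the start of the signature through the closing brace.
line_count = sample.count('\n', 0, end) + 1
```

In real use pos would be the end of the regex match for the signature, and the count would start from the match position instead of 0.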