How to create a regular expression to match function definitions
I need to find function definitions like
function (param1, param2, param3)
I am using the following regular expression in Python
\S+\\((\S+|\s+|,)\\)
so that something like
re.findall("\S+\\((\S+|\s+|,)\\)",source_code_string)
should give me all the function names, but it's not working. Please suggest improvements to the above regular expression. I am new to regular expressions.
The answer is going to depend on what language the source files are written in. Recall that in Python, function definitions are prefixed by def and suffixed by a colon (:). Expanding on Stema's answer, try this for Python:
^\s*def (\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\):$
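This should match only Python function definitions. The ^ and $ anchors match only at the beginning and end of a line, respectively (pass re.MULTILINE when searching a multi-line string; without it they anchor to the start and end of the whole string), so this will only find function defs on their own line, as they usually are in Python.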
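As a rough sketch of how the pattern above might be used: the sample source below is made up purely for illustration, and re.MULTILINE is my assumption about how you would want to scan a whole file at once.

```python
import re

# Hypothetical sample source, invented only for this illustration.
source_code_string = """\
def add(a, b):
    return a + b

class Greeter:
    def greet(self, name):
        print("hello", name)

result = add(1, 2)
"""

# The pattern from this answer. re.MULTILINE makes ^ and $ match at
# every line boundary instead of only at the ends of the whole string.
pattern = re.compile(r'^\s*def (\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\):$', re.MULTILINE)

# With a single capturing group, findall returns just the captured names.
function_names = pattern.findall(source_code_string)
print(function_names)  # ['add', 'greet']
```

Note that, as written, the pattern requires at least one parameter, so a definition such as def main(): would not be matched.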
Your regex is fundamentally wrong.
\S+\\((\S+|\s+|,)\\)
means: match at least one non-whitespace character, an opening bracket, then either a series of non-whitespace characters OR a series of whitespace characters OR a single comma, and then the closing bracket.
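To see what that means in practice, here is a small sketch that runs the question's pattern (rewritten as a raw string, which should give the same regex) against two made-up inputs. The input strings are only assumptions for illustration.

```python
import re

# The question's pattern as a raw string: \S+ \( (one alternative) \)
original = re.compile(r'\S+\((\S+|\s+|,)\)')

# A single argument matches, but findall returns the contents of the
# only capturing group, which is the argument and not the function name.
single_arg = original.findall("foo(a)")
print(single_arg)  # ['a']

# "a, b" is neither one run of non-whitespace, nor one run of
# whitespace, nor a lone comma, so nothing matches at all.
two_args = original.findall("foo(a, b)")
print(two_args)  # []
```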
I think what you meant was this (use raw strings (r'') and escape only once)
(\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\)
See it here on Regexr
You can then find the name of your function in capturing group 1 (because of the brackets around the first \S+).
The \s* parts match optional whitespace.
BUT this regex is so simple that I am sure it will not find all functions (it will fail on nested brackets), and it will also find other stuff.
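A minimal sketch of the corrected pattern with re.findall, assuming some made-up input lines (only the first one comes from the question). It also shows the "other stuff" problem: anything that looks like name(args) is picked up.

```python
import re

# Hypothetical input; the first line is the example from the question.
source_code_string = """\
function (param1, param2, param3)
foo(a, b)
if (ready)
"""

# Raw string, so each backslash is written only once.
pattern = re.compile(r'(\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\)')

# Group 1 is the only capturing group, so findall returns the names.
names = pattern.findall(source_code_string)
print(names)  # ['function', 'foo', 'if']
```

The 'if' entry is a false positive, so in real code you would probably need to filter out keywords or use a proper parser.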
It's not exactly clear what you are looking for, but consider a few things.
\w+ will match any word, which can contain letters, numbers, underscores, and most other Unicode word-like characters.
Using a raw string when dealing with Python regexes is preferred, as you don't have to escape backslashes twice. This means that you need to prefix every regex pattern with an r, like r'this'. Otherwise, to match a literal backslash, you need to use '\\\\' (with a raw string, r'\\' is enough).
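A short sketch of both points, using made-up sample strings (the path and the parameter names are just assumptions for the demo):

```python
import re

# Both literals hold the same two-character pattern: backslash, backslash.
# As a regex, that pattern matches ONE literal backslash.
plain_pattern = "\\\\"
raw_pattern = r"\\"
print(plain_pattern == raw_pattern)  # True

# Hypothetical sample text containing two backslashes.
sample_path = r"C:\temp\file.txt"
backslashes = re.findall(raw_pattern, sample_path)
print(len(backslashes))  # 2

# \w+ picks out word-like runs: letters, digits and underscores.
words = re.findall(r"\w+", "param_1, param2")
print(words)  # ['param_1', 'param2']
```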
When in doubt, check the library docs, or another source on regexes.