Regular expressions in Python
I find regular expressions in Python pretty tough to understand, and the documentation is too cryptic. For instance, what would be the regex for removing all instances of #if DEBUG and everything enclosed between it and its corresponding #endif in a C file? The following is not working:
buf = file.read()
p_macro = re.compile("#if.DEBUG?#endif", re.MULTILINE + re.DOTALL)
string1 = re.sub(p_macro, '', buf)
If you want to remove all instances of #if DEBUG, all you have to do is define DEBUG as 0 and run the preprocessor on the file. No need for nasty regular expressions.
Also, it's generally not a good idea to operate on a context-free language (C source, for example, or more notoriously, HTML) using regular expressions. Use a parsing library. Check out the Eclipse SDK, for example: http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.jdt.doc.isv/reference/api/overview-summary.html
Python's re module shares most of its syntax with PCRE. You can learn the syntax from http://www.regular-expressions.info/tutorial.html.
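Your code does not work because in
#if.DEBUG?#endif
//      ^^
the G? actually means "zero or one G character". The ? applies only to the G, so the pattern matches #if, any single character, DEBU, an optional G, and then #endif immediately afterwards, with nothing allowed in between.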
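To see this for yourself, here is a small sketch that runs the pattern from the question against a few strings. The sample strings and the variable names broken and c_source are made up for illustration.

```python
import re

# The pattern from the question, unchanged.
broken = re.compile("#if.DEBUG?#endif", re.MULTILINE | re.DOTALL)

# 'G?' makes only the G optional, so both of these match in full.
with_g = broken.fullmatch("#if DEBUG#endif")
without_g = broken.fullmatch("#if DEBU#endif")
print(with_g is not None, without_g is not None)

# Anything between DEBUG and #endif prevents a match.
c_source = '#if DEBUG\nprintf("x");\n#endif\n'
print(broken.search(c_source))
```

The last call finds nothing because the pattern has no part that can consume the lines between the two directives.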
If you want to remove the whole #if DEBUG block, try
re.compile(
    r'^\s*#if\s+DEBUG'  # match the '#if DEBUG' preprocessor directive.
    r'.*?'              # match all content in between until...
    r'^\s*#endif'       # ... getting a '#endif' and match it
    ,
    re.S | re.M
)
but it will not work with nested #if blocks, and it won't check whether the directive is inside a /* ... */ comment. It's better to use a C preprocessor (CPP) parser for correctness.
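To make the nesting caveat concrete, here is a minimal sketch that applies the regex above with re.sub and compares it with a line-based variant that counts #if/#endif depth. The helper strip_debug_blocks, the pattern names, and the sample input are all hypothetical. The helper still ignores comments, string literals, line continuations, and #else/#elif branches (it drops them along with the rest of the block), so treat it as a starting point, not a substitute for a real preprocessor.

```python
import re

# The regex from the answer above, kept as-is for comparison.
SIMPLE = re.compile(r'^\s*#if\s+DEBUG.*?^\s*#endif', re.S | re.M)

# Hypothetical line-level patterns for the depth-counting variant.
IF_DEBUG = re.compile(r'^\s*#\s*if\s+DEBUG\b')
ANY_IF = re.compile(r'^\s*#\s*if')      # also matches #ifdef and #ifndef
ENDIF = re.compile(r'^\s*#\s*endif\b')


def strip_debug_blocks(source):
    """Drop '#if DEBUG' blocks, including conditionals nested inside them."""
    kept = []
    depth = 0  # 0 means we are outside any '#if DEBUG' block
    for line in source.splitlines(keepends=True):
        if depth == 0:
            if IF_DEBUG.match(line):
                depth = 1          # entering the block, drop this line
            else:
                kept.append(line)
        elif ANY_IF.match(line):
            depth += 1             # nested conditional inside the block
        elif ENDIF.match(line):
            depth -= 1             # closes the innermost conditional
        # every other line inside the block is dropped
    return ''.join(kept)


sample = (
    "int main(void) {\n"
    "#if DEBUG\n"
    "    log_start();\n"
    "#ifdef VERBOSE\n"
    "    log_more();\n"
    "#endif\n"
    "    log_end();\n"
    "#endif\n"
    "    return 0;\n"
    "}\n"
)

regex_result = SIMPLE.sub('', sample)
nested_result = strip_debug_blocks(sample)
print(regex_result)    # stops at the inner #endif, so log_end() and a stray #endif remain
print(nested_result)
```

The regex stops at the first #endif it sees, which belongs to the inner #ifdef, while the depth counter only resumes keeping lines once the #endif that closes #if DEBUG has been consumed.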
If Kodos, the Python Regular Expression Debugger, is available on your development platform, you'll have an easier time crafting and testing regular expressions.