Small problem with regexps in Python

2023-01-30 08:50

I have a variable that holds the entire contents of a file, and I need to remove all comments from it. One of my regexp lines is this:

x=re.sub('\/\*.*\*\/','',x,re.M,re.S);

What I want this to do is remove all multi-line comments. For some odd reason, though, it's skipping two instances of */ and removing everything up to the third instance of */.

I'm pretty sure the reason is that the third instance of */ has code after it, while the first two are by themselves on the line. I'm not sure why this matters, but I'm pretty sure that's why.

Any ideas?

.* always matches as many characters as possible. Try .*? instead, which matches as few characters as possible. The parentheses in (.*?) are optional; they only add a capturing group. So your whole pattern should look like this: \/\*.*?\*\/ or \/\*(.*?)\*\/

The expression .* is greedy, meaning that it will attempt to match as many characters as possible. Instead, use .*? (or (.*?)), which stops matching as soon as the rest of the pattern can match.
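The difference between the two quantifiers can be sketched on a small sample. This is only an illustration: the sample text and the names `source`, `greedy`, and `lazy` are made up, not taken from the question. It also assumes a Python version where `re.sub` accepts a `flags` keyword; the fourth positional argument of `re.sub` is `count`, so passing `re.M, re.S` positionally does not set both flags.

```python
import re

# Hypothetical sample input: two comments with code between them.
source = """int a = 1; /* first
comment */
int b = 2; /* second */ int c = 3;
"""

# Greedy: .* runs to the LAST */ in the text, so the code between
# the two comments is removed along with them.
greedy = re.sub(r'/\*.*\*/', '', source, flags=re.S)

# Lazy: .*? stops at the first */ after each /*, so each comment
# is removed on its own.
lazy = re.sub(r'/\*.*?\*/', '', source, flags=re.S)

print(repr(greedy))
print(repr(lazy))
```

With re.S set, the dot also matches newlines, which is what lets a single match span a multi-line comment.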

The regular expression is "greedy": when presented with several possible stopping points, it takes the farthest one. Regex has some constructs to help control this, in particular the negative lookbehind

(?<!...)

which matches the following expression only if it is not preceded by a match of the pattern in the parentheses.

(?*...) was not in Python 2.4 but is a good choice if you are using a later version.
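The answer does not show how a lookbehind would be applied to the comment pattern, so the following is only a sketch of what a negative lookbehind does, on a made-up string. The names `text` and `positions` are illustrative, and this is not by itself a fix for the comment-stripping problem.

```python
import re

# Hypothetical sample: one plain slash and one slash that closes a comment.
text = "a/b */ c"

# (?<!\*) is a negative lookbehind: the slash matches only when the
# character immediately before it is not an asterisk.
positions = [m.start() for m in re.finditer(r'(?<!\*)/', text)]

print(positions)
```

Here only the slash in "a/b" is reported; the slash in "*/" is rejected because an asterisk precedes it.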
