Search backward through a string using a regex (in Python)?

2022-12-23 18:45

Context

I'm parsing some code and want to match the doxygen comments before a function. However, because I want to match for a specific function name, getting only the immediately previous comment is giving me problems.

Current Approach

import re

function_re = re.compile(
    r"\/\*\*(.+)\*\/\s*void\s+(\w+)\s*::\s*function_name\s*\(\s*\)\s*")
function_match = function_re.search(file_string)
if function_match:
    # Group 1 is the comment text; group 2 is the class name.
    function_doc_str = function_match.group(1)

Problem with Current Approach

The current approach matches doxygen from earlier functions, giving me a result that is the wrong doxygen comment.

Question

Is there a way to search backward through a string using the Python Regex library?

It seems like my problem is that the more restrictive (less frequently occurring) part of the pattern is the function signature, "void function()", and it comes after the comment.

Possible better question

Is there a better (easier) approach that I'm missing?

The simplest way is to just use a group; you don't need to go backwards:

 (commentRegex)functionRegex

Then just extract group 1. The comment pattern has to be able to match across line breaks; I don't know Python, so I can't be more specific about how to enable that.

It's also possible with lookahead assertions, but this way is simpler.

I think you should use a regex that only matches the doxygen documentation that's immediately before the function. Maybe something like this (simplified example):

import re

test = """

/**
    @doxygen comment
*/
void function()
{
}

"""

doxygenRegex = r"(?P<comment>/\*\*(?:[^/]|/(?!\*\*))*\*/)"
functionRegex = r"(?P<function>\s\w+\s+(?P<functionName>\w+)\s*\()"

match = re.search(doxygenRegex + functionRegex, test)
print(match.groupdict())

As long as this matches something, you can repeat the search in a loop, each time continuing from where the previous match ended, i.e. searching test[match.end():]. Hope that makes sense to you...
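The loop described above can also be sketched with re.finditer, which resumes after each match on its own, so no manual slicing is needed. The two patterns are the ones from this answer; the helper name find_documented_functions and the sample text are made up for illustration.

```python
import re

# Patterns from the answer above; everything else here is illustrative.
DOXYGEN = r"(?P<comment>/\*\*(?:[^/]|/(?!\*\*))*\*/)"
FUNCTION = r"(?P<function>\s\w+\s+(?P<functionName>\w+)\s*\()"
PAIR_RE = re.compile(DOXYGEN + FUNCTION)


def find_documented_functions(source):
    """Return (function name, comment) pairs for each comment+function match."""
    return [(m.group("functionName"), m.group("comment"))
            for m in PAIR_RE.finditer(source)]


sample = """
/** first */
void alpha()
{
}

/** second */
int beta()
{
}
"""

for name, comment in find_documented_functions(sample):
    print(name, "->", comment)
```

From the list of pairs it is then easy to pick the entry for one specific function name.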

BTW if you only want to extract the comment and nothing about the function, you can use lookahead - just replace functionRegex with r"(?=\s\w+\s+\w+\s*\()".
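A minimal sketch of the lookahead variant, assuming the same comment pattern as above. The sample text and the names comment_re and FOLLOWED_BY_FUNCTION are hypothetical. The match covers only the comment, because the lookahead checks for the function header without consuming it.

```python
import re

DOXYGEN = r"(?P<comment>/\*\*(?:[^/]|/(?!\*\*))*\*/)"
# Lookahead from the answer: require a function-like header, but do not consume it.
FOLLOWED_BY_FUNCTION = r"(?=\s\w+\s+\w+\s*\()"

comment_re = re.compile(DOXYGEN + FOLLOWED_BY_FUNCTION)

sample = """
/** file-level note */
int counter;

/** documents run */
void run()
{
}
"""

match = comment_re.search(sample)
# The first comment is skipped because a variable, not a function, follows it.
print(match.group("comment"))
```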

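This can be achieved using a single regex.

The key is to capture only the comment just before the desired function. The obvious attempt is a non-greedy quantifier, for example /\*\*(.*?)\*/, but that does not work here for two reasons. First, . does not match newlines unless you pass re.DOTALL (re.MULTILINE only changes the meaning of ^ and $). Second, even with re.DOTALL the search starts at the first /** in the text, and when the wanted function does not follow, the lazy .*? simply keeps expanding across */ and later comments until it does. So you need a comment body that cannot contain */ at all: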

/\*\*((?:[^\*]|\*(?!/))*)\*/.

This is to match:

1: the comment begin /**.

2: any character that is not *, OR a * that is not followed by /

3: the comment end */.
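To see why the comment body has to exclude */ instead of relying on a lazy .*?, the sketch below runs both patterns against a made-up two-function sample. The names lazy_re, strict_re, and signature are illustrative only.

```python
import re

sample = "/** one */ void f1() {} /** two */ void f2() {}"

# Hypothetical signature we are looking for.
signature = r"\s*void\s+f2\s*\("

# Lazy body: still runs across '*/' when the signature does not follow directly.
lazy_re = re.compile(r"/\*\*(.*?)\*/" + signature, re.DOTALL)
# Body that can never contain '*/', so it stays inside a single comment.
strict_re = re.compile(r"/\*\*((?:[^*]|\*(?!/))*)\*/" + signature)

lazy_body = lazy_re.search(sample).group(1)
strict_body = strict_re.search(sample).group(1)

print(repr(lazy_body))    # spans from the first comment up to the second
print(repr(strict_body))  # only the comment right before f2
```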

From this idea the code you want is:

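import re

function_name  = "function2"
# Comment body: anything except the closing "*/", so a match never spans two comments.
regex_comment  = r"/\*\*((?:[^\*]|\*(?!/))*)\*/"
# Optional "ClassName::" qualifier.
regex_static   = r"(?:(\w+)\s*::\s*)?"
regex_function = r"(\w+)\s+" + regex_static + "(?:" + re.escape(function_name) + r")\s*\([^\)]*\)"
regex = re.compile(regex_comment + r"\s*" + regex_function)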
text  = """
/**
    @doxygen comment1
*/
void test::function1()
{
}

/**
    @doxygen comment2
*/
void test::function2()
{
}
"""
match = regex.search(text)
if match is None:
    print("None")
else:
    print(match.group(1))

When run, you get:


    @doxygen comment2

Variation: If you want to capture /** and */ too, use regex_comment = r"(/\*\*(?:[^\*]|\*(?!/))*\*/)".

Hope this helps.

Note that C isn't a regular language, so it cannot be parsed by regular expressions. Have you considered leveraging doxygen itself to parse this file?

You can do look-behind assertions with (?<=...) or (?<!...) (in Python's re module the look-behind pattern must have a fixed width), but in general you can only match forwards.
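Although re only matches forwards, the effect of a backward search can be approximated by matching the function signature first and then looking backward from that position with plain string methods. The following is a rough sketch of that idea, not a full parser: comment_before and the sample text are hypothetical, and it assumes the comment is separated from the signature by whitespace only.

```python
import re


def comment_before(source, function_name):
    """Return the /** ... */ block directly before `function_name`, or None."""
    signature = re.compile(
        r"\w+\s+(?:\w+\s*::\s*)?" + re.escape(function_name) + r"\s*\(")
    match = signature.search(source)
    if match is None:
        return None
    # Everything before the signature, without trailing whitespace.
    head = source[:match.start()].rstrip()
    if not head.endswith("*/"):
        return None  # something other than a comment precedes the function
    # Search backward for the opening of that comment.
    start = head.rfind("/**")
    if start == -1:
        return None
    return head[start:]


text = """
/**
    @doxygen comment1
*/
void test::function1()
{
}

/**
    @doxygen comment2
*/
void test::function2()
{
}
"""

print(comment_before(text, "function2"))
```

Because the restrictive part (the signature) is matched first, earlier comments never enter the picture.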

The question is why these comments are not inside the function, where you could use __doc__.

But there is no easy way with regex.

Here's a non-regex approach: split on */ and check whether the function you are looking for is in the next item. For example:

test = """

/**
    @doxygen comment
*/
void function()
{
}

"""

parts = test.split("*/")
# Pair each chunk with the chunk that follows it.
for comment, following in zip(parts, parts[1:]):
    if "void" in following:
        print(comment)
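The same split idea can be narrowed to one specific function. This is only a sketch under simple assumptions: comment_for and the sample text are made up, the signature is compared as a literal prefix, and each chunk is assumed to contain the /** that opens its comment.

```python
def comment_for(source, signature):
    """Return the text of the comment whose following code starts with `signature`."""
    parts = source.split("*/")
    for comment, following in zip(parts, parts[1:]):
        if following.lstrip().startswith(signature):
            # Keep only what comes after the last '/**' in this chunk.
            return comment.rpartition("/**")[2].strip()
    return None


sample = "/** one */\nvoid f1()\n{\n}\n/** two */\nvoid f2()\n{\n}\n"

print(comment_for(sample, "void f2"))
```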
