Regex in Java interpreting a source.c file

I need to recognize certain constructs in a .c file. For now I need to recognize #define lines, but I would like to exclude any comment that follows the definition. For example:

#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */

I get these results:

group1="KERNEL_VERSION"
group2="(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */"

I would like to get rid of /* We're doing kernel work */

I have tried everything, but I just can't get rid of it. Here is where I am:

Pattern cdef = Pattern.compile("^#[\\t ]*define[\\t ]+(\\w+)[\\t ]*(.*)",Pattern.DOTALL);

I have tried adding ^[\\/\\*\\w+] or [\\t ]+^\\/+\\*\\w*\\ ..... at the end of the string, but either I lose the whole second group or it does nothing.

Thanks a lot,

EDIT: I would like to find a way to exclude a C comment, i.e. /* comment */, from what the pattern captures.

EDIT 2: The way I see it, there should be a way to give the following instruction: "if you find "/*", don't take anything else". I am reading the file line by line, so whatever comes after the /* can be thrown away.

This is where I am treating the second group: "....(.*)". So I have tried adding ^[\/\*] at the end of my string, but it doesn't work and I lose the whole second part.


You almost have it. Just add a group that matches the comment to the end of your pattern, like this:

(\\/\\*.*\\*\\/)

Complete test program:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestMain {
    public static void main(String[] args) {
        Pattern cdef = Pattern.compile("^#[\\t ]*define[\\t ]+(\\w+)[\\t ]*(.*)(\\/\\*.*\\*\\/)", Pattern.DOTALL);
        Matcher matcher = cdef
                .matcher("#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */");
        System.out.println(matcher.matches());
        for (int n = 0; n <= matcher.groupCount(); n++)
            System.out.println(matcher.group(n));
    }
}

Output:

true
#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */
KERNEL_VERSION
(a,b,c) ((a)*65536+(b)*256+(c)) 
/* We're doing kernel work */
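
Two caveats about the pattern above: it requires a comment, so a #define line without one does not match at all, and group 2 keeps the space before the comment. A possible variant is sketched below. It assumes the file is read line by line and that a line carries at most one /* ... */ comment, at the very end. Group 2 is made reluctant and the comment becomes an optional non-capturing group. The names DefineParser and parse are made up for this illustration; Pattern.DOTALL is dropped because a single line contains no line terminators.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DefineParser {
    // Group 1: macro name. Group 2: parameters and body, without the trailing comment.
    // (.*?) is reluctant, so it stops as soon as the remainder of the line is
    // only optional whitespace plus one optional /* ... */ comment.
    private static final Pattern CDEF = Pattern.compile(
            "^#[\\t ]*define[\\t ]+(\\w+)[\\t ]*(.*?)[\\t ]*(?:/\\*.*?\\*/)?[\\t ]*$");

    // Returns {name, rest} for a #define line, or null if the line is not a #define.
    static String[] parse(String line) {
        Matcher m = CDEF.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] r = parse(
                "#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */");
        System.out.println(r[0]); // KERNEL_VERSION
        System.out.println(r[1]); // (a,b,c) ((a)*65536+(b)*256+(c))
    }
}
```

A comment in the middle of the definition, or a "/*" inside a string literal, is not handled by this sketch; in the first case the comment simply stays in group 2.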


To me, an easy way is to preprocess the source char by char and skip everything between /* and */, like this:

// don't take this literally, it is pseudocode
while(!EOF)
{
    // read next char
    ReadChar();

    // check for comment start
    if(prevChar == '/' && curChar == '*')
    {
        // remove '/' from output
        OutputContainer.RemoveLastChar();
        // stop at the closing "*/", or at EOF if the comment is never closed
        while(!EOF && !(prevChar == '*' && curChar == '/'))
        {
            // skip next char
            SkipChar();
        }
    }
}
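
For reference, here is one way the same idea could look as runnable Java. This is only a sketch: the names CommentStripper and stripBlockComments are invented for the example, it works on a String rather than a stream, and it uses startsWith/indexOf instead of tracking prevChar and curChar. It does not know about string literals or // comments, and an unterminated comment swallows the rest of the input.

```java
public class CommentStripper {
    // Copies src to the output, skipping everything between "/*" and "*/".
    static String stripBlockComments(String src) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith("/*", i)) {
                // Comment start: jump past the closing "*/".
                // The search begins after "/*", so "/*/" does not close itself.
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? src.length() : end + 2;
            } else {
                // Ordinary character: copy it through.
                out.append(src.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String line =
                "#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */";
        // The space before the comment is kept, hence the trim().
        System.out.println(stripBlockComments(line).trim());
    }
}
```

Applied to the whole file contents instead of a single line, this also removes comments that span several lines, which a line-by-line regex cannot do. The stripped text can then be matched with the original two-group pattern.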