Regex in Java interpreting a source.c file
I have to recognize some characters in a .c file. For now I have to recognize the #define line but I would like to exclude the comments after the definition. For example:
#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */
I get these results:
group1="KERNEL_VERSION"
group2="(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */"
I would like to get rid of /* We're doing kernel work */
I have tried everything, but I just can't get rid of it. Here is where I am:
Pattern cdef = Pattern.compile("^#[\\t ]*define[\\t ]+(\\w+)[\\t ]*(.*)",Pattern.DOTALL);
I have tried adding ^[\\/\\*\\w+]
or [\\t ]+^\\/+\\*\\w*\\
... at the end of the string, but either I lose the whole second group or it does nothing.
Thanks a lot.
EDIT: I would like to find a way to exclude a C comment, i.e. /* comment */, from the match.
EDIT 2: The way I see it, there should be a way to give the following instruction: "if you find "/", don't take anything else". I am reading the file line by line, so whatever comes after the / can be thrown away.
This is where I am handling the second group: "....(.*)". So I have tried adding ^[\/\] at the end of my string, but it doesn't work and I lose the whole second part.
You almost have it. Just add a group for the comment at the end of your pattern, like this:
(\\/\\*.*\\*\\/)
Complete test program:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestMain {
    public static void main(String[] args) {
        Pattern cdef = Pattern.compile("^#[\\t ]*define[\\t ]+(\\w+)[\\t ]*(.*)(\\/\\*.*\\*\\/)", Pattern.DOTALL);
        Matcher matcher = cdef
                .matcher("#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */");
        System.out.println(matcher.matches());
        for (int n = 0; n <= matcher.groupCount(); n++)
            System.out.println(matcher.group(n));
    }
}
Output:
true
#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */
KERNEL_VERSION
(a,b,c) ((a)*65536+(b)*256+(c))
/* We're doing kernel work */
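Note that the pattern above only matches when a comment is actually present on the line. A minimal sketch of a variant that makes the trailing comment optional might look like the following. It assumes one #define per line, no line terminator at the end of the input (as with readLine), and no "/*" inside string literals; the names DefineParser and parse are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DefineParser {
    // Group 1: macro name.
    // Group 2: rest of the definition; reluctant (.*?) so it stops before a trailing comment.
    // Group 3: optional /* ... */ comment at the end of the line.
    private static final Pattern CDEF = Pattern.compile(
            "^#[\\t ]*define[\\t ]+(\\w+)[\\t ]*(.*?)[\\t ]*(/\\*.*?\\*/)?[\\t ]*$",
            Pattern.DOTALL);

    /** Returns {name, body} without the trailing comment, or null if the line is not a #define. */
    static String[] parse(String line) {
        Matcher m = CDEF.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] r = parse(
                "#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c)) /* We're doing kernel work */");
        System.out.println(r[0]);
        System.out.println(r[1]);
    }
}
```

The reluctant quantifier is what keeps the comment out of group 2: the engine grows group 2 only as far as needed for the rest of the pattern (optional comment, then end of line) to match.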
To me, an easy way is to preprocess the source character by character and skip everything between /* and */, like this:
// don't take all literally, pseudocode below
while(!EOF)
{
// read next char
ReadChar();
// check for comment start
if(prevChar == '/' && curChar == '*')
{
// remove '/' from output
OutputContainer.RemoveLastChar();
while(!(prevChar == '*' && curChar == '/'))
{
// skip next char
SkipChar();
}
}
}
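The pseudocode above could be turned into Java roughly as follows. This is only a sketch: it assumes the whole source (or line) is already in a String, handles only /* ... */ block comments, and does not recognise comment delimiters inside string or character literals. The names CommentStripper and strip are hypothetical.

```java
class CommentStripper {
    /** Returns src with every block comment removed; an unterminated comment drops the rest of the input. */
    static String strip(String src) {
        StringBuilder out = new StringBuilder(src.length());
        boolean inComment = false;
        int i = 0;
        while (i < src.length()) {
            if (!inComment && src.startsWith("/*", i)) {
                // Comment start: skip both delimiter characters.
                inComment = true;
                i += 2;
            } else if (inComment && src.startsWith("*/", i)) {
                // Comment end: skip both delimiter characters.
                inComment = false;
                i += 2;
            } else {
                // Copy ordinary characters; silently skip characters inside a comment.
                if (!inComment) {
                    out.append(src.charAt(i));
                }
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("#define MAX 10 /* upper bound */"));
    }
}
```

Running the stripped text through the original #define pattern then leaves only trailing whitespace in the second group, which trim() can remove. Unlike a C preprocessor, this sketch deletes the comment outright rather than replacing it with a single space.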