Order of precedence for token matching in Flex
My apologies if the title of this thread is a little confusing. What I'm asking about is how does Flex (the lexical analyzer) handle issues of precedence?
For example, let's say I have two tokens with similar regular expressions, written in the following order:
"//"[!\/]{1} return FIRST;
"//"[!\/]{1}\< return SECOND;
Given the input "//!<", will FIRST or SECOND be returned? Or both?
The FIRST string would be reached before the SECOND string, but it seems 开发者_Python百科that returning SECOND would be the right behavior.The longest match is returned.
From flex & bison, Text Processing Tools:
How Flex Handles Ambiguous Patterns
Most flex programs are quite ambiguous, with multiple patterns that can match the same input. Flex resolves the ambiguity with two simple rules:
- Match the longest possible string every time the scanner matches input.
- In the case of a tie, use the pattern that appears first in the program.
You can test this yourself, of course:
file: demo.l
%%
"//"[!/] {printf("FIRST");}
"//"[!/]< {printf("SECOND");}
%%
int main(int argc, char **argv)
{
while(yylex() != 0);
return 0;
}
Note that /
and <
don't need escaping, and {1}
is redundant.
bart@hades:~/Programming/GNU-Flex-Bison/demo$ flex demo.l
bart@hades:~/Programming/GNU-Flex-Bison/demo$ cc lex.yy.c -lfl
bart@hades:~/Programming/GNU-Flex-Bison/demo$ ./a.out < in.txt
SECOND
where in.txt
contains //!<
.
精彩评论