Order of precedence for token matching in Flex

My apologies if the title of this thread is a little confusing. What I'm asking is how Flex (the lexical analyzer) decides which rule wins when more than one pattern matches the same input.

For example, let's say I have two tokens with similar regular expressions, written in the following order:

"//"[!\/]{1}    return FIRST;
"//"[!\/]{1}\<  return SECOND;

Given the input "//!<", will FIRST or SECOND be returned? Or both?

The FIRST string would be reached before the SECOND string, but it seems that returning SECOND would be the right behavior.


The longest match is returned.

From flex & bison, Text Processing Tools:

How Flex Handles Ambiguous Patterns

Most flex programs are quite ambiguous, with multiple patterns that can match the same input. Flex resolves the ambiguity with two simple rules:

  • Match the longest possible string every time the scanner matches input.
  • In the case of a tie, use the pattern that appears first in the program.
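To make the two rules concrete, here is a minimal sketch in plain C of the selection logic they describe. It is not how flex is implemented (flex compiles all patterns into a single automaton); the hand-written matchers, the names match_first, match_second, match_third and pick_rule, and the extra third pattern "//!" are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Each hypothetical matcher returns how many characters its pattern
   consumes at the start of s, or 0 if the pattern does not match. */

/* Rule 0: "//"[!/] */
static size_t match_first(const char *s) {
    if (strncmp(s, "//", 2) != 0) return 0;
    return (s[2] == '!' || s[2] == '/') ? 3 : 0;
}

/* Rule 1: "//"[!/]< */
static size_t match_second(const char *s) {
    return (match_first(s) == 3 && s[3] == '<') ? 4 : 0;
}

/* Rule 2: "//!" (illustration only, added to show a tie with rule 0) */
static size_t match_third(const char *s) {
    return strncmp(s, "//!", 3) == 0 ? 3 : 0;
}

typedef size_t (*matcher_fn)(const char *);

/* Returns the index of the winning rule, or -1 if no rule matches.
   The matched length is stored in *len. */
static int pick_rule(const char *input, size_t *len) {
    static const matcher_fn rules[] = { match_first, match_second, match_third };
    int best = -1;
    size_t best_len = 0;

    assert(input != NULL && len != NULL);
    for (int i = 0; i < 3; i++) {
        size_t n = rules[i](input);
        /* Strictly greater: a longer match replaces the current winner,
           while an equal-length match leaves the earlier rule in place. */
        if (n > best_len) {
            best_len = n;
            best = i;
        }
    }
    *len = best_len;
    return best;
}
```

Under these assumptions, pick_rule("//!<", &len) selects rule 1 with a length of 4 because it is the longest match, while pick_rule("//!x", &len) selects rule 0 because rules 0 and 2 tie at length 3 and rule 0 is listed first.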

You can test this yourself, of course:

file: demo.l

%%
"//"[!/]   {printf("FIRST");}
"//"[!/]<  {printf("SECOND");}
%%

int main(int argc, char **argv)
{
    while(yylex() != 0);
    return 0;
}

Note that / (inside a character class) and < (in the middle of a pattern) don't need escaping here, and {1} is redundant.

bart@hades:~/Programming/GNU-Flex-Bison/demo$ flex demo.l 
bart@hades:~/Programming/GNU-Flex-Bison/demo$ cc lex.yy.c  -lfl
bart@hades:~/Programming/GNU-Flex-Bison/demo$ ./a.out < in.txt 
SECOND

where in.txt contains //!<.
