Lex: How do I prevent it from matching against substrings?

For example, I'm supposed to convert "int" to "INT". But if there's the word "integer", I don't think it's supposed to turn into "INTeger".

If I define "int" printf("INT"); the substrings are matched though. Is there a way to prevent this from happening?


I believe the following captures what you want.

%{
#include <stdio.h>
%}

ws                      [\t\n ]

%%

{ws}int{ws}         { printf ("%cINT%c", *yytext, yytext[4]); }
.                       { printf ("%c", *yytext); }

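To treat characters other than whitespace as word boundaries, you will need to either extend the definition of ws or add more specific checks. Note that this rule consumes the whitespace on both sides, so an "int" at the very start or end of the input, or one that follows another converted "int" with only a single whitespace character in between, is left unchanged.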


Well, here's how I did it:

(("int"[a-zA-Z0-9]+)|([a-zA-Z0-9]+"int")) ECHO;
"int" printf("INT");

Better suggestions welcome.


Lex will choose the rule with the longest possible match for the current input; when two rules match the same length, the rule listed first wins. To avoid substring matches, you need an additional rule that matches the whole word whenever the word is longer than int. The easiest way to do this is to add a simple catch-all rule for any run of letters, i.e. [a-zA-Z]+, placed after the int rule. The entire lex program would look like this:

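%%

[\t ]+          /* skip whitespace */
int             { printf("INT"); }
[a-zA-Z]+       /* catch-all to avoid substring matches; discards the word (use ECHO; to keep it) */

%%

/* Report end of input, so the program links without the lex library. */
int yywrap(void)
   {
   return 1;
   }

int main(int argc, char *argv[])
   {
   yylex();
   return 0;
   }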
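
For readers who want to see the longest-match idea outside of Lex, here is a minimal hand-written C sketch of the same two rules (int and [a-zA-Z]+). It is only an illustration, not what Lex generates: the function name convert_int_words and its buffer interface are made up for this example. Unlike the program above, it copies whitespace and other words through instead of discarding them, and it relies on isalpha, which matches [a-zA-Z] in the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Lex): copies `in` to `out`, replacing
 * "int" with "INT" only when it is a complete run of letters.
 * Returns 0 on success, -1 if `out` is too small for the result.
 */
int convert_int_words(const char *in, char *out, size_t out_size)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        /* Longest match: take the whole run of letters, like [a-zA-Z]+. */
        size_t len = 0;
        while (isalpha((unsigned char)in[len]))
            len++;

        /* Not a letter: pass a single character through unchanged. */
        if (len == 0)
            len = 1;

        /* Leave room for the terminating '\0'. */
        if (o + len >= out_size)
            return -1;

        if (len == 3 && strncmp(in, "int", 3) == 0)
            memcpy(out + o, "INT", 3);  /* the run is exactly "int" */
        else
            memcpy(out + o, in, len);   /* longer word or other text */

        o += len;
        in += len;
    }

    if (o >= out_size)
        return -1;
    out[o] = '\0';
    return 0;
}
```

Called with the input "int integer print int", this fills the buffer with "INT integer print INT": the runs "integer" and "print" are consumed whole, so the "int" inside them is never looked at on its own. Digits are not part of a run here, so "int2" becomes "INT2", which is also what the int and [a-zA-Z]+ rules would do.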