Write a regular expression for C numerical literals

My homework is to write a regular expression representing the language of numerical literals from the C programming language. I can use l for letter, d for digit, a for +, m for -, and p for point. Assume that there are no limits on the number of consecutive digits in any part of the expression.

Some of the examples of valid numerical literals were 13., .328, 41.16, +45.80, -2.e+7, -.4E-7, 01E-06, +0

I came up with: (d+p+a+m)(d+p+E+e+a+m)*

Update 2: (l+d+p+a+m)(d+p+((E+e)(a+m+d)d*))*

I'm not sure how to prevent something like 1.0.0.0eee-e1.


Your regular expression does not support the various suffixes (l, u, f, etc.), nor does it support hexadecimal or octal constants.

The leading signs (+ or - in front of the number) are not lexically part of the constant; they are the unary + and - operators. Effectively, all integer and floating constants are positive.
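
A minimal Python sketch of that point, assuming a toy tokenizer (the `tokenize` helper and its token pattern are made up for illustration and are nowhere near a full C lexer): the sign comes out as its own operator token, and the constant itself is unsigned.

```python
import re

# Toy token pattern (an assumption for illustration only): an unsigned
# decimal constant, or a single + / - operator. Leading whitespace is skipped.
TOKEN = re.compile(r"\s*(?:(?P<const>\d+\.?\d*|\.\d+)|(?P<op>[+-]))")

def tokenize(text):
    """Split text into ('const', lexeme) and ('op', lexeme) pairs."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        # lastgroup names whichever alternative matched: 'const' or 'op'.
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

print(tokenize("-45.8"))   # [('op', '-'), ('const', '45.8')]
print(tokenize("3 - -2"))  # [('const', '3'), ('op', '-'), ('op', '-'), ('const', '2')]
```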

If you need to fully support C99 floating constants, you need to support hexadecimal exponents (p instead of e).
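
For illustration, here is a rough Python sketch of a pattern for C99 hexadecimal floating constants. The name `HEX_FLOAT` is my own, and the pattern reflects my reading of the grammar: a `0x` prefix, hex digits with an optional point, a mandatory `p` exponent written in decimal, and an optional `f`/`l` suffix.

```python
import re

# Sketch only: hex prefix, hex mantissa with at most one point,
# mandatory binary exponent (p/P), optional floating suffix.
HEX_FLOAT = re.compile(
    r"0[xX]"
    r"(?:[0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)"
    r"[pP][+-]?[0-9]+"
    r"[fFlL]?"
)

for text in ["0x1.8p3", "0x.8p-1", "0x1p0", "0x1.p+2f", "0x1.8", "0x1.8e3"]:
    print(text, bool(HEX_FLOAT.fullmatch(text)))

# Python can evaluate the same notation: 1.5 * 2**3
print(float.fromhex("0x1.8p3"))  # 12.0
```

The last two strings in the loop are rejected: without a `p` exponent a hexadecimal floating constant is not valid, and `e` is just another hex digit here.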

Your regular expression also accepts many invalid sequences of characters, like 1.0.0.0eee-e1.

A single regular expression to match all C integer and floating literals would be quite long.
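
For the homework subset (signed decimal literals, no suffixes), the way to rule out strings like 1.0.0.0eee-e1 is to give the expression structure instead of one big starred group: an optional sign, then digits with at most one point, then at most one exponent. In the question's notation that would be roughly (a+m+ε)(dd*(p+ε)d* + pdd*)(ε + (E+e)(a+m+ε)dd*), assuming ε for the empty string is allowed. Here is a sketch of the same shape in Python; `DECIMAL` and `is_literal` are names I made up.

```python
import re

DECIMAL = re.compile(r"""
    [+-]?                        # optional sign (part of the homework examples)
    (?: \d+ \.? \d* | \. \d+ )   # at most one point, at least one digit
    (?: [eE] [+-]? \d+ )?        # at most one exponent, digits required
""", re.VERBOSE)

def is_literal(text):
    """True if the whole string has the shape described above."""
    return DECIMAL.fullmatch(text) is not None

valid = ["13.", ".328", "41.16", "+45.80", "-2.e+7", "-.4E-7", "01E-06", "+0"]
invalid = ["1.0.0.0eee-e1", ".", "e5", "1e", "+-1"]

print(all(is_literal(s) for s in valid))    # True
print(any(is_literal(s) for s in invalid))  # False
```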


Untested, but this should be along the right lines for decimal at least. (Also, it accepts the string ".", or I think it does anyway; to fix that would eliminate the last of the common code between integer and FP, the leading [0-9]*.)

[0-9]*([0-9]([uU](ll?+LL?)+(ll?+LL?)?[uU]?)+(\.[0-9]*)?([eE][+-]?[0-9]+)[fFlL])
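
One possible way to spell that idea out in Python's `re` syntax, as a sketch rather than a complete answer. It assumes unsigned decimal constants only (octal forms such as 012 and all hexadecimal forms are left out), and the names `INT_SUFFIX`, `DECIMAL_INT`, `DECIMAL_FLOAT` and `CONSTANT` are mine.

```python
import re

# u/U combined with l, ll, L or LL in either order; mixed-case "lL" is excluded.
INT_SUFFIX = r"(?:[uU](?:ll?|LL?)?|(?:ll?|LL?)[uU]?)"

# Decimal integer: nonzero leading digit, or a lone 0, plus optional suffix.
DECIMAL_INT = rf"(?:[1-9][0-9]*|0){INT_SUFFIX}?"

# Decimal float: a point with digits on at least one side and an optional
# exponent, or bare digits with a mandatory exponent; then optional f/F/l/L.
DECIMAL_FLOAT = (
    r"(?:(?:[0-9]*\.[0-9]+|[0-9]+\.)(?:[eE][+-]?[0-9]+)?"
    r"|[0-9]+[eE][+-]?[0-9]+)[fFlL]?"
)

CONSTANT = re.compile(rf"(?:{DECIMAL_FLOAT}|{DECIMAL_INT})")

accepted = ["42", "42u", "42UL", "42llu", "0", "1.5f", "13.", ".328", "01E-06", "2.e+7"]
rejected = [".", "42lul", "42lL", "42f", "1.5u", "1e", "1.0.0", "012"]

print(all(CONSTANT.fullmatch(s) for s in accepted))   # True
print(any(CONSTANT.fullmatch(s) for s in rejected))   # False
```

Keeping the integer and floating alternatives separate costs some duplication, but it avoids accepting a lone "." and keeps the integer suffixes away from the floating forms.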


This regex should match everything you need:

  [+-]?(?P<Dot1>\.)?\d+(?(Dot1)(?#if_dot_exists_at_the_beginning__do_nothing)|(?#if_dot_does_not_exist_yet__accept_an_optional_dot_now)(?P<Dot2>\.)?)\d*(?P<Exp>[Ee])?(?(Exp)[+-]?\d+)