How can I name and organize methods used by a finite state machine?

In the following code you'll see a simple lexer that conforms to the following regular expression:

 \d*(\.\d*)?([eE]([+-]\d+|\d+))?

If I were to use this design for something more complex, all of the anonymous delegates would be a nightmare to maintain. The biggest challenge I am facing is what to name the methods that would act as choice points in the state machine. In the variable exponentPart the last anonymous delegate passed to MatchOne will decide whether we have a signed integer, an integer, or a false match. Please post any ideas on how I can organize such a project assuming a complex language with lots of shared symbols.

static void Main(string[] args)
{
    var exponentPart =
        Lex.Start()
        .MatchOne(s => s.Continue(s.Current == 'e' || s.Current == 'E'))
        .MatchOne(
            s => // What would I name this?
            {
                if (char.IsDigit(s.Current))
                {
                    return Lex.Start().MatchZeroOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))(s.Continue(true));
                }
                else if (s.Current == '+' || s.Current == '-')
                {
                    return Lex.Start().MatchOneOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))(s.Continue(true));
                }
                else
                {
                    return s.RememberedState();
                }
            }
        );

    var fractionalPart =
        Lex.Start()
        .MatchOne(s => s.Continue(s.Current == '.'))
        .MatchOneOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))
        .Remember()
        .MatchOne(exponentPart);

    var decimalLiteral =
        Lex.Start()
        .MatchOneOrMore(s => s.Continue(char.IsDigit(s.Current)))
        .Remember()
        .MatchOne(
            s => // What would I name this?
            {
                if (s.Current == '.')
                {
                    return fractionalPart(s);
                }
                else if (s.Current == 'e' || s.Current == 'E')
                {
                    return exponentPart(s);
                }
                else
                {
                    return s.RememberedState();
                }
            }
        );

    var input = "999.999e+999";
    var result = decimalLiteral(new LexState(input, 0, 0, 0, true));

    Console.WriteLine(result.Value.Substring(result.StartIndex, result.EndIndex - result.StartIndex + 1));
    Console.ReadLine();
}


When trying to write some sort of parser, you should first divide your expression into rules and terminals. Then you can name the methods by the rules they check. For example, something along the lines of:

<literal> := <fractional> | <fractional_with_exponent>
<fractional> := \d*(\.\d*)?
<fractional_with_exponent> := <fractional><exponent>
<exponent> := [eE]([+-]\d+|\d+)

This would give you methods named Literal(), Fractional(), FractionalWithExponent() and Exponent(), each able to recognize or reject its own rule. Literal() would call Fractional() and FractionalWithExponent() and accept whichever one does not reject, and so on for the other rules.
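
As a rough illustration of that naming scheme, here is a minimal sketch with one method per grammar rule. It does not use the Lex/LexState API from the question. The NumberLexer class, the Digits helper, and the convention of returning the end index (or -1 to reject) are all assumptions made for this example, as is the choice to try the longer alternative first in Literal().

```csharp
using System;

// Hypothetical helper class for illustration; not part of the question's Lex API.
static class NumberLexer
{
    // Convention assumed here: each method takes a start index and returns
    // the index just past the match, or -1 if the rule rejects.

    // Consumes zero or more digits and returns the index after them.
    static int Digits(string s, int i)
    {
        while (i < s.Length && char.IsDigit(s[i])) i++;
        return i;
    }

    // <fractional> := \d*(\.\d*)?   (can match the empty string, so never rejects)
    public static int Fractional(string s, int i)
    {
        i = Digits(s, i);
        if (i < s.Length && s[i] == '.') i = Digits(s, i + 1);
        return i;
    }

    // <exponent> := [eE]([+-]\d+|\d+)
    public static int Exponent(string s, int i)
    {
        if (i >= s.Length || (s[i] != 'e' && s[i] != 'E')) return -1;
        i++;
        if (i < s.Length && (s[i] == '+' || s[i] == '-')) i++;
        int end = Digits(s, i);
        return end > i ? end : -1; // at least one digit is required
    }

    // <fractional_with_exponent> := <fractional><exponent>
    public static int FractionalWithExponent(string s, int i)
    {
        return Exponent(s, Fractional(s, i));
    }

    // <literal> := <fractional> | <fractional_with_exponent>
    // Tries the longer alternative first and falls back to the shorter one.
    public static int Literal(string s, int i)
    {
        int end = FractionalWithExponent(s, i);
        return end >= 0 ? end : Fractional(s, i);
    }
}
```

With this sketch, NumberLexer.Literal("999.999e+999", 0) returns 12, the length of the whole input, while NumberLexer.Literal("1e+", 0) returns 1 because the exponent rule rejects and only the leading digit is accepted. Each choice point now has a name taken from the grammar rule it decides, which is the part that was hard to name in the delegate-based version.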
