
Should I use/write a template lexer?

I'm using a PHP template engine that I wrote some time ago. It relies on regexes to create a cached PHP file. Some examples of the syntax:

{$foo} - regular variable
{$foo.bar} - variable foo that uses the array key 'bar'
{$foo|uppercase} - modifier 'uppercase' that takes 'foo' and applies some method to it

{iteration:users}
    Hi there {$users.name}
{/iteration: users}

The list goes on... There are quite a few nasty regexes involved in parsing all this. Note that an iteration can be nested inside another iteration, and so on.

Recently I've been seeing template engines like Twig and Smarty 3 that use a template lexer. I have a few questions about this:

- In general isn't the lexer way slower than using a few regexes to create a cached php template?
- Are there good resources on how to write your own lexer to interpret some sort of (template) language (I couldn't find anything I understand on Google)?
- Should I keep using regexes or is a lexer something worth exploring?


I suggest writing a parsing expression grammar (PEG); see this answer for a PEG library in PHP.

PEGs are much like regular expressions: they are greedy by nature and never ambiguous, which makes them a great fit for a domain-specific language (DSL).
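To make "greedy and never ambiguous" concrete, here is a minimal sketch of PEG-style parsing for the `{$foo.bar|uppercase}` syntax from the question. It is written in Python for brevity rather than PHP, and it is a hand-rolled illustration, not the API of any PEG library. The helper names (`regex`, `seq`, `choice`, `many`, `parse_variable`) are hypothetical.

```python
import re

# Each parser takes (text, pos) and returns (value, new_pos), or None on failure.

def regex(pattern):
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

def seq(*parsers):
    # Sequence: every part must match, one after the other.
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def choice(*parsers):
    # Ordered choice: the first alternative that matches wins, so there is
    # never more than one way to parse the input.
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser):
    # Greedy repetition: consume as many matches as possible.
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# Assumed grammar: variable <- "{$" name (key / modifier)* "}"
name = regex(r"[A-Za-z_]\w*")
key = seq(regex(r"\."), name)
modifier = seq(regex(r"\|"), name)
variable = seq(regex(r"\{\$"), name, many(choice(key, modifier)), regex(r"\}"))

def parse_variable(text):
    result = variable(text, 0)
    if result is None or result[1] != len(text):
        return None
    (_, base, suffixes, _), _ = result
    keys = [n for sym, n in suffixes if sym == "."]
    modifiers = [n for sym, n in suffixes if sym == "|"]
    return {"name": base, "keys": keys, "modifiers": modifiers}
```

Calling `parse_variable("{$foo.bar|uppercase}")` yields the name `foo`, the key `bar` and the modifier `uppercase`; anything that does not match the grammar returns `None`.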

In general isn't the lexer way slower than using a few regexes to create a cached php template?

No: the speed of regular expressions depends on the implementation of the regular expression engine. In general, a regular expression must itself be parsed before it can be used (unless the engine caches the compiled form), and the resulting model is then run by a general-purpose matcher that has to work for every possible regular expression.

With a lexer, you fine-tune the matcher: you get a specific matcher that only works for your predefined grammar. One gain is at startup: there is no regular expression to compile. Another gain is its lower complexity: a matcher specialized to one grammar tends to run faster.
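As a rough illustration of such a specific matcher, the sketch below scans a template once and emits tokens for the syntax shown in the question. It is written in Python for brevity rather than PHP. The token names (`TEXT`, `VAR`, `BEGIN_ITER`, `END_ITER`) and the function `tokenize` are assumptions made for this example, not taken from Twig or Smarty.

```python
def tokenize(source):
    """Split template source into (kind, value) tokens in a single pass."""
    tokens = []
    pos = 0
    while pos < len(source):
        start = source.find("{", pos)
        if start == -1:
            # No more tags: the rest is literal text.
            tokens.append(("TEXT", source[pos:]))
            break
        end = source.find("}", start)
        if end == -1:
            # Unterminated tag: treat the rest as literal text.
            tokens.append(("TEXT", source[pos:]))
            break
        if start > pos:
            tokens.append(("TEXT", source[pos:start]))
        body = source[start + 1:end].strip()
        if body.startswith("$"):
            tokens.append(("VAR", body[1:]))
        elif body.startswith("/iteration:"):
            tokens.append(("END_ITER", body[len("/iteration:"):].strip()))
        elif body.startswith("iteration:"):
            tokens.append(("BEGIN_ITER", body[len("iteration:"):].strip()))
        else:
            # Unknown brace content is passed through unchanged.
            tokens.append(("TEXT", source[start:end + 1]))
        pos = end + 1
    return tokens
```

A parser can then walk this flat token list and handle nested iterations with a stack, instead of trying to match nested blocks with regexes.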

Are there good resources on how to write your own lexer to interpret some sort of (template) language (I couldn't find anything I understand on google)?

Lexers are quite complex. To write your own you will need to know about state machines, regular grammars, and context-free and non-context-free grammars, among other things.

It takes some fundamental computer science knowledge before this becomes easy to grasp.

Should I keep using regexes or is a lexer something worth exploring?

Worth noting are the error-catching capabilities of well-engineered lexers (e.g. an error message such as "expected ;, but found ), on line 64:38.").
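Here is a minimal sketch of how that kind of position-aware error could be produced for the `{iteration:...}` blocks from the question. It is written in Python for brevity rather than PHP. For compactness it locates tags with a single regex, whereas a real lexer would record the position on each token as it scans. The names `TemplateSyntaxError`, `line_col` and `check_iterations` are hypothetical.

```python
import re

# Matches {iteration:name} and {/iteration:name}, allowing spaces around the name.
TAG = re.compile(r"\{(/?)iteration:\s*(\w+)\s*\}")

class TemplateSyntaxError(Exception):
    pass

def line_col(source, offset):
    """Convert a character offset into a 1-based (line, column) pair."""
    line = source.count("\n", 0, offset) + 1
    col = offset - (source.rfind("\n", 0, offset) + 1) + 1
    return line, col

def check_iterations(source):
    """Raise TemplateSyntaxError if iteration blocks are not properly nested."""
    stack = []
    for m in TAG.finditer(source):
        closing, name = m.group(1) == "/", m.group(2)
        line, col = line_col(source, m.start())
        if not closing:
            stack.append((name, line, col))
            continue
        if not stack:
            raise TemplateSyntaxError(
                f"unexpected {{/iteration:{name}}} on line {line}:{col}")
        open_name, _, _ = stack.pop()
        if open_name != name:
            raise TemplateSyntaxError(
                f"expected {{/iteration:{open_name}}}, but found "
                f"{{/iteration:{name}}}, on line {line}:{col}")
    if stack:
        name, line, col = stack[-1]
        raise TemplateSyntaxError(
            f"unclosed {{iteration:{name}}} opened on line {line}:{col}")
```

The stack keeps track of which block is currently open, so a mismatched or missing closing tag can be reported with the exact line and column where it was found.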
