Parsing Strings/Tokens

I was wondering what the most efficient way of parsing strings would be for protocols like HTTP, FTP, SMTP, IMAP, IRC, etc., where communication is done by sending information to a server and reading the response.

For example, let's say I would like to parse a typical IRC message.

    PING irc.example.com

What I am doing right now is dividing the response string into tokens, and iterating through them. If the token is "PING", my program calls the pong function. However, at the moment, "parsing" these strings merely consists of a bunch of strcmp()s.

I am curious about any alternative, more efficient ways of 'parsing' this data (I was thinking of something like a Map keyed by token, so my program can easily look up what to do for each one).
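For reference, the Map idea can be approximated in plain C with a sorted table of command names searched by `bsearch()`. This is only a minimal sketch: the handler names, their return codes, and the 32-byte command buffer are assumptions made for illustration, not part of any real IRC library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler type: receives the rest of the line after the command. */
typedef int (*handler_fn)(const char *args);

struct command {
    const char *name;
    handler_fn  handler;
};

/* Placeholder handlers; each returns a distinct code so the dispatch is observable. */
static int on_join(const char *args)    { (void)args; return 1; }
static int on_ping(const char *args)    { (void)args; return 2; }
static int on_privmsg(const char *args) { (void)args; return 3; }

/* Must stay sorted by name (strcmp order) for bsearch to work. */
static const struct command commands[] = {
    { "JOIN",    on_join },
    { "PING",    on_ping },
    { "PRIVMSG", on_privmsg },
};

static int compare_command(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct command *)elem)->name);
}

/* Splits "COMMAND args" at the first space and dispatches.
   Returns the handler's result, or -1 for an unknown or oversized command. */
int dispatch(const char *line)
{
    assert(line != NULL);

    char name[32];
    size_t len = strcspn(line, " ");
    if (len == 0 || len >= sizeof name)
        return -1;
    memcpy(name, line, len);
    name[len] = '\0';

    const struct command *cmd = bsearch(name, commands,
                                        sizeof commands / sizeof commands[0],
                                        sizeof commands[0], compare_command);
    if (cmd == NULL)
        return -1;

    const char *args = line + len;
    if (*args == ' ')
        args++;
    return cmd->handler(args);
}
```

With this table, `dispatch("PING irc.example.com")` calls `on_ping` with `"irc.example.com"`. The lookup costs O(log n) string comparisons instead of one `strcmp()` per known command, though for a handful of commands the difference is unlikely to be measurable.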


Define a grammar for it, or simply build an automaton that detects your tokens. Example in this post.


Depending on how much you want to support, you've got a few options. At the first level is simple tokenizing, like what you're doing now. This only works for a very limited set of commands. Next up are regular expressions, which may give you a bit more flexibility. Finally, there is a full-blown grammar, as suggested above, which would allow for the greatest flexibility.

Each of these options is more complex than the one before it.
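As a rough illustration of the middle option, the sketch below uses the POSIX `regcomp()`/`regexec()` API to pull the command word out of an IRC-style line. The pattern is an assumption that only approximates the message format (an optional `:prefix`, then a command made of letters or three digits, then optional parameters), and `extract_command` is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Copies the command word of an IRC-style line into out.
   Returns 0 on success, -1 if the line does not match or out is too small. */
int extract_command(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);

    /* Assumed pattern: optional ":prefix ", a command, optional parameters. */
    static const char *pattern =
        "^(:[^ ]+ +)?([A-Za-z]+|[0-9][0-9][0-9])( .*)?$";
    regex_t re;
    regmatch_t m[3];   /* m[0] whole match, m[1] prefix, m[2] command */
    int rc = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 3, m, 0) == 0 && m[2].rm_so != -1) {
        size_t len = (size_t)(m[2].rm_eo - m[2].rm_so);
        if (len < out_size) {
            memcpy(out, line + m[2].rm_so, len);
            out[len] = '\0';
            rc = 0;
        }
    }

    regfree(&re);
    return rc;
}
```

Compiling the pattern on every call keeps the sketch short; a real program would more likely compile it once at startup and reuse the `regex_t`. The extracted command could then be handed to whatever lookup the program already uses.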
