Are regcomp and regexec more efficient than doing strncmp myself?

I have a string like this:

I am down in the town seeing a crown="larry" with a cherry="red"

I want to write a program that asks the user what she wants. If she requests strings that have "larry" as the crown and "red" as the cherry, I need to return this string.

Okay, I am oversimplifying the problem here. There can be many such strings, and I need to parse through them and return all that match.

Question: is it more efficient to use regcomp and regexec, or to break the string down and use strncmp?

PS: It seems that regexec would need to do some sort of comparison internally, and that comparison would presumably have been designed to be very efficient.


I think strncmp() is simply the wrong tool for the job; if you'd said strstr(), there might have been room for discussion. You can't use strncmp() easily because you first have to find a position at which to start comparing.

If you used strstr(), you'd be looking for strings such as:

crown="larry"
cherry="red"

If you use a regex, you have to compile it, and run it. If you are searching for the two strings, you have two regexes, unless you want to write a contorted regex. I think that for simple comparisons where you need both the strings above in either order, you might find two uses of strstr() quicker than one or two regexes.
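As a rough illustration of the two-strstr() approach, here is a minimal sketch. The function name line_has_both and its parameters are made up for this example; they are not from the question.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when `line` contains both needles, in either order.
   `line_has_both` is a hypothetical helper name used only for this sketch. */
static bool line_has_both(const char *line, const char *first, const char *second)
{
    return strstr(line, first) != NULL && strstr(line, second) != NULL;
}
```

You would call it as line_has_both(line, "crown=\"larry\"", "cherry=\"red\""). Note that a plain substring search also matches a key such as bigcrown="larry", so this only works if your keys cannot be suffixes of other keys.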

It is worth measuring the difference, though. It may depend on the implementation of strstr(); some are very good. So, run measurements on the platforms you are concerned with, and choose which works better for you.


Since you are probably compiling a new regex each time you call regexec(), that will probably be a bit slower than using strncmp() to check for the keyword, e.g. "crown=", and then checking whether the value is "\"larry\"".
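If you do go the regex route and the same request is checked against many strings, the compile cost can be paid once. A minimal sketch using the POSIX regex API; count_matches is a hypothetical helper name, and whether this beats the plain string functions still has to be measured:

```c
#include <regex.h>
#include <stddef.h>

/* Count the lines that match `pattern`, compiling the regex only once.
   Returns -1 if the pattern fails to compile.
   `count_matches` is a hypothetical helper for this sketch. */
static int count_matches(const char *pattern, const char *const lines[], size_t n)
{
    regex_t re;
    int hits = 0;

    /* REG_NOSUB: we only need a yes/no answer, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            hits++;
    }

    regfree(&re);
    return hits;
}
```

A pattern such as crown="larry".*cherry="red" only matches one ordering of the two pairs; accepting either order needs an alternation or a second regex.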

I assume you could build a system that parses the keywords and values beforehand and keeps some kind of list, dictionary or the like pointing to the string, or vice versa (each string is associated with a set of keyword="value" combinations). That could be done once, and would make the work during search easier.
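A minimal sketch of that idea, where each string is scanned once and its keyword="value" pairs are kept next to it. All names here (struct kv, struct parsed_line, parse_line, has_pair, MAX_PAIRS) are invented for the example, and it assumes keys are delimited by a space on the left and values contain no embedded quotes.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAIRS 8   /* arbitrary limit chosen for this sketch */

/* One key="value" pair; the pointers refer into the original line. */
struct kv {
    const char *key;  size_t key_len;
    const char *val;  size_t val_len;
};

struct parsed_line {
    const char *text;
    struct kv   pairs[MAX_PAIRS];
    size_t      count;
};

/* Scan `text` once and record every key="value" pair found in it. */
static void parse_line(struct parsed_line *out, const char *text)
{
    const char *eq = text;

    out->text = text;
    out->count = 0;

    while (out->count < MAX_PAIRS && (eq = strstr(eq, "=\"")) != NULL) {
        const char *val = eq + 2;
        const char *end = strchr(val, '"');
        const char *key = eq;

        if (end == NULL)
            break;                    /* unterminated value: stop parsing */

        while (key > text && key[-1] != ' ')
            key--;                    /* the key runs back to the previous space */

        out->pairs[out->count++] = (struct kv){ key, (size_t)(eq - key),
                                                val, (size_t)(end - val) };
        eq = end + 1;
    }
}

/* Return 1 if the parsed line holds key="value", else 0. */
static int has_pair(const struct parsed_line *pl, const char *key, const char *value)
{
    size_t klen = strlen(key), vlen = strlen(value);

    for (size_t i = 0; i < pl->count; i++) {
        const struct kv *p = &pl->pairs[i];
        if (p->key_len == klen && p->val_len == vlen &&
            memcmp(p->key, key, klen) == 0 &&
            memcmp(p->val, value, vlen) == 0)
            return 1;
    }
    return 0;
}
```

Each search then becomes a few short comparisons per string instead of a rescan of the whole text; a hash table keyed on the pair would be the next step if there are very many strings.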

But I don't know enough of your goals and your existing code to know if that makes sense for your situation.

In other words, you would have to profile this to be sure, but I guess that strncmp() would be more performant than the regcomp() and regexec() combination. Regular expressions are, of course, far more flexible, but I don't think you need that here.

Addition

Assuming that '=' is not a character that will be found in your lines very often, you can of course use strchr() to find each occurrence of '=' in the string, and then check whether the next character is '"'. Then you can scan backward to see if the key matches. A single strchr() pass is very likely a lot faster than calling strncmp() at every position.
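A minimal sketch of that strchr() scan. The name find_pair is hypothetical, and the sketch assumes a key is either at the start of the line or preceded by a space:

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `line` contains key="value", with `key` starting at the
   beginning of the line or right after a space.
   `find_pair` is a hypothetical helper for this sketch. */
static bool find_pair(const char *line, const char *key, const char *value)
{
    size_t klen = strlen(key), vlen = strlen(value);
    const char *eq = line;

    while ((eq = strchr(eq, '=')) != NULL) {
        size_t before = (size_t)(eq - line);      /* characters preceding '=' */

        if (eq[1] == '"' && before >= klen) {
            const char *k = eq - klen;            /* where the key would start */
            bool at_word_start = (k == line) || (k[-1] == ' ');

            if (at_word_start &&
                memcmp(k, key, klen) == 0 &&          /* key ends right at '=' */
                strncmp(eq + 2, value, vlen) == 0 &&  /* value follows the quote */
                eq[2 + vlen] == '"')                  /* and is closed right after */
                return true;
        }
        eq++;                                     /* continue after this '=' */
    }
    return false;
}
```

Checking both find_pair(line, "crown", "larry") and find_pair(line, "cherry", "red") gives the either-order match from the question.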
