Equation parser efficiency
I sunk about a month of full time into a native C++ equation parser. It works, except it is slow (between 30-100 times slower than a hard-coded equation). What can I change to make it faster?
I read everything I could find on efficient code. In broad strokes:
- The parser converts a string equation expression into a list of "operation" objects.
- An operation object has two function pointers: a "getSource" and a "evaluate".
- To evaluate an equation, all I do is a for loop on the operation list, calling each function in turn.
There isn't a single if / switch encountered when evaluating an equation - all conditionals are handled by the parser when it originally assigned the function pointers.
- I tried inlining all the functions to which the function pointers point - no improvement.
- Would switching from function pointers to functors help?
- How about removing the function pointer framework, and instead creating a full set of derived "operation" classes, each with its own virtual "getSource" and "evaluate" functions? (But doesn't this just move the function pointers into the vtable?)
I have a lot of code. Not sure what to distill / post. Ask for some aspect of i开发者_运维技巧t, and ye shall receive.
In your post you don't mention that you have profiled the code. This is the first thing I would do if I were in your shoes. It'll give you a good idea of where the time is spent and where to focus your optimization efforts.
It's hard to tell from your description if the slowness includes parsing, or it is just the interpretation time.
The parser, if you write it as recursive-descent (LL1) should be I/O bound. In other words, the reading of characters by the parser, and construction of your parse tree, should take a lot less time than it takes to simply read the file into a buffer.
The interpretation is another matter. The speed differential between interpreted and compiled code is usually 10-100 times slower, unless the basic operations themselves are lengthy. That said, you can still optimize it.
You could profile, but in such a simple case, you could also just single-step the program, in the debugger, at the level of individual instructions. That way, you are "walking in the computer's shoes" and it will be obvious what can be improved.
Whenever I'm doing what you're doing, that is, providing a language to the user, but I want the language to have fast execution, what I do is this: I translate the source language into a language I have a compiler for, and then compile it on-the-fly into a .dll (or .exe) and run that. It's very quick, and I don't need to write an interpreter or worry about how fast it is.
The very first thing is: Profile what actually went wrong. Is the bottleneck in parsing or in evaluation? valgrind offers some tools that can help you here.
If it's in parsing, boost::spirit might help you. If in evaluation, remember that virtual functions can be pretty slow to evaluate. I've made pretty good experiences with recursive boost::variant's.
You know, building an expression recursive descent parser is really easy, the LL(1) grammar for expressions is only a couple of rules. Parsing then becomes a linear affair and everything else can work on the expression tree (while parsing basically); you'd collect the data from the lower nodes and pass it up to the higher nodes for aggregation.
This would avoid altogether function/class pointers to determine the call path at runtime, relying instead of proven recursivity (or you can build an iterative LL parser if you wish).
It seems that you're using a quite complicated data structure (as I understand it, a syntax tree with pointers etc.). Thus, walking through pointer dereference is not very efficient memory-wise (lots of random accesses) and could slow you down significantly. As Mike Dunlavey proposed, you could compile the whole expression at runtime using another language or by embedding a compiler (such as LLVM). For what I know, Microsoft .Net provides this feature (dynamic compilation) with Reflection.Emit and Linq.Expression trees.
This is one of those rare times that I'd advise against profiling just yet. My immediate guess is that the basic structure you're using is the real source of the problem. Profiling the code is rarely worth much until you're reasonably certain the basic structure is reasonable, and it's mostly a matter of finding which parts of that basic structure can be improved. It's not so useful when what you really need to do is throw out most of what you have, and basically start over.
I'd advise converting the input to RPN. To execute this, the only data structure you need is a stack. Basically, when you get to an operand, you push it on the stack. When you encounter an operator, it operates on the items at the top of the stack. When you're done evaluating a well-formed expression, you should have exactly one item on the stack, which is the value of the expression.
Just about the only thing that will usually give better performance than this is to do like @Mike Dunlavey advised, and just generate source code and run it through a "real" compiler. That is, however, a fairly "heavy" solution. If you really need maximum speed, it's clearly the best solution -- but if you just want to improve what you're doing now, converting to RPN and interpreting that will usually give a pretty decent speed improvement for a small amount of code.
精彩评论