How can bytecode be used to optimize the execution time of dynamic languages?

2023-01-13 05:20

I am interested in optimization methods or general bytecode designs that might help speed up execution using a VM, in comparison to interpreting an AST.

The main win of bytecode over AST interpretation is operation dispatch cost; for highly optimised interpreters, dispatch becomes a real bottleneck. "Dispatch" is the term for the overhead required to start executing an operation (such as arithmetic or a property access).

A fairly normal AST based interpreter would look something like this:

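class ASTNode {
public:
    virtual ~ASTNode() = default;
    virtual double execute() = 0;
};

class NumberNode : public ASTNode {
public:
    double execute() override { return m_value; }
    double m_value;
};

class AddNode : public ASTNode {
public:
    double execute() override { return left->execute() + right->execute(); }
    ASTNode* left;   // operands of the addition
    ASTNode* right;
};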

So executing the code for something as simple as 1+1 requires 3 virtual calls: one for the AddNode and one for each NumberNode. Virtual calls are very expensive (in the grand scheme of things) because of the multiple indirections needed to make the call, on top of the general cost of making a call in the first place.
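To make the count concrete, here is a minimal sketch that builds the tree for 1+1 and counts how often `execute()` is entered. The counter `g_execute_calls` and the helper `evaluate_one_plus_one` are hypothetical names introduced only for this illustration, and an optimising compiler may devirtualize some of these calls in such a tiny program.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical counter, used only to make the number of execute() calls visible.
static int g_execute_calls = 0;

struct ASTNode {
    virtual ~ASTNode() = default;
    virtual double execute() = 0;
};

struct NumberNode : ASTNode {
    explicit NumberNode(double value) : m_value(value) {}
    double execute() override {
        ++g_execute_calls;
        return m_value;
    }
    double m_value;
};

struct AddNode : ASTNode {
    AddNode(std::unique_ptr<ASTNode> l, std::unique_ptr<ASTNode> r)
        : left(std::move(l)), right(std::move(r)) {}
    double execute() override {
        ++g_execute_calls;
        return left->execute() + right->execute();
    }
    std::unique_ptr<ASTNode> left;
    std::unique_ptr<ASTNode> right;
};

// Builds the tree for 1+1, evaluates it through the base type, and returns the result.
double evaluate_one_plus_one() {
    g_execute_calls = 0;
    AddNode root(std::make_unique<NumberNode>(1.0),
                 std::make_unique<NumberNode>(1.0));
    ASTNode& node = root;  // call through the base type, as an interpreter would
    const double result = node.execute();
    assert(g_execute_calls == 3);  // one AddNode plus two NumberNodes
    return result;
}
```

Calling `evaluate_one_plus_one()` returns 2 and leaves `g_execute_calls` at 3, matching the three calls described above.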

In a bytecode interpreter you have a different dispatch model: rather than virtual calls you have an execution loop, akin to:

while (1) {
    op = *pc++;  // fetch the next instruction (pc: assumed instruction pointer)
    switch (op.type) {
        case op_add:
            // Efficient interpreters use "registers" rather than
            // a stack these days, but the example code would be more
            // complicated
            push(pop() + pop());
            continue;
        case op_end:
            return pop();
    }
}

This still has a reasonably expensive dispatch cost compared with native code, but it is much faster than virtual dispatch. You can improve performance further with a GCC extension, "labels as values" (commonly called "computed goto", and also supported by Clang), which lets you remove the switch dispatch and reduces the total dispatch cost to essentially a single indirect branch.
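As a rough sketch of the computed-goto idea, the loop below dispatches through a table of label addresses when compiled with GCC or Clang, and falls back to an ordinary switch elsewhere. The opcode names, the `run_bytecode` function, and the `OP_PUSH1` instruction (which pushes the constant 1) are assumptions made for this example, and the code assumes well-formed bytecode that ends with `OP_END`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical opcode set for this sketch; the order must match the label table below.
enum Opcode : unsigned char { OP_PUSH1, OP_ADD, OP_END };

double run_bytecode(const std::vector<unsigned char>& code) {
    std::vector<double> stack;
    const unsigned char* pc = code.data();

    auto pop = [&stack]() {
        assert(!stack.empty());
        double v = stack.back();
        stack.pop_back();
        return v;
    };

#if defined(__GNUC__)
    // Labels as values: each instruction ends with one table lookup
    // followed by one indirect jump, with no central switch.
    static void* const targets[] = { &&do_push1, &&do_add, &&do_end };
    #define DISPATCH() goto *targets[*pc++]

    DISPATCH();
do_push1:
    stack.push_back(1.0);
    DISPATCH();
do_add: {
        double rhs = pop();
        double lhs = pop();
        stack.push_back(lhs + rhs);
    }
    DISPATCH();
do_end:
    return pop();

    #undef DISPATCH
#else
    // Portable fallback: the classic switch-based dispatch loop.
    for (;;) {
        switch (*pc++) {
            case OP_PUSH1:
                stack.push_back(1.0);
                break;
            case OP_ADD: {
                double rhs = pop();
                double lhs = pop();
                stack.push_back(lhs + rhs);
                break;
            }
            case OP_END:
                return pop();
        }
    }
#endif
}
```

For instance, `run_bytecode({OP_PUSH1, OP_PUSH1, OP_ADD, OP_END})` evaluates 1+1 and returns 2. Whether the computed-goto variant is measurably faster depends on the compiler and CPU, so it is worth benchmarking both paths.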

In addition to lower dispatch costs, bytecode-based interpreters have a number of other advantages over AST interpreters, mostly because bytecode can jump "directly" to other locations, as a real machine would. For example, imagine a snippet of code like this:

while (1) {
    ...statements...
    if (a)
        break;
    else
        continue;
}

To implement this correctly in an AST interpreter, every statement that is executed has to indicate whether execution should stay in the loop or stop, so the execution loop becomes something like:

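bool running = true;
while (running && condition->execute() == true) {
    for (i = 0; i < statements->length(); i++) {
        result = statements[i]->execute();
        if (result.type == BREAK) {
            running = false;  // leave the outer loop as well
            break;
        }
        if (result.type == CONTINUE)
            break;  // skip the remaining statements, then re-test the condition
    }
}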

As you add more forms of flow control, this signalling becomes more and more expensive. Once you add exceptions (i.e. flow control that can happen anywhere), you need to check for these signals in the middle of even basic arithmetic, leading to ever-increasing overhead. If you want to see this in the real world, I encourage you to look at the ECMAScript spec, which describes the execution model in terms of an AST interpreter.

In a bytecode interpreter these problems basically go away, because the bytecode expresses control flow directly rather than indirectly through signalling. For example, continue is simply converted into a jump instruction, and you only pay that cost if it is actually hit.
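A minimal sketch of that idea follows: a loop of the form `while (1) { iterations++; n--; if (n == 0) break; else continue; }` is hand-compiled into bytecode where break and continue are plain jumps. The instruction set, the `Instr` layout, and `run_loop` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction set for this sketch.
enum LoopOp { L_INC, L_DEC, L_LOAD, L_JUMP, L_JUMP_IF_ZERO, L_END };

struct Instr {
    LoopOp op;
    int arg;  // local slot index, or jump target, depending on the opcode
};

// Runs the hand-compiled loop with local 0 = n and local 1 = iterations,
// and returns the number of iterations executed. Assumes n >= 1.
int run_loop(int n) {
    assert(n >= 1);
    const std::vector<Instr> code = {
        {L_INC, 1},           // 0: iterations++
        {L_DEC, 0},           // 1: n--
        {L_LOAD, 0},          // 2: push n
        {L_JUMP_IF_ZERO, 5},  // 3: break    -> jump past the loop
        {L_JUMP, 0},          // 4: continue -> jump back to the loop head
        {L_END, 0},           // 5: return iterations
    };

    int locals[2] = {n, 0};
    std::vector<int> stack;
    std::size_t pc = 0;

    for (;;) {
        const Instr in = code[pc++];
        switch (in.op) {
            case L_INC:
                ++locals[in.arg];
                break;
            case L_DEC:
                --locals[in.arg];
                break;
            case L_LOAD:
                stack.push_back(locals[in.arg]);
                break;
            case L_JUMP:
                pc = static_cast<std::size_t>(in.arg);
                break;
            case L_JUMP_IF_ZERO: {
                assert(!stack.empty());
                const int value = stack.back();
                stack.pop_back();
                if (value == 0)
                    pc = static_cast<std::size_t>(in.arg);
                break;
            }
            case L_END:
                return locals[1];
        }
    }
}
```

Note that no instruction returns a BREAK or CONTINUE status to its caller; the only work done for either is a single assignment to `pc`. With `n` set to 5, `run_loop(5)` runs the body five times and returns 5.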

Finally, an AST interpreter is by definition recursive, so it has to be prevented from overflowing the system stack, which puts very heavy restrictions on how deeply your code can nest or recurse. Something like:

1+(1+(1+(1+(1+(1+(1+(1+1)))))))

That expression has at least 8 levels of recursion in the interpreter, which can be a very significant cost. Older versions of Safari (pre-SquirrelFish) used an AST interpreter, and for this reason JS was allowed only a couple of hundred levels of recursion, versus the thousands allowed in modern browsers.
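By contrast, a bytecode loop can evaluate the same kind of right-nested expression without any native recursion, because the pending operands live on a heap-allocated operand stack. The sketch below assumes a hypothetical three-instruction set and emits the postfix bytecode directly; `emit_nested_sum` and `run_flat` are illustrative names, and a real compiler that walks an AST to produce this bytecode would still need to manage its own recursion depth.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode set for this sketch.
enum NestedOp : unsigned char { N_PUSH1, N_ADD, N_END };

// Emits postfix bytecode for 1+(1+(...(1+1))) containing `additions` plus signs:
// additions+1 pushes of the constant 1, then `additions` adds, then N_END.
std::vector<unsigned char> emit_nested_sum(int additions) {
    assert(additions >= 0);
    const std::size_t count = static_cast<std::size_t>(additions);
    std::vector<unsigned char> code(count + 1, N_PUSH1);
    code.insert(code.end(), count, N_ADD);
    code.push_back(N_END);
    return code;
}

// Evaluates the bytecode in a flat loop; the native call depth stays constant
// no matter how deeply the source expression was nested.
double run_flat(const std::vector<unsigned char>& code) {
    std::vector<double> stack;  // operand stack, allocated on the heap
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        switch (code[pc]) {
            case N_PUSH1:
                stack.push_back(1.0);
                break;
            case N_ADD: {
                assert(stack.size() >= 2);
                const double rhs = stack.back();
                stack.pop_back();
                stack.back() += rhs;
                break;
            }
            case N_END:
                assert(stack.size() == 1);
                return stack.back();
        }
    }
    assert(false && "bytecode must end with N_END");
    return 0.0;
}
```

`run_flat(emit_nested_sum(8))` evaluates the expression shown above and returns 9. Raising the nesting to 100000 additions only grows the operand vector, whereas a recursive `execute()` over an AST of that depth would risk exhausting the system stack.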

Perhaps you could look at the various optimisation passes that the LLVM "opt" tool provides. Those are bytecode-to-bytecode optimisations, and the tool itself will provide analysis on the benefits of applying a particular optimisation.
