
Clean, self-contained VM implemented in C and under 100-200K compiled code size?

I'm looking for a VM with the following features:

  • Small compiled code footprint (under 200K).
  • No external dependencies.
  • Unicode (or raw) string support.
  • Clean code/well organized.
  • C(99) code, NOT C++.
  • C/Java-like syntax.
  • Operators/bitwise: AND/OR, etc.
  • Threading support.
  • Generic/portable bytecode. Bytecode should work on different machines even if it was compiled on a different architecture with different endianness etc.
  • Barebones, nothing fancy necessary. Only the basic language support.
  • Lexer/parser and compiler separate from VM. I will be embedding the VM in a program and then compile the bytecode independently.

So far I have reviewed Lua, Squirrel, Neko, Pawn, Io, AngelScript... and the only one that comes somewhat close to the spec is Lua, but the syntax is horrible, it does not have bitwise support, and the code style generally sucks. Squirrel and Io are huge, mostly. Pawn is problematic: it is small, but its bytecode is not cross-platform and the implementation has some serious issues (e.g., bytecode is not validated at all, not even the headers, AFAIK).
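To make the portability and validation points concrete, here is a minimal sketch of what I mean. The 12-byte header layout, the "TVM1" magic, and the names read_u32_le and validate_image are all made up for illustration; they do not come from any of the VMs listed above. Multi-byte fields are stored little-endian in the file and assembled byte by byte, so the result does not depend on the host's endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout (12 bytes, every field little-endian):
 *   bytes 0..3   magic "TVM1"
 *   bytes 4..7   format version
 *   bytes 8..11  length of the code section in bytes
 */
enum { HEADER_SIZE = 12, SUPPORTED_VERSION = 1 };

/* Assemble a 32-bit value from four bytes in a fixed order.
 * No pointer casts, so host endianness and alignment do not matter. */
static uint32_t read_u32_le(const uint8_t *p)
{
    assert(p != NULL); /* caller must already have bounds-checked */
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 0 if the image header is sane, otherwise a negative code
 * identifying the first check that failed. */
static int validate_image(const uint8_t *img, size_t len)
{
    if (img == NULL || len < HEADER_SIZE)
        return -1; /* too short to even hold a header */
    if (img[0] != 'T' || img[1] != 'V' || img[2] != 'M' || img[3] != '1')
        return -2; /* wrong magic */
    if (read_u32_le(img + 4) != SUPPORTED_VERSION)
        return -3; /* unknown format version */
    if (read_u32_le(img + 8) != len - HEADER_SIZE)
        return -4; /* declared code length disagrees with the file size */
    return 0;
}
```

A real loader would go on to check every instruction (opcode range, jump targets, operand bounds) before executing anything; the header check is only the first gate.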

I would love to find a suitable option out there.

Thanks!

Update: JavaScript interpreters are... interpreters. This is a VM question for a bytecode-based VM, hence the compiler/bytecode VM separation requirement. JS is interpreted, and very seldom compiled by JIT. I don't necessarily want JIT. Also, all current ECMAScript parsers are anything but small.


You say you've reviewed NekoVM, but don't mention why it's not suitable for you.

It's written in C, not C++, the VM is under 10kLOC with a compiled size of roughly 100kB, and the compiler is a separate executable producing portable bytecode. The language itself has C-like syntax, bitwise operators, and it's not thread-hostile.


Finally, after all this time, none of the answers really did it. I ended up forking Lua. As of today no self-contained VM with the above requirements exists... it's a pity ;(

Nonetheless, Pawn is fairly nice, if only the code wasn't kind of problematic.


JerryScript:

  • requires less than 64 KB of RAM
  • ~160 KB binary size
  • written in C99
  • VM based
  • has bytecode precompilation

IoT JavaScript glues JerryScript with libuv (nodejs style) - it may be easier to play with.

Threading is probably not there in the state you want. There are recent additions to ECMAScript around background workers on separate threads and shared, cross-thread buffers. I'm not sure what the story is with those in JerryScript. They are probably not there yet, but there is a blueprint for how to do it, so it may not be far off.


Try EmbedVM.

http://www.clifford.at/embedvm/

http://svn.clifford.at/embedvm/trunk/

Here's an example of some code, a guessing game. The compiler is built in C with lex+yacc:

global points;

function main()
{
    local num, guess;
    points = 0;
    while (1)
    {
        // report points
        $uf4();

        // get next random number
        num = $uf0();
        do {
            // read next guess
            guess = $uf1();
            if (guess < num) {
                // hint to user: try larger numbers
                $uf2(+1);
                points = points - 1;
            }
            if (guess > num) {
                // hint to user: try smaller numbers
                $uf2(-1);
                points = points - 1;
            }
        } while (guess != num);

        // level up!
        points = points + 10;
        $uf3();
    }
}

There isn't any threading support. But there's no global state in the VM, so it's easy to run multiple copies in the same process.

The API is simple. VM RAM is accessed via callbacks. Your main loop calls embedvm_exec(vmdata) repeatedly; each call executes a single operation and returns.
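To illustrate the pattern (all state in one struct, memory reached only through host callbacks, one instruction per call), here is a rough sketch of a toy VM. This is not EmbedVM's actual API: the struct toyvm, toyvm_step, the opcodes, and the ram_read/ram_write callbacks are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy VM. Everything it needs lives in this struct,
 * so there is no global state and instances are independent. */
struct toyvm {
    uint16_t ip;      /* instruction pointer into VM memory */
    uint8_t acc;      /* single accumulator register */
    int halted;       /* set once HALT or an unknown opcode is seen */
    void *user;       /* opaque pointer handed back to the callbacks */
    uint8_t (*mem_read)(uint16_t addr, void *user);
    void (*mem_write)(uint16_t addr, uint8_t value, void *user);
};

/* Each opcode except HALT is followed by one operand byte. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

/* Execute exactly one instruction, then return to the host. */
static void toyvm_step(struct toyvm *vm)
{
    assert(vm && vm->mem_read && vm->mem_write);
    if (vm->halted)
        return;

    uint8_t op = vm->mem_read(vm->ip++, vm->user);
    if (op != OP_LOAD && op != OP_ADD && op != OP_STORE) {
        vm->halted = 1; /* HALT, or an opcode we do not know */
        return;
    }

    uint8_t arg = vm->mem_read(vm->ip++, vm->user);
    switch (op) {
    case OP_LOAD:  vm->acc = arg; break;                        /* acc = imm */
    case OP_ADD:   vm->acc = (uint8_t)(vm->acc + arg); break;   /* acc += imm */
    case OP_STORE: vm->mem_write(arg, vm->acc, vm->user); break; /* mem[arg] = acc */
    }
}

/* Example host callbacks backed by a 256-byte array passed as `user`. */
static uint8_t ram_read(uint16_t addr, void *user)
{
    return ((uint8_t *)user)[addr & 0xFF];
}

static void ram_write(uint16_t addr, uint8_t value, void *user)
{
    ((uint8_t *)user)[addr & 0xFF] = value;
}
```

Because each instance carries its own ip, accumulator, and memory pointer, the host can interleave calls to toyvm_step on several instances from one loop, or give each instance to its own thread, without them interfering.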

The VM has a tiny footprint and has been used on 8-bit microcontrollers.


For something very "barebones" :

http://en.wikibooks.org/wiki/Creating_a_Virtual_Machine/Register_VM_in_C

More of a short introduction to the topic than anything else, granted.

Yet, it probably meets at least these few of the desired criteria :

  • Small compiled code footprint (under 200K) ... check, obviously;
  • No external dependencies ... check;
  • Clean code/well organized ... check;
  • C(99) code, NOT C++ ... check;
  • C/Java-like syntax ... check.


One option is to use something minimal and extend it. mini-vm is under 200 lines of code, including comments, it has a liberal license (MIT), and it's written in C. Out of the box it supports 0 operations, but it is very easy to extend. The included example compiler is only a simple calculator. But one could easily imagine adding comparisons, branches, memory access, and supervisor calls to take it where you want to go. A VM that is easy to extend is especially useful for developing domain-specific languages, and having multiple languages target your flavor of mini-vm would be straightforward, other than having to implement multiple compilers (or port them; the QuakeC compiler is just lcc, and very easy to retarget).
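As a rough idea of what "start with zero operations and extend" can look like, here is a sketch of a tiny register VM whose core only fetches, decodes, and dispatches through a table that the embedder fills in. This is not mini-vm's actual code; the struct ctx, the 4-byte instruction format, and the op_* handlers are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_REGS = 4, NUM_OPS = 16 };

/* One execution context: registers, program counter, run flag. */
struct ctx {
    uint32_t r[NUM_REGS];
    uint32_t pc;      /* index of the next instruction */
    int running;
};

/* Every instruction is 4 bytes: opcode, then operands a, b, c. */
typedef void (*op_fn)(struct ctx *c, uint8_t a, uint8_t b, uint8_t d);

/* The core knows no operations; it only dispatches through `ops`. */
static void run(struct ctx *c, const op_fn *ops,
                const uint8_t *code, uint32_t n_instr)
{
    assert(c && ops && code);
    c->pc = 0;
    c->running = 1;
    while (c->running && c->pc < n_instr) {
        const uint8_t *ins = code + 4u * c->pc++;
        if (ins[0] >= NUM_OPS || ops[ins[0]] == NULL)
            break; /* unregistered opcode: stop instead of guessing */
        ops[ins[0]](c, ins[1], ins[2], ins[3]);
    }
}

/* Extensions an embedder might register. */
enum { OP_HALT, OP_LOADI, OP_ADD, OP_JLT };

static void op_halt(struct ctx *c, uint8_t a, uint8_t b, uint8_t d)
{
    (void)a; (void)b; (void)d;
    c->running = 0;
}

static void op_loadi(struct ctx *c, uint8_t a, uint8_t b, uint8_t d)
{
    (void)d;
    c->r[a % NUM_REGS] = b; /* r[a] = immediate b */
}

static void op_add(struct ctx *c, uint8_t a, uint8_t b, uint8_t d)
{
    c->r[a % NUM_REGS] = c->r[b % NUM_REGS] + c->r[d % NUM_REGS];
}

static void op_jlt(struct ctx *c, uint8_t a, uint8_t b, uint8_t d)
{
    if (c->r[a % NUM_REGS] < c->r[b % NUM_REGS])
        c->pc = d; /* branch to absolute instruction index d */
}
```

Adding a comparison, a memory access, or a supervisor call is then one more handler and one more table entry; the fetch/decode loop itself does not change.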

Threading support would have to be an extension, and the core VM would not play nicely in a multiprocessor pthread scenario (heavyweight threading). Weirdly mini-vm can have a pc (program counter) per heavyweight thread, but would share registers among all threads using the same context. Running separate contexts would be thread-safe though.

I'm skipping the language requirements because the question starts off asking for a barebones VM but at the same time demands C/Java-like syntax. I'm not sure how to resolve that conflict other than by pointing it out.


Try embedding a JavaScript interpreter in your code.

http://www.mozilla.org/js/spidermonkey/
