How do interpreters written in C and C++ bind identifiers to C(++) functions

I'm talking about C and/or C++ here, as these are the only languages I know of that are used for interpreters where the following could be a problem:

If we have an interpreted language X, how can a library written for it add functions to the language that can then be called from programs written in that language?

PHP example:

substr( $str, 5, 10 );
  • How is the function substr added to the "function pool" of PHP so it can be called from within scripts?

It is easy for PHP to store all registered function names in an array and search through it when a function is called in a script. However, as there obviously is no eval in C(++), how can the function then be called? I assume PHP doesn't have 100MB of code like:

if( identifier == "substr" )
{
   return PHP_SUBSTR(...);
} else if( ... ) {
   ...
}

Ha ha, that would be pretty funny. I hope you have understood my question so far.

  • How do interpreters written in C/C++ solve this problem?
  • How can I solve this for my own experimental toy interpreter written in C++?


Actually, scripting languages do something like what you mentioned.
They wrap the native functions and register those wrappers with the interpreter engine.

Lua sample:

static int io_read (lua_State *L) {
  return g_read(L, getiofile(L, IO_INPUT), 1);
}


static int f_read (lua_State *L) {
  return g_read(L, tofile(L), 2);
}
...
static const luaL_Reg flib[] = {
  {"close", io_close},
  {"flush", f_flush},
  {"lines", f_lines},
  {"read", f_read},
  {"seek", f_seek},
  {"setvbuf", f_setvbuf},
  {"write", f_write},
  {"__gc", io_gc},
  {"__tostring", io_tostring},
  {NULL, NULL}
};
...
luaL_register(L, NULL, flib);  /* file methods */
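
The same pattern can be sketched without Lua, for a toy interpreter. This is only an illustration under my own assumptions: Interpreter, NativeReg, Args, native_substr and native_strlen are hypothetical names, not part of Lua or PHP, and values are plain strings to keep it short. The point is that the interpreter stores function pointers, so a call by name becomes a lookup followed by an indirect call, and no eval or giant if/else chain is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy-interpreter types; none of these names come from Lua or PHP.
typedef std::vector<std::string> Args;
typedef std::string (*NativeFn)(const Args&);

struct NativeReg {
    const char* name;  // name visible to scripts
    NativeFn    fn;    // C++ function that implements it
};

// Wrapper: unpacks the script arguments and calls ordinary C++ code.
static std::string native_substr(const Args& a) {
    if (a.size() != 3) return "";
    std::size_t start = std::stoul(a[1]);
    std::size_t len = std::stoul(a[2]);
    if (start > a[0].size()) return "";
    return a[0].substr(start, len);
}

static std::string native_strlen(const Args& a) {
    if (a.size() != 1) return "";
    return std::to_string(a[0].size());
}

// Sentinel-terminated table, in the spirit of the luaL_Reg array above.
static const NativeReg string_lib[] = {
    {"substr", native_substr},
    {"strlen", native_strlen},
    {nullptr, nullptr}
};

struct Interpreter {
    std::map<std::string, NativeFn> functions;

    // Copy every entry of a library table into the lookup map.
    void register_lib(const NativeReg* lib) {
        for (; lib->name != nullptr; ++lib) functions[lib->name] = lib->fn;
    }

    // Called when a script invokes name(args...).
    std::string call(const std::string& name, const Args& args) const {
        std::map<std::string, NativeFn>::const_iterator it = functions.find(name);
        if (it == functions.end()) return "<undefined function>";
        return it->second(args);  // indirect call through the function pointer
    }
};

// Usage: register the library once, then dispatch by name.
void demo_string_lib() {
    Interpreter interp;
    interp.register_lib(string_lib);
    assert(interp.call("substr", Args{"Hello, world", "7", "5"}) == "world");
    assert(interp.call("strlen", Args{"abc"}) == "3");
}
```

A library adds functions simply by handing the interpreter another such table, which is roughly the role luaL_register plays in the Lua sample.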


Interpreters probably just keep a hash map from function names to function definitions (which include parameter information, return type, function location/definition, etc.). That way, when your interpreter encounters a function name, it can look the name up in the hash map. If the name exists, it uses the function info stored there to evaluate the call.

You obviously need to add provisions for different levels of scope, etc. but that's the gist of it.
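
One possible way to handle scope is a chain of hash maps, sketched below. This is an assumption about how a toy interpreter might do it, not a description of any real one: Scope, FunctionInfo, define and lookup are hypothetical names, and the natives take two ints purely for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical description of a callable: parameter information plus
// the location of the code that implements it.
struct FunctionInfo {
    int param_count;          // number of parameters expected
    int (*native)(int, int);  // C++ implementation (two ints, for brevity)
};

// One level of scope: its own hash map plus a link to the enclosing scope.
struct Scope {
    std::unordered_map<std::string, FunctionInfo> table;
    std::shared_ptr<Scope> parent;

    explicit Scope(std::shared_ptr<Scope> p = nullptr) : parent(std::move(p)) {}

    void define(const std::string& name, FunctionInfo info) { table[name] = info; }

    // Search this scope first, then walk outwards; nullptr means "not found".
    const FunctionInfo* lookup(const std::string& name) const {
        for (const Scope* s = this; s != nullptr; s = s->parent.get()) {
            auto it = s->table.find(name);
            if (it != s->table.end()) return &it->second;
        }
        return nullptr;
    }
};

static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

// Usage: an inner definition shadows the outer one without replacing it.
void demo_scopes() {
    auto global = std::make_shared<Scope>();
    global->define("combine", FunctionInfo{2, add});

    Scope local(global);
    assert(local.lookup("combine")->native(3, 4) == 7);   // found in the outer scope

    local.define("combine", FunctionInfo{2, mul});         // shadows the outer entry
    assert(local.lookup("combine")->native(3, 4) == 12);
    assert(global->lookup("combine")->native(3, 4) == 7);  // outer entry is untouched
    assert(local.lookup("missing") == nullptr);
}
```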


Pretty much all compilers have a "symbol table" that they use to look up what an identifier represents. The symbol table holds function names, variable names, type names, etc. Anything that has a name goes in the symbol table, which is basically a map from names to everything the compiler knows about each name (I'm simplifying here). When the compiler encounters an identifier, it looks it up in the symbol table and finds out, for example, that it's a function. If you're using an interpreter, the symbol table will have information on where to find the function so interpretation can continue there. If this is a compiler, the symbol table will have the address where that function will be in the compiled code (or a placeholder to fill in the address later). Assembly can then be produced that essentially says: put the arguments on the stack, and resume execution at some address.

So, for your example, an interpreter would look at

substr( $str, 5, 10 );

and find "substr" in its symbol table:

symbolTableEntry entry = symbolTable["substr"];

From there, it will gather up $str, 5 and 10 as arguments, and check them against entry to see that they are valid for the function. Then it will look in entry to find out where to jump to with the marshalled arguments.
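
As a rough sketch of those steps (lookup, argument check, jump), assuming a toy interpreter: Value, SymbolTableEntry, call_function and builtin_substr are hypothetical names of mine, and this substr ignores PHP's negative offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical runtime value: just enough to pass a string or an integer.
struct Value {
    enum Kind { Int, Str } kind;
    long i;
    std::string s;
};

static Value make_int(long n) { Value v; v.kind = Value::Int; v.i = n; return v; }
static Value make_str(const std::string& t) { Value v; v.kind = Value::Str; v.i = 0; v.s = t; return v; }

typedef std::vector<Value> Args;

struct SymbolTableEntry {
    std::vector<Value::Kind> param_kinds;  // expected argument types
    Value (*target)(const Args&);          // "where to jump to"
};

// Native implementation; arguments were already validated by call_function.
static Value builtin_substr(const Args& a) {
    const std::string& s = a[0].s;
    if (a[1].i < 0 || a[2].i < 0) return make_str("");  // toy: no negative offsets
    std::size_t start = static_cast<std::size_t>(a[1].i);
    std::size_t len = static_cast<std::size_t>(a[2].i);
    return make_str(start <= s.size() ? s.substr(start, len) : std::string());
}

// Look the name up, validate the marshalled arguments, then jump to the target.
static Value call_function(const std::map<std::string, SymbolTableEntry>& symbols,
                           const std::string& name, const Args& args) {
    auto it = symbols.find(name);
    if (it == symbols.end()) throw std::runtime_error("undefined function: " + name);
    const SymbolTableEntry& entry = it->second;
    if (args.size() != entry.param_kinds.size())
        throw std::runtime_error("wrong argument count for " + name);
    for (std::size_t k = 0; k < args.size(); ++k)
        if (args[k].kind != entry.param_kinds[k])
            throw std::runtime_error("wrong argument type for " + name);
    return entry.target(args);
}

// Usage: the equivalent of substr("interpreters", 5, 3) in a script.
void demo_substr() {
    std::map<std::string, SymbolTableEntry> symbols;
    symbols["substr"] = SymbolTableEntry{{Value::Str, Value::Int, Value::Int}, builtin_substr};
    Value r = call_function(symbols, "substr",
                            {make_str("interpreters"), make_int(5), make_int(3)});
    assert(r.s == "pre");
}
```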


In C++ you'd probably use a similar mechanism as Nick D did, but taking advantage of its OO capabilities:

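#include <map>
#include <string>
#include <boost/function.hpp>

// A callable with the same signature as the Lua C functions above.
typedef boost::function<int(lua_State*)> luaFunction;
std::map<std::string, luaFunction> symbolTable;
symbolTable["read"] = f_read;
symbolTable["close"] = io_close; // etc.
// ...
// Note: operator[] inserts an empty function for an unknown name; calling it throws.
luaFunction& f = symbolTable[*symbolIterator++];
f(currentLuaState); // currentLuaState is a lua_State*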