
How to parse a function and its arguments

For my lexer I'm using the boost::wave lexer iterator, which gives me all the tokens from a .cpp, .h, .hpp, etc. file.

Now I want to determine whether a sequence of tokens (an identifier, followed by an opening parenthesis, then a comma-separated list of arguments, and finally a closing parenthesis) is a function in a C++ program. In other words, how should I analyze the tokens to make sure I have a function?

I am trying to implement this using a recursive descent parser. So far my recursive descent parser can parse arithmetic expressions and handles almost all kinds of operator precedence.

Or is there a function (in boost::wave) which can directly parse a function for me?

It would also be helpful if somebody could suggest how to find the type of each variable in the function's argument list. For example, if I have a function:

int myfun(char* c, T& t1) { /* ... */ }

then how can I get the tokens char and *, which together form the type of c, and similarly the tokens T and &, which form the type of t1?

EDIT: Here is a little more detail about my question.

References:

The Boost.Wave documentation:
http://www.boost.org/doc/libs/1_47_0/libs/wave/index.html

The list of token identifiers:
http://www.boost.org/doc/libs/1_47_0/libs/wave/doc/token_ids.html

typedef boost::wave::cpplexer::lex_token<> token_type;
typedef boost::wave::cpplexer::lex_iterator<token_type> token_iterator;
typedef token_type::position_type position_type;

position_type pos(filename);

// instr holds the contents of the input file (begin()/end() suggest a std::string rather than the stream itself)
token_iterator  it = token_iterator(instr.begin(), instr.end(), pos,
      boost::wave::language_support(
        boost::wave::support_cpp|boost::wave::support_option_long_long));
token_iterator  end = token_iterator();

//while it != end 
//...
boost::wave::token_id id = boost::wave::token_id(*it);

switch(id){
//...

    case boost::wave::T_IDENTIFIER:
      Match(id); // consumes one token and increments the token_iterator
      // get the token id of the next token
      id = boost::wave::token_id(*it);
      // if an identifier is immediately followed by T_LEFTPAREN then it will be a function
      if(id == boost::wave::T_LEFTPAREN) {
        Match(id);                         // (1)
        // this is the function I want to implement
        ParseFunction();                   // (2)
        Match(boost::wave::T_RIGHTPAREN);
      }
//...
}

So the question is: how do I implement the function ParseFunction()?
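To make the question concrete, here is a rough sketch of the shape ParseFunction() might take. It is not Boost.Wave code: Token, Kind, Param and ParseParams are hypothetical names, and the simplified Kind values stand in for the real token ids (T_IDENTIFIER, T_LEFTPAREN, T_RIGHTPAREN and the others on the token id list). It splits the tokens between the parentheses at top-level commas, then treats a trailing identifier as the parameter name and everything before it as the type. That is only a heuristic: it cannot tell a declaration from a call such as foo(a, b), and it misreads an unnamed parameter like "const T".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token; real code would wrap boost::wave tokens.
enum class Kind { Identifier, Keyword, Star, Amp, Comma, LParen, RParen, Other };

struct Token {
    Kind kind;
    std::string text;
};

struct Param {
    std::vector<std::string> type;  // e.g. {"char", "*"}
    std::string name;               // e.g. "c"; empty if the parameter is unnamed
};

// On entry, pos indexes the first token after '('.
// On success, pos indexes the matching ')' and out holds one entry per parameter.
// Returns false if the list is malformed (empty parameter, missing ')').
bool ParseParams(const std::vector<Token>& toks, std::size_t& pos,
                 std::vector<Param>& out) {
    std::vector<Token> current;  // tokens of the parameter being collected
    int depth = 0;               // nesting level of inner parentheses

    // Turns the collected tokens into a Param; fails on an empty parameter.
    auto flush = [&]() -> bool {
        if (current.empty()) return false;
        Param p;
        // Heuristic: a trailing identifier preceded by other tokens is the name.
        if (current.size() > 1 && current.back().kind == Kind::Identifier) {
            p.name = current.back().text;
            current.pop_back();
        }
        for (const Token& t : current) p.type.push_back(t.text);
        out.push_back(p);
        current.clear();
        return true;
    };

    for (; pos < toks.size(); ++pos) {
        const Token& t = toks[pos];
        if (t.kind == Kind::LParen) {
            ++depth;
            current.push_back(t);
        } else if (t.kind == Kind::RParen) {
            if (depth == 0) {
                // "()" is a valid empty list; "(a, )" is not.
                if (current.empty()) return out.empty();
                return flush();
            }
            --depth;
            current.push_back(t);
        } else if (t.kind == Kind::Comma && depth == 0) {
            if (!flush()) return false;
        } else {
            current.push_back(t);
        }
    }
    return false;  // ran out of tokens before the closing ')'
}
```

For the tokens of "char* c, T& t1)" this yields two parameters: type {char, *} with name c, and type {T, &} with name t1.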


If your system is POSIX-compliant (Linux, Mac OS X, Solaris, ...), you can use dlopen/dlsym to determine whether the symbol exists. Watch out for C++ name mangling, and be aware that on some systems the symbol name differs from the name in the source; for example, the real name of sin may be _sin.

dlsym cannot tell you whether the pointer it returns refers to a function or to some global variable. In fact, to use dlsym you have to do something that neither the C nor the C++ standard guarantees will work: cast the void* pointer returned by dlsym to a function pointer. POSIX is in conflict with C/C++ on this point. That said, if you are on a POSIX-compliant system, those void* pointers will convert to function pointers (otherwise the system is not POSIX-compliant).
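A minimal sketch of that lookup, assuming a POSIX system (older glibc versions need -ldl at link time). FindSymbol and UnaryMathFn are hypothetical names introduced for illustration; dlopen, dlsym, dlerror and dlclose are the POSIX calls.

```cpp
#include <dlfcn.h>

// The signature has to be known in advance; dlsym cannot supply it.
using UnaryMathFn = double (*)(double);

// Hypothetical helper: looks up `name` among the symbols visible to the
// running program. Returns nullptr if the symbol is not found.
void* FindSymbol(const char* name) {
    void* handle = dlopen(nullptr, RTLD_LAZY);  // handle for the main program
    if (!handle) return nullptr;
    dlerror();                                  // clear any stale error state
    void* sym = dlsym(handle, name);
    if (dlerror() != nullptr) sym = nullptr;    // lookup failed
    dlclose(handle);
    return sym;
}

// The cast discussed above: POSIX requires it to work, ISO C and C++ do not.
UnaryMathFn AsUnaryMathFn(void* sym) {
    return reinterpret_cast<UnaryMathFn>(sym);
}
```

A caller would write something like AsUnaryMathFn(FindSymbol("sin")) and check the result for nullptr before calling it. Whether "sin" is actually found depends on which libraries the program has loaded, and nothing stops you from casting a variable's address to a function pointer by mistake.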


Edit:

A huge gotcha: how do you call the thing you just found? How do you know how to handle the returned value, if there is any?

A simple example: suppose your input file contains xsq = pow (x, 2). You have to know ahead of time that the signature of pow is double pow (double, double).

Rather than using dlsym, you are much better off handling a limited set of functions that you explicitly build into your parser.
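One way to sketch such a built-in set is a table keyed by function name that records how many arguments each function takes. Builtin, Builtins and CallBuiltin are hypothetical names, and restricting every argument and result to double is a simplifying assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One entry per function the parser knows about.
struct Builtin {
    std::size_t arity;                           // expected argument count
    double (*call)(const std::vector<double>&);  // adapter around the real function
};

// The fixed set of functions built into the parser.
const std::map<std::string, Builtin>& Builtins() {
    static const std::map<std::string, Builtin> table = {
        {"sin",  {1, [](const std::vector<double>& a) { return std::sin(a[0]); }}},
        {"sqrt", {1, [](const std::vector<double>& a) { return std::sqrt(a[0]); }}},
        {"pow",  {2, [](const std::vector<double>& a) { return std::pow(a[0], a[1]); }}},
    };
    return table;
}

// Checks the name and the argument count before calling.
// Throws std::runtime_error for an unknown name or a wrong argument count.
double CallBuiltin(const std::string& name, const std::vector<double>& args) {
    auto it = Builtins().find(name);
    if (it == Builtins().end())
        throw std::runtime_error("unknown function: " + name);
    if (args.size() != it->second.arity)
        throw std::runtime_error("wrong number of arguments for " + name);
    return it->second.call(args);
}
```

With this table, the parser knows from the entry itself that pow takes two arguments, so it can reject pow(x) or an unknown name with a clear error instead of calling through a pointer of the wrong type.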
