Extract function's argument from PHP source code

From a shell script, I'd like to extract a function's argument from PHP source code, e.g.:

->getUrl(<extract-me>, false)

The catch is that <extract-me> can be any expression PHP allows: it may contain opening or closing parentheses, or more complicated constructs.

Thank you


Since your question is tagged "perl", I assume you'll accept a Perl solution.

My first thought was the Text::Balanced module, in particular extract_codeblock. It is designed for Perl, but Perl and PHP are syntactically similar enough that it gives acceptable results. It was not as easy as I had hoped, though: extract_codeblock only pulled out the expression properly when it was bracketed, i.e. when it started with "{" or "(".
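
To illustrate the bracketed case, here is a minimal sketch that starts the text at the call's opening parenthesis and lets extract_bracketed (a sibling of extract_codeblock in the same module) balance the whole argument list. The sample PHP line and the variable names are made up for the example, and the result is the complete argument list, not yet the first argument on its own.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical sample line; the ')' inside the string must not end the match.
my $php = q{$url = $this->getUrl(foo(1, 'a)b'), false);};

# Locate the call and start the text at its opening parenthesis.
my $name  = '->getUrl';
my $start = index($php, $name . '(');
die "call not found\n" if $start < 0;
my $text = substr($php, $start + length($name));

# Balance ( and ); '...' and "..." are skipped as quoted strings.
my ($args, $rest) = extract_bracketed($text, q{()'"});
die "unbalanced argument list\n" unless defined $args;

# Drop the outer parentheses to get the raw argument list.
my $inner = substr($args, 1, -1);
print "arguments: $inner\n";
print "rest: $rest\n";
```

Splitting that list at the first top-level comma is the part that still needs a token-by-token walk.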

So, using that module and writing my own sub that combines its token-parsing routines, I got something that superficially appears to work.

use strict;
use warnings;
use Text::Balanced qw(extract_bracketed extract_quotelike);
sub extract_expression {
    local $_ = shift;
    my $parsed;
    while(1) {
        if(s&^(\s*)((?:(?!//|/\*|#)[^\[\]{}(),;'"\s])+)&&) {
            # normal characters (no delimiters, quotes or brackets, or comments)
            $parsed .= "$1$2";
        } elsif(/^\s*(?=['"'])/) {
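            # quotes
            # (the 2nd argument of extract_quotelike is a prefix pattern, not a
            # delimiter list, so leave it at its default of optional whitespace)
            (my $token, $_) = extract_quotelike($_);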
            defined $token or last;
            $parsed .= $token;
        } elsif(/^\s*(?=[\[\{\(])/) {
            # brackets
            (my $token, $_) = extract_bracketed($_, '[({\'"})]');
            defined $token or last;
            $parsed .= $token;
        } elsif(s&^\s*(?://|\#).*\n?&& || s&^\s*/\*.*?\*/&&s) {
            # comments
            # ignore
        } else {
            # not recognized
            # finished
            last;
        }
    }
    return $parsed, $_;
}

# demo
# complex line of PHP (borrowed from Drupal)
$_ = <<'PHP';
$translations[$lang] = $this->drupalCreateNode(array('type' => $source->type, 'language' => $lang, 'translation_source' => $source, 'status' => $source->status, 'promote' => $source->promote, 'uid' => $source->uid));
# etc
PHP

if(/->drupalCreateNode\(/) {
    my $offset = $+[0];   # position right after opening paren
    my($expression, $rest) = extract_expression(substr($_, $offset));
    if(defined $expression) {
        print <<"INFO";
parsed expression: $expression
rest: $rest
INFO
    } else {
        print "Failure to parse expression\n";
    }
}

On reflection: PHP comments differ from Perl comments. I have handled PHP comments superficially, but the routine isn't recursive, so only comments outside any bracketed expression are properly ignored. Comments inside brackets are handed to extract_bracketed as-is and might confuse it.
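
One possible workaround, sketched here under assumptions rather than tested against real PHP code bases, is to strip comments from the source before any bracket matching, while leaving quoted strings untouched. The helper name strip_php_comments and the sample input are hypothetical, and the sketch ignores heredocs, nowdocs and a "?>" that ends a line comment.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# strip_php_comments is a hypothetical helper, not part of Text::Balanced.
sub strip_php_comments {
    my ($src) = @_;
    $src =~ s{
        ( ' (?: [^'\\] | \\. )* '     # single-quoted string, kept unchanged
        | " (?: [^"\\] | \\. )* "     # double-quoted string, kept unchanged
        )
      | /\* .*? \*/                   # block comment, replaced by a space
      | (?: // | \# ) [^\n]*          # line comment, replaced by a space
    }{ defined $1 ? $1 : ' ' }gsex;
    return $src;
}

# Hypothetical input: the ')' inside the block comment would otherwise
# unbalance the brackets.
my $php = <<'PHP';
->getUrl(array('a' => '# kept', /* stray ) paren */ 'b' => 1), false) // tail
PHP

my $clean = strip_php_comments($php);
my $text  = substr($clean, index($clean, '('));
my ($args) = extract_bracketed($text, q{()'"});
print defined $args ? "argument list: $args\n" : "failed to parse\n";
```

Strings are matched in the same pass as comments so that a "#" or "//" inside a string literal is not mistaken for a comment start.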
