A PHP regex to extract PHP functions from code files

I'm trying to write a PHP regex to extract functions from PHP source code. Until now I have used a recursive regex to extract everything between {}, but that also matches things like if statements. When I use something like:

preg_match_all("/(function .*\(.*\))({([^{}]+|(?R))*})/", $data, $matches);

It doesn't work when there is more than one function in the file (probably because it includes the 'function' part in the recursion too).

Is there any way to do this?

Example file:

<?php
if($useless)
{
  echo "i don't want this";
}

function bla($wut)
{
  echo "i do want this";
}
?>

Thanks


A regex is the wrong tool for this. Consider the tokenizer or reflection instead.
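
As a rough illustration of the tokenizer route, here is a minimal sketch built on token_get_all(). The helper name extract_functions() and the returned array shape are made up for this example, and it assumes the tokenizer extension is enabled (it is by default). It only reports named functions that have a body, and it counts brace tokens rather than brace characters, so braces inside ordinary strings and comments are not counted.

```php
<?php
// Sketch only: extract_functions() is a hypothetical helper, not a built-in.
function extract_functions(string $code): array
{
    $tokens = token_get_all($code);
    $count = count($tokens);
    $functions = [];

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }

        // The name is the first T_STRING before the opening parenthesis.
        $name = null;
        for ($j = $i + 1; $j < $count && $tokens[$j] !== '('; $j++) {
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $name = $tokens[$j][1];
                break;
            }
        }
        if ($name === null) {
            continue; // anonymous function: nothing to name
        }

        // Re-assemble the source text until the braces balance again.
        $source = '';
        $depth = 0;
        for ($j = $i; $j < $count; $j++) {
            $token = $tokens[$j];
            $source .= is_array($token) ? $token[1] : $token;

            // "{$" and "${" inside interpolated strings are closed by a plain "}".
            $opensBrace = $token === '{' || (is_array($token)
                && in_array($token[0], [T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true));

            if ($opensBrace) {
                $depth++;
            } elseif ($token === '}') {
                $depth--;
                if ($depth === 0) {
                    $functions[] = ['name' => $name, 'source' => $source];
                    break;
                }
            } elseif ($token === ';' && $depth === 0) {
                break; // declaration without a body (abstract or interface method)
            }
        }
    }

    return $functions;
}

// The example file from the question.
$code = <<<'CODE'
<?php
if($useless)
{
  echo "i don't want this";
}

function bla($wut)
{
  echo "i do want this";
}
?>
CODE;

$functions = extract_functions($code);
print_r($functions);
```

Only the bla() function should be reported here, because the if block never starts with a T_FUNCTION token.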


Moved here from duplicate question: PHP, Regex and new lines

Regex solution:

$regex = '~
  function                 #function keyword
  \s+                      #one or more whitespace characters
  (?P<function_name>.*?)   #function name itself
  \s*                      #optional white spaces
  (?P<parameters>\(.*?\))  #function parameters
  \s*                      #optional white spaces
  (?P<body>\{.*?\})        #body of a function
~six';

if (preg_match_all($regex, $input, $matches)) {
  print_r($matches);
}

P.S. As suggested above, the tokenizer is the preferable way to go.


Regex accepting recursive curly brackets in body

I know there is a selected answer, but in case the tokenizer cannot be used, this is a simple regex to extract a function (name, parameters and body) from PHP code.

The main difference from Ioseb's answer above is that this regex accepts nested curly brackets in the body, meaning that it won't stop at the first closing curly bracket.

/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/

Explanation

/                                   # delimiter
function                            # function keyword
\s+                                 # at least one whitespace
(?<name>\w+)                        # function name (a word) => group "name"
\s*                                 # optional whitespace(s)
\((?<param>[^\)]*)\)                # function parameters => group "param"
\s*                                 # optional whitespace(s)
(?<body>\{(?:[^{}]+|(?&body))*\})   # function body (nested curly brackets allowed) => group "body"
/                                   # delimiter

Example

$data = '
    <?php 
    function my_function($param){
        if($param === true){
            // This is true
        }else if($param === false){
            // This is false
        }else{
            // This is not
        }
    }
    ?>
';

preg_match_all("/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/", $data, $matches);
print_r($matches['body']);

/*
Array
(
    [0] => {
        if($param === true){
            // This is true
        }else if($param === false){
            // This is false
        }else{
            // This is not
        }
    }
)
*/
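
For comparison, the following sketch runs both body patterns on the same input: the non-recursive `\{.*?\}` from the earlier answer and the recursive `(?&body)` version from this one. The sample function and the variable names are only illustrative.

```php
<?php
$code = <<<'CODE'
function my_function($param){
    if($param === true){
        return 1;
    }
    return 0;
}
CODE;

// Non-recursive body, as in the earlier answer: lazy match up to the first "}".
$lazy = '~function\s+(?P<function_name>.*?)\s*(?P<parameters>\(.*?\))\s*(?P<body>\{.*?\})~si';

// Recursive body, as in this answer: (?&body) re-enters the "body" group.
$recursive = '/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/';

preg_match($lazy, $code, $lazyMatch);
preg_match($recursive, $code, $recursiveMatch);

// The lazy body stops at the "}" that closes the if block;
// the recursive body runs to the "}" that closes the function.
echo $lazyMatch['body'], "\n---\n", $recursiveMatch['body'], "\n";
```

The lazy version loses the `return 0;` line, which is the case the recursive group is meant to handle.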

Limitation

Curly brackets have to be balanced, including any that appear inside strings or comments. For example, this body will be only partially extracted:

function my_function(){
    echo "A curly bracket : }";
    echo "Another curly bracket : {";
}

/*
Array
(
    [0] => {
    echo "A curly bracket : }
)
*/
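
This limitation is specific to matching on raw characters. As a small sketch (variable names are illustrative, and it assumes the tokenizer extension is available), token_get_all() reports each of these string literals as a single T_CONSTANT_ENCAPSED_STRING token, so only the two real braces of the function body show up as brace tokens:

```php
<?php
$code = <<<'CODE'
<?php
function my_function(){
    echo "A curly bracket : }";
    echo "Another curly bracket : {";
}
CODE;

$open = 0;
$close = 0;
$strings = [];

foreach (token_get_all($code) as $token) {
    if ($token === '{') {
        $open++;                  // structural opening brace
    } elseif ($token === '}') {
        $close++;                 // structural closing brace
    } elseif (is_array($token) && $token[0] === T_CONSTANT_ENCAPSED_STRING) {
        $strings[] = $token[1];   // whole string literal, braces included
    }
}

echo "open: $open, close: $close\n";
print_r($strings);
```

Counting brace tokens instead of brace characters is what lets a tokenizer-based extractor keep the whole body in this case.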