Match a PHP class with a Regular Expression
I want to capture PHP classes from a file:
class a {
function test() { }
}
class b extends a {
function test() { }
}
and the resulting matches must be
class a {
function test() { }
}
and
class b extends a {
function test() { }
}
Regexps are poor at parsing programming languages' grammars. Consider the tokenizer functions instead, e.g. http://php.net/manual/en/function.token-get-all.php. See also http://framework.zend.com/apidoc/core/Zend_Reflection/Zend_Reflection_File.html
A single regex won't do this. PHP's grammar is more complex than what a regular expression can describe: arbitrarily nested braces need at least a context-free grammar, and classic regular expressions only cover regular ones. It'll drive you crazy to even try, unless you alter your source code to make it easier for the regex to match.
Here's what you should use:
http://www.php.net/manual/en/function.token-get-all.php
Use token_get_all to get the array of language tokens of the PHP code. Then iterate over it and look for a token with the value T_CLASS, which represents the class keyword (this does not take abstract classes or visibility into account). The next T_STRING token is the name of the class. Then look for the next plain token whose value is {, increase a counter for the block depth, and decrease it with every plain } token until you have visited the same number of closing braces as opening braces (your counter is then 0). At that point you have walked the whole class declaration.
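The steps above can be sketched roughly as follows. This is a minimal illustration, not a complete parser: extractClasses() is a hypothetical helper name, the input is assumed to start with a <?php open tag (otherwise the tokenizer treats everything as inline HTML), and cases such as Foo::class or anonymous classes are not handled. One detail beyond the description: "{$" and "${" inside double-quoted strings arrive as array tokens but are closed by a plain "}", so the sketch counts them as opening braces too.

```php
<?php
// Sketch only: collect the source text of each class declaration.
// extractClasses() is a hypothetical helper, not a PHP built-in.
function extractClasses(string $source): array
{
    $tokens  = token_get_all($source);
    $total   = count($tokens);
    $classes = [];

    for ($i = 0; $i < $total; $i++) {
        // Array tokens carry an ID; T_CLASS marks the "class" keyword.
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_CLASS) {
            continue;
        }

        $code   = '';
        $name   = null;
        $depth  = 0;
        $opened = false;

        for ($j = $i; $j < $total; $j++) {
            $token = $tokens[$j];
            $code .= is_array($token) ? $token[1] : $token;

            if (is_array($token)) {
                // The first T_STRING after "class" is the class name.
                if ($name === null && $token[0] === T_STRING) {
                    $name = $token[1];
                }
                // String interpolation braces are closed by a plain "}".
                if ($token[0] === T_CURLY_OPEN
                    || $token[0] === T_DOLLAR_OPEN_CURLY_BRACES) {
                    $depth++;
                }
            } elseif ($token === '{') {
                $depth++;
                $opened = true;
            } elseif ($token === '}') {
                $depth--;
                if ($opened && $depth === 0) {
                    break; // closing brace of the class body
                }
            }
        }

        $classes[] = ['name' => $name, 'code' => $code];
        $i = $j; // continue scanning after this class
    }

    return $classes;
}
```

Calling extractClasses(file_get_contents($path)) on a file (where $path is whatever file you want to scan) returns one entry per class, each with its name and the text from the class keyword to the matching closing brace.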
The following regex has worked for me so far:
^(?:(public|protected|private|abstract)\s+)?class\s+([a-z0-9_]+)(?:\s+extends\s+([a-z0-9_]+))?(?:\s+implements\s+([a-z0-9_]+))?.+?{.+?^}
It needs these options:
case insensitive | ^ and $ match at line breaks | dot matches newlines
This only works if "class" and the closing "}" are not indented.
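As a rough illustration of how this regex and its three options could be applied in PHP, here is a sketch using preg_match_all with the i, m and s modifiers. The function name matchClassesWithRegex() is hypothetical, and the braces are escaped here for safety, which is a small deviation from the pattern as posted. The same limitation applies: "class" and the closing "}" must start at the beginning of a line.

```php
<?php
// Sketch only: apply the regex above with the flags it needs.
// matchClassesWithRegex() is a hypothetical helper name.
function matchClassesWithRegex(string $source): array
{
    // i = case insensitive, m = ^/$ match at line breaks, s = dot matches newlines
    $pattern = '/^(?:(public|protected|private|abstract)\s+)?'
             . 'class\s+([a-z0-9_]+)'
             . '(?:\s+extends\s+([a-z0-9_]+))?'
             . '(?:\s+implements\s+([a-z0-9_]+))?'
             . '.+?\{.+?^\}/ims';

    preg_match_all($pattern, $source, $matches);

    // $matches[0] holds the full class text, $matches[2] the class names,
    // $matches[3] the parent class names (where present).
    return $matches;
}
```

Because the body is matched lazily up to the first "}" at the start of a line, any unindented "}" inside the class (for example in a heredoc) would end the match early.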
Here is the official pattern for a valid class name:
^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$
from https://www.php.net/manual/en/language.oop5.basic.php
So to match the start of a class declaration it would be:
class[\s]{1,}[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*
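Note that this pattern only matches the "class Name" header, not the class body. A small sketch of using it to list class names follows; findClassNames() is a hypothetical helper, and the capture group and the leading \b (to avoid matching words that merely end in "class") are additions not present in the pattern above.

```php
<?php
// Sketch only: list the names that follow a "class" keyword.
// findClassNames() is a hypothetical helper name.
function findClassNames(string $source): array
{
    // Name part taken from the PHP manual's rule for valid class names.
    $pattern = '/\bclass[\s]{1,}([a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*)/';

    preg_match_all($pattern, $source, $matches);

    return $matches[1]; // captured class names, in order of appearance
}
```

This will also pick up the word "class" followed by a name inside comments or strings, which is one reason the tokenizer approach is more reliable.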