PHP regex optimize
I've got a regular expression that matches everything between < and >, as in <anything>,
and I'm using this:
'@<([\w]+)>@'
today, but I believe there might be a better way to do it?
/ Tobias
By the way, \w doesn't match everything like you said; by default it matches just [a-zA-Z0-9_]. Assuming you were using "everything" in a loose manner and \w is what you want, you don't need the square brackets around the \w. Otherwise it's fine.
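To illustrate the point, here is a small sketch showing that the bracketed and unbracketed forms capture the same thing, and that \w stops at characters such as a space or a hyphen. The sample string and the variable names are made up for this example; they are not from the question.

```php
<?php
// Hypothetical sample input, not from the original question.
$subject = 'a<foo>b<bar_1>c<not a word>d<x-y>';

// Original pattern, with a redundant character class around \w.
preg_match_all('@<([\w]+)>@', $subject, $withBrackets);

// Same pattern without the square brackets.
preg_match_all('@<(\w+)>@', $subject, $withoutBrackets);

// Both patterns capture exactly the same groups.
var_dump($withBrackets[1] === $withoutBrackets[1]); // bool(true)

// "<not a word>" and "<x-y>" are skipped because of the space and the hyphen.
print_r($withoutBrackets[1]); // foo, bar_1
```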
If "anything" means "anything except a > character", then you can use:
@<([^>]+)>@
Testing will show if this performs better or worse.
Also, are you sure that you need to optimize? Does your original regex do what it should?
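One rough way to run such a test is to time both patterns over the same input. The snippet below is only a sketch: the input string, the repetition count and the labels are assumptions chosen for illustration, and the timings will vary with the machine, the PHP version and the actual data.

```php
<?php
// Hypothetical input: many short tags with filler text in between.
$subject = str_repeat('abcd<xyz>ab<c>d', 10000);

$patterns = array(
    'word chars'     => '@<(\w+)>@',
    'anything but >' => '@<([^>]+)>@',
);

$counts = array();
foreach ($patterns as $label => $pattern) {
    $start = microtime(true);
    // preg_match_all() returns the number of full pattern matches.
    $counts[$label] = preg_match_all($pattern, $subject, $m);
    $elapsed = microtime(true) - $start;
    printf("%-16s %d matches in %.4f s\n", $label, $counts[$label], $elapsed);
}
```

Run it a few times against input that resembles your real data before drawing conclusions; a single run on a synthetic string says little.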
You would be better off using PHP string functions for this task. It will likely be faster, and it is not too complex.
For example:
$string = "abcd<xyz>ab<c>d";
$curr_offset = 0;
$matches = array();
$opening_tag_pos = strpos($string, '<', $curr_offset);
while ($opening_tag_pos !== false)
{
    // Look for the closing '>' that follows the current '<'.
    $closing_tag_pos = strpos($string, '>', $opening_tag_pos);
    if ($closing_tag_pos === false) {
        break; // unmatched '<': stop instead of looping forever
    }
    $matches[] = substr($string, $opening_tag_pos + 1, $closing_tag_pos - $opening_tag_pos - 1);
    // Continue searching right after the '>' that was just consumed.
    $curr_offset = $closing_tag_pos + 1;
    $opening_tag_pos = strpos($string, '<', $curr_offset);
}
/*
$matches = Array ( [0] => xyz [1] => c )
*/
Of course, if you are trying to parse HTML or XML, use a proper (X)HTML or XML parser instead.
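As a minimal sketch of the parser route, the following uses PHP's DOMDocument class to list element names from a well-formed XML string. It assumes the dom extension is enabled, and the XML snippet and variable names are invented for illustration only.

```php
<?php
// Hypothetical XML snippet, used only for illustration.
$xml = '<root><item id="1">first</item><item id="2">second</item></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Collect every element name in document order.
$names = array();
foreach ($doc->getElementsByTagName('*') as $element) {
    $names[] = $element->nodeName;
}
print_r($names); // root, item, item
```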
That looks alright. What's not optimal about it?
You may also want to consider something other than regex if you're trying to parse HTML; see "RegEx match open tags except XHTML self-contained tags".