How to write the correct regular expression
I want to get ${1} = Title, ${2} = Open, ${3} = Bla-bla-bla
from:
{{Title|Open
Bla-bla-bla
}}
What about something like this :
$str = <<<STR
{{Title|Open
Bla-bla-bla
}}
STR;
$matches = array();
if (preg_match("/^\{\{([^\|]+)\|([^\n]+)(.*)\}\}$/s", $str, $matches)) {
    var_dump($matches);
}
It'll get you :
array
0 => string '{{Title|Open
Bla-bla-bla
}}' (length=28)
1 => string 'Title' (length=5)
2 => string 'Open' (length=4)
3 => string '
Bla-bla-bla
' (length=14)
Which means that, after using trim on $matches[1], $matches[2], and $matches[3], you'll get what you asked for :-)
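The trimming step can be sketched as follows. This is only an illustration that reuses the pattern above; the $fields variable name and the use of array_map and array_slice are my own choices, not part of the original answer.

```php
<?php
// Same input as above, written with explicit \n line breaks.
$str = "{{Title|Open\nBla-bla-bla\n}}";

$fields = array();
if (preg_match('/^\{\{([^|]+)\|([^\n]+)(.*)\}\}$/s', $str, $matches)) {
    // Skip index 0 (the full match), then trim whitespace from each capture.
    $fields = array_map('trim', array_slice($matches, 1));
}

print_r($fields);
```

With the sample input, $fields should hold Title, Open and Bla-bla-bla, without the surrounding line breaks.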
Explaining the regex:
- match from the beginning of the string: ^
- two { characters, which have to be escaped because they have a special meaning: \{\{
- anything that's not a |, at least one time: [^\|]+
  - it sits between (), so it's captured and returned as the first part of the result
  - the | is escaped here too (inside a character class this is harmless but not required)
- a | character, which has to be escaped: \|
- anything that's not a line break, at least one time: [^\n]+
  - between (), so it's captured too: second part of the result
- .* which is virtually "anything", any number of times
  - between (), so it's captured too: third part of the result
- and, finally, two } characters (escaped, too): \}\}
- and the end of the string: $
Note that the regex has the s (dotall) modifier, which lets . match line breaks as well; see Pattern Modifiers in the PHP manual for details.
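To see why the s modifier matters here, a small comparison may help. This is a sketch under the assumption that the input uses plain \n line breaks; the variable names are made up for the illustration.

```php
<?php
$str = "{{Title|Open\nBla-bla-bla\n}}";

// The same pattern body, once without and once with the s (dotall) modifier.
$body = '^\{\{([^|]+)\|([^\n]+)(.*)\}\}$';

// Without s, . cannot cross the line break after "Open", so nothing matches.
$withoutS = preg_match('/' . $body . '/', $str);

// With s, .* can span the remaining lines up to the closing braces.
$withS = preg_match('/' . $body . '/s', $str);

var_dump($withoutS, $withS);
```

preg_match returns 1 on a match and 0 otherwise, so only the second call should report a match for this input.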
$string = "{{Title|Open
Bla-bla-bla
}}";
preg_match('/^\{\{([^|]+)\|(.*?)[\r\n]+(.*?)\s*\}\}/', $string, $matches);
print_r($matches);
http://www.gskinner.com/RegExr/ is a useful place to play around with and learn regexes.
In Perl:
/\{\{ # literal opening braces
(.*?) # any characters except newline (lazy, i.e. as few as possible)
\| # literal pipe
(.*?) # same as 2 lines above
\n # new line
([\s\S]*?) # any character, including new line (lazy)
\}\}/x; # literal closing braces
Making a more precise solution depends on what exact rules you want for extraction of your fields.