Regular expression replacing only if contained within a regular expression match?
I have the following:
[list]
[*] test
[*] test
[*] test
[/list]
and I would like to create a regular expression that turns that into:
<ul>
<li>test</li>
<li>test</li>
<li>test</li>
</ul>
I know regex enough to replace simple tags, but in this case I need to replace li tags only if they are contained inside ul. Is there a way to check that before replacing?
I am using JavaScript if that matters.
Given the text:
[*] test1
[list]
[*] test2
[*] test3
[*] test4
[/list]
[*] test5
the regex:
\[\*]\s*([^\r\n]+)(?=((?!\[list])[\s\S])*\[/list])
matches only [*] test2, [*] test3 and [*] test4. But if [list] blocks can be nested, or a broader subset of a BB-like language needs to be parsed, I would opt for a proper parser.
To do the replacements, replace the regex I suggested with:
<li>$1</li>
and then replace [list] with <ul> and [/list] with </ul> (assuming [list] and [/list] are only used for lists and are not present in comments or string literals or something).
When running the following snippet:
var text = "[*] test1\n"+
"\n"+
"[list]\n"+
"[*] test2\n"+
"[*] test3\n"+
"[*] test4\n"+
"[/list]\n"+
"\n"+
"[*] test5\n"+
"\n"+
"[list]\n"+
"[*] test6\n"+
"[*] test7\n"+
"[/list]\n"+
"\n"+
"[*] test8";
print(text + "\n============================");
text = text.replace(/\[\*]\s*([^\r\n]+)(?=((?!\[list])[\s\S])*\[\/list])/g, "<li>$1</li>");
text = text.replace(/\[list]/g, "<ul>");
text = text.replace(/\[\/list]/g, "</ul>");
print(text);
the following is printed:
[*] test1
[list]
[*] test2
[*] test3
[*] test4
[/list]
[*] test5
[list]
[*] test6
[*] test7
[/list]
[*] test8
============================
[*] test1
<ul>
<li>test2</li>
<li>test3</li>
<li>test4</li>
</ul>
[*] test5
<ul>
<li>test6</li>
<li>test7</li>
</ul>
[*] test8
A small explanation might be in order:
- \[\*]\s* matches the substring [*] followed by zero or more whitespace characters;
- ([^\r\n]+) gobbles up the rest of the line and saves it in match group 1;
- (?=((?!\[list])[\s\S])*\[/list]) ensures that every match has a substring [/list] ahead of it, without encountering a [list] first.
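To see the lookahead at work without doing any replacing, the matches can be collected directly. This is only a small sketch: the variable names (sample, itemInList, captured) are mine, and the input is the example text from the top of this answer.

```javascript
// Sample input: the example text from the top of this answer.
var sample = "[*] test1\n[list]\n[*] test2\n[*] test3\n[*] test4\n[/list]\n[*] test5";

// Same pattern as above; the "/" is escaped because this is a regex literal.
var itemInList = /\[\*]\s*([^\r\n]+)(?=((?!\[list])[\s\S])*\[\/list])/g;

var captured = [];
var m;
while ((m = itemInList.exec(sample)) !== null) {
  captured.push(m[1]); // group 1: the rest of the line after "[*]"
}

console.log(captured); // [ 'test2', 'test3', 'test4' ]
```

test1 is skipped because a [list] comes before the next [/list], and test5 is skipped because no [/list] follows it at all.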
EDIT
Or better yet, do as Gumbo suggests in the comment to this answer: match all [list] ... [/list] blocks and then replace all [*] ... items in those.
Here’s a better approach to Bart K.’s suggestion:
- find all [list] … [/list]
- for each match, find all [*] in it
This will ensure that only [*] in [list] … [/list] will be replaced.
The code:
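// replace() returns a new string, so assign the result back to str
str = str.replace(/\[list]([\s\S]*?)\[\/list]/g, function($0, $1) {
    // $1 is the text between [list] and [/list]; convert each [*] line in it
    return "<ul>" + $1.replace(/^ *\[\*] *(.*)/gm, "<li>$1</li>") + "</ul>";
});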
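For anyone who wants to try it, here is a self-contained sketch of the same two-step idea. The function name convertLists and the sample input are made up for illustration; the two regexes are the ones from the code above.

```javascript
// Hypothetical sample input: items both outside and inside a [list] block.
var input = "[*] test1\n[list]\n[*] test2\n[*] test3\n[/list]\n[*] test4";

// Wraps the two-step replacement in a function (the name is illustrative).
function convertLists(str) {
  // Step 1: find each [list] ... [/list] block (non-greedy, so blocks stay separate).
  return str.replace(/\[list]([\s\S]*?)\[\/list]/g, function (whole, inner) {
    // Step 2: inside the block only, turn each "[*] text" line into <li>text</li>.
    return "<ul>" + inner.replace(/^ *\[\*] *(.*)/gm, "<li>$1</li>") + "</ul>";
  });
}

var html = convertLists(input);
console.log(html);
// [*] test1
// <ul>
// <li>test2</li>
// <li>test3</li>
// </ul>
// [*] test4
```

Because the inner replace only ever sees the text between [list] and [/list], the [*] items outside a list are left alone. Like the lookahead version, this does not handle nested [list] blocks.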