Matching all excerpts which start and end with specific words
I have a text which looks like:
some non interesting part
trans-top
body of first excerpt
trans-bottom
next non interesting part
trans-top
body of second excerpt
trans-bottom
non interesting part
I want to extract all excerpts starting with trans-top and ending with trans-bottom into an array. I tried this:
match(/(?=trans-top)(.|\s)*/g)
to find strings which start with trans-top, and it works. Now I want to specify the end:
match(/(?=trans-top)(.|\s)*(?=trans-bottom)/g)
and it doesn't. Firebug gives me an error:
regular expression too complex
I tried many other ways, but I can't find a working solution... I'm sure I made some stupid mistake :(.
This works pretty well, but it's not all in one regex:
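var test = "some non interesting part\ntrans-top\nbody of first excerpt\ntrans-bottom\nnext non interesting part\ntrans-top\nbody of second excerpt\ntrans-bottom\nnon interesting part";
// The lazy [\s\S]*? stops at the first trans-bottom after each trans-top.
// match() returns null when nothing matches, so fall back to an empty array.
var matches = test.match(/(trans-top)([\s\S]*?)(trans-bottom)/gm) || [];
for (var i = 0; i < matches.length; i++) {
    // Strip the markers themselves, keeping only the text between them.
    matches[i] = matches[i].replace(/^trans-top|trans-bottom$/gm, '');
}
console.log(matches);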
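If you don't want the leading and trailing linebreaks, change the line inside the loop to:
matches[i] = matches[i].replace(/^trans-top\s*|\s*trans-bottom$/gm, '');
That eats the linebreaks (and any other whitespace) next to the markers. Using \s* rather than [\s\S] matters here: [\s\S] consumes exactly one character of any kind, so it would leave part of a \r\n linebreak behind.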
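For what it's worth, the part of the pattern doing the real work is the lazy quantifier *?. Here is a small comparison sketch (the sample string and variable names are made up for illustration) showing how a greedy [\s\S]* swallows everything up to the last trans-bottom, while the lazy form stops at the first one after each trans-top:

```javascript
// Illustrative sample input; not taken from the question verbatim.
var sample = "a\ntrans-top\nfirst\ntrans-bottom\nb\ntrans-top\nsecond\ntrans-bottom\nc";

// Greedy: [\s\S]* runs to the end of the input, then backtracks
// to the LAST trans-bottom, so both excerpts end up in one match.
var greedy = sample.match(/trans-top[\s\S]*trans-bottom/g);

// Lazy: [\s\S]*? expands one character at a time and stops at the
// FIRST trans-bottom after each trans-top, giving one match per excerpt.
var lazy = sample.match(/trans-top[\s\S]*?trans-bottom/g);

console.log(greedy.length); // 1
console.log(lazy.length);   // 2
```

So if you only ever get one huge match, a greedy quantifier is the first thing to check.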
This tested function uses a single regex and loops over the matches, pushing the captured contents of each one into an array, which it returns:
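function getParts(text) {
    var a = [];
    // Group 1 captures the text between the markers, without surrounding whitespace.
    var re = /trans-top\s*([\S\s]*?)\s*trans-bottom/g;
    var m = re.exec(text);
    // With the g flag, each exec() call resumes after the previous match.
    while (m !== null) {
        a.push(m[1]);
        m = re.exec(text);
    }
    return a;
}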
It also strips any leading and trailing whitespace around each match's contents.
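If your environment supports String.prototype.matchAll (ES2020, so not the older browsers this question was asked about), the same pattern can be used without the explicit exec() loop. This is only a sketch under that assumption; getPartsWithMatchAll and sampleText are hypothetical names chosen for the example:

```javascript
// Hypothetical helper: same regex as getParts, but using matchAll (ES2020).
function getPartsWithMatchAll(text) {
    var re = /trans-top\s*([\s\S]*?)\s*trans-bottom/g; // matchAll requires the g flag
    // matchAll returns an iterator of match objects; keep capture group 1 of each.
    return Array.from(text.matchAll(re), function (m) {
        return m[1];
    });
}

var sampleText = "some non interesting part\ntrans-top\nbody of first excerpt\ntrans-bottom\nnext non interesting part\ntrans-top\nbody of second excerpt\ntrans-bottom\nnon interesting part";
var parts = getPartsWithMatchAll(sampleText);
console.log(parts); // [ 'body of first excerpt', 'body of second excerpt' ]
```

When nothing matches, this returns an empty array, the same as getParts.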