Ruby split with regex - regex isn't doing what I want
I have this string:
string = "<p>para1</p><p>para2</p><p>para3</p>"
I want to split on the para2 text, so that I get this:
["<p>para1</p>", "<p>para3</p>"]
The catch is that sometimes para2 might not be wrapped in p tags (and there might be optional spaces outside the p and inside it). I thought that this would do it:
string.split(/\s*(<p>)?\s*para2\s*(<\/p>)?\s*/)
but I get this:
["<p>para1</p>", "<p>", "</p>", "<p>para3</p>"]
It's not pulling the start and end p tags into the matched pattern; they should be eliminated as part of the split. Ruby's regular expressions are greedy by default, so I thought they would get pulled in. This seems to be confirmed if I do a gsub instead of a split:
string.gsub(/\s*(<p>)?\s*para2\s*(<\/p>)?\s*/, "XXX")
=> "<p>para1</p>XXX<p>para3</p>"
The tags are pulled in and removed here, but not in the split. Any ideas, anyone?
thanks, max
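Replace your capturing groups (…) with non-capturing groups (?:…). When the pattern contains capturing groups, String#split also returns the text matched by each group in the result array, which is where the stray "<p>" and "</p>" elements come from. The tags are being matched; they are just handed back to you as captures: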
/\s*(?:<p>)?\s*para2\s*(?:<\/p>)?\s*/
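To see the difference side by side, here is a short sketch that runs both patterns against the string from the question. The variable names (`with_captures`, `without_captures`, `bare`) and the unwrapped `bare` input are made up for illustration; the latter assumes para2 appears without p tags and with surrounding spaces, as the question describes.

```ruby
string = "<p>para1</p><p>para2</p><p>para3</p>"

# Capturing groups: split adds each group's match to the result array.
with_captures = string.split(/\s*(<p>)?\s*para2\s*(<\/p>)?\s*/)
p with_captures
# => ["<p>para1</p>", "<p>", "</p>", "<p>para3</p>"]

# Non-capturing groups: the tags are consumed as part of the delimiter only.
without_captures = string.split(/\s*(?:<p>)?\s*para2\s*(?:<\/p>)?\s*/)
p without_captures
# => ["<p>para1</p>", "<p>para3</p>"]

# Hypothetical input where para2 is not wrapped in p tags.
bare = "<p>para1</p> para2 <p>para3</p>"
bare_result = bare.split(/\s*(?:<p>)?\s*para2\s*(?:<\/p>)?\s*/)
p bare_result
# => ["<p>para1</p>", "<p>para3</p>"]
```

gsub never showed the problem because it only replaces the whole match and ignores the groups unless you reference them in the replacement.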