Regexp: remove all tags from string except one kind of tags
I have a string like this:
<p>test <span class=\"match\">match</span> <span class=\"testtes\">dddddd</span></p>
I want to get the string without tags, but keep the highlighting span with class "match":
test <span class=\"match\">match</span> dddddd
If I just wanted to remove all tags, I would replace every substring matching the regexp /<\/?[^>]*>/
with an empty string. But what regexp should I use in my special case?
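For reference, here is a minimal Python sketch of that "remove everything" baseline. The function name strip_all_tags and the variable sample are made up for illustration, and the sample is assumed to contain plain double quotes rather than the escaped \" shown above.

```python
import re

# Assumed input: the string from the question, with plain (unescaped) quotes.
sample = '<p>test <span class="match">match</span> <span class="testtes">dddddd</span></p>'

def strip_all_tags(html):
    # Replace every opening or closing tag with an empty string.
    return re.sub(r'</?[^>]*>', '', html)

print(strip_all_tags(sample))
```

This drops the "match" span too, which is exactly the part the question wants to keep.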
UPD: The algorithm is: if you see <span class=\"match\">, then some text without tags, and then </span>, you shouldn't remove that span; otherwise you should remove all tags.
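Under the rule in the update (the kept span contains only text, no nested tags), one possible sketch is a single alternation: match either a whole flat "match" span or any tag, and put back only the former. This assumes the attribute is written exactly as class="match" with plain quotes; the function name strip_tags_keep_flat_match is hypothetical.

```python
import re

# Group 1: a complete "match" span whose content has no tags inside.
# Second alternative: any other opening or closing tag.
PATTERN = re.compile(r'(<span class="match">[^<]*</span>)|</?[^>]*>')

def strip_tags_keep_flat_match(html):
    # Keep the text of group 1 when it matched, otherwise delete the tag.
    return PATTERN.sub(lambda m: m.group(1) or '', html)

sample = '<p>test <span class="match">match</span> <span class="testtes">dddddd</span></p>'
print(strip_tags_keep_flat_match(sample))
```

If the "match" span contains another tag, the first alternative fails and the span's tags are stripped like all the others, which is what the update asks for. It does not handle other attribute orders, single quotes, or extra classes.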
You could do something like this:
<\/?(?![^>]*class=\\"match)[^>]*>
This would preserve the opening tag and result in this:
test <span class=\"match\">match dddddd
See it here on Regexr
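The same behaviour can be reproduced in Python. This is only a sketch: it assumes the input uses plain quotes, so the \\" in the pattern above becomes a plain ", and the function name strip_except_opening_match is invented for the example.

```python
import re

# A tag is removed unless 'class="match' appears before its closing '>'.
PATTERN = re.compile(r'</?(?![^>]*class="match)[^>]*>')

def strip_except_opening_match(html):
    return PATTERN.sub('', html)

sample = '<p>test <span class="match">match</span> <span class="testtes">dddddd</span></p>'
print(strip_except_opening_match(sample))
```

The negative lookahead only protects the opening tag; every </span> looks the same to the pattern, so all of them are removed.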
But how should I find the matching closing tag?
<p>test <span class=\"match\">match</span> <span class=\"testtes\">dddddd</span></p>
^^^^^^^ or the next one? ^^^^^^^
Regex can't know which closing tag belongs to the opening <span>
tag that contains that class, and it has no way to find matching closing tags. So it's not a good idea to do this using regex.
I am quite sure the language you are using has an HTML parser that can be used for this task.
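As an illustration of the parser approach, here is a sketch using Python's standard html.parser module (Python is an assumption here, since the question does not name a language). The class KeepMatchSpans and the function strip_tags_keep_match are hypothetical names. A stack of open spans lets each </span> be paired with its own opening tag, which is the part a regex cannot do.

```python
from html import escape
from html.parser import HTMLParser

class KeepMatchSpans(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []         # pieces of the output string
        self.span_stack = []  # one bool per open <span>: True if it is kept

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return  # every non-span tag is dropped
        classes = (dict(attrs).get("class") or "").split()
        keep = "match" in classes
        self.span_stack.append(keep)
        if keep:
            self.out.append('<span class="match">')

    def handle_endtag(self, tag):
        # Emit </span> only if the span being closed was a kept one.
        if tag == "span" and self.span_stack and self.span_stack.pop():
            self.out.append("</span>")

    def handle_data(self, data):
        # Text arrives unescaped, so re-escape it for the output.
        self.out.append(escape(data, quote=False))

def strip_tags_keep_match(html):
    parser = KeepMatchSpans()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

sample = '<p>test <span class="match">match</span> <span class="testtes">dddddd</span></p>'
print(strip_tags_keep_match(sample))
```

Unlike the flat-span rule from the update, this version keeps a "match" span even when it has other tags nested inside it, and strips only the nested tags.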