Strip all attributes from an HTML tag and only return the tag name using regex
As the title says, how can I achieve the following in Ruby by using regex or some other Ruby magic?
Input
<a href="#" class="css-class">Link</a>
<img src="image.jpg" />
Desired Output
a
img
Thanks in advance
I don't know how regex matching is handled in Ruby, but I'm pretty sure that you can retrieve groups out of the regex.
For your case the regex:
<([^\s]*).*(</.*>|/>)
should do the trick.
After applying it to your input string, group #1 of every match will contain only the tag name.
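For what it's worth, here is a minimal Ruby sketch of how that pattern could be applied, assuming the input is one multi-line string. The variable names (html, pattern, tag_names) are just for illustration and not from the question.

```ruby
# Sample input from the question (single-quoted heredoc, so no interpolation).
html = <<~'HTML'
  <a href="#" class="css-class">Link</a>
  <img src="image.jpg" />
HTML

# The pattern from above, written with %r{} so the slashes need no escaping.
pattern = %r{<([^\s]*).*(</.*>|/>)}

# String#scan returns one [group1, group2] array per match; keep group 1.
tag_names = html.scan(pattern).map(&:first)

puts tag_names
```

One caveat: [^\s]* only stops at whitespace, so for a tag without attributes such as <p>text</p> group #1 comes back as "p>text" rather than "p". A narrower class like [^\s>/]* would probably be safer there.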
I agree with Tomalak, but if you still want to go with the regex approach, you could use something like the following:
\<(?<tag>[^ ]+)[^\>/]*(\>[^\<]*</\k<tag>\>|/\>)
I tested it only with the C# regex engine; I hope it works for Ruby, too.
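A rough Ruby version of the same idea, as a sketch rather than a verified port: the backslashes before < and > are dropped since they are not needed, and the second group is marked non-capturing. The variable names are made up for the example.

```ruby
# Sample input from the question.
html = <<~'HTML'
  <a href="#" class="css-class">Link</a>
  <img src="image.jpg" />
HTML

# Same pattern as above, minus the optional backslashes before < and >.
# The alternation is written as (?:...) because Ruby does not capture
# unnamed groups once a named group is present anyway.
named_pattern = %r{<(?<tag>[^ ]+)[^>/]*(?:>[^<]*</\k<tag>>|/>)}

# With a single named group, String#scan yields one [tag] array per match.
named_tags = html.scan(named_pattern).flatten

puts named_tags
```

Note that [^>/]* refuses any slash inside the attributes, so a tag like <a href="/x">L</a> is not matched at all. That is fine for the sample input, but real markup with URLs would need a different approach.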