Trying to pick out a specific part of a string with regex
I've tried and tried again to find a regex for this pattern. I have a string like this picked from HTML source.
<!-- TAG=Something / Something else -->
And sometimes it's just:
<!-- TAG=Something -->
In both cases I want the regex to just match "Something", i.e. everything between TAG= and an optional /.
My first attempt was:
TAG=(.*)[/]?(.*) -->
But the first parenthesis matches everything between TAG= and --> no matter what. So what is the correct way here?
Try this:
TAG=([^/]*)(?:/(.*))?-->
Group 1 will contain "Something" (plus the trailing space before the / or the -->, so trim it).
Group 2 will contain "Something else" (with its surrounding spaces) or null.
Test it.
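One way to try the pattern is a short Python sketch. Python is an assumption here, since the question does not name a language, and the helper name extract_tag is made up for illustration. Because [^/]* also captures the space in front of the slash, the sketch strips whitespace from both groups.

```python
import re

# Pattern from the answer above: group 1 is everything up to the first "/",
# group 2 is the optional remainder after the slash.
PATTERN = re.compile(r"TAG=([^/]*)(?:/(.*))?-->")


def extract_tag(comment):
    """Return (first, second) from a TAG comment; second is None if absent.

    extract_tag is a hypothetical helper, not part of any library.
    """
    m = PATTERN.search(comment)
    if m is None:
        return None
    first = m.group(1).strip()
    # Group 2 did not participate in the match when there is no slash.
    second = m.group(2).strip() if m.group(2) is not None else None
    return first, second


print(extract_tag("<!-- TAG=Something / Something else -->"))
# ('Something', 'Something else')
print(extract_tag("<!-- TAG=Something -->"))
# ('Something', None)
```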
<!--.*?=(.*?)(-->|/)
It matches everything you need.
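A minimal Python sketch of this pattern, again assuming Python and using a made-up helper name (first_value). Two things are worth noticing: the captured group keeps the trailing space, and the pattern accepts any key before the = sign, not only TAG.

```python
import re

# Pattern from the answer above: lazily skip to the first "=", then lazily
# capture up to the first "/" or the closing "-->".
PATTERN = re.compile(r"<!--.*?=(.*?)(-->|/)")


def first_value(comment):
    """Return the text after "=" up to "/" or "-->", whitespace-trimmed.

    first_value is a hypothetical helper, not part of any library.
    """
    m = PATTERN.search(comment)
    return m.group(1).strip() if m else None


print(first_value("<!-- TAG=Something / Something else -->"))  # Something
print(first_value("<!-- TAG=Something -->"))                   # Something
print(first_value("<!-- OTHER=x -->"))                         # x (key is not checked)
```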
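Use a non-greedy quantifier (*? instead of *), and make sure that whatever follows the group cannot match arbitrary text, otherwise the lazy group is free to capture nothing at all:
TAG=(.*?)\s*(?:/.*)?-->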
Also, your usage of [/] seems unusual: you don't need a character class to write a single character. The most likely explanation is that you are using / as the regular expression delimiter, which makes / a special character that has to be escaped. In many (not all) regex dialects it is possible to solve this by using a different delimiter, such as #, so that the slashes no longer need escaping.
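The behaviour of a lazy group depends on what comes after it, which a small Python sketch can make visible. Python is an assumption here (it has no regex delimiters, so / needs no escaping), and the names loose, anchored and capture are made up for illustration. The first pattern is the question's attempt with only a ? added: everything after the group can match from the first character on, so the group stays empty. In the second pattern the group must be followed by optional whitespace and then either a slash or the closing -->, so it has to grow up to that point.

```python
import re

samples = [
    "<!-- TAG=Something / Something else -->",
    "<!-- TAG=Something -->",
]

# Lazy group followed by an optional slash and an unrestricted ".*":
# the rest of the pattern can absorb the whole value, so group 1 stays empty.
loose = re.compile(r"TAG=(.*?)[/]?.* -->")

# Lazy group followed by optional whitespace, then either "/..." or "-->":
# the group must extend until one of those can match.
anchored = re.compile(r"TAG=(.*?)\s*(?:/.*)?-->")


def capture(pattern, text):
    """Return group 1 of the first match, or None (hypothetical helper)."""
    m = pattern.search(text)
    return m.group(1) if m else None


for s in samples:
    print(repr(capture(loose, s)), repr(capture(anchored, s)))
# '' 'Something'
# '' 'Something'
```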