C#: How to find tags with regex and collect a List<string>
I have a text string like this:
<dt>
<span>
<tag:text name="fee" />
</span>
</dt>
...
<tag:text name="amount" />
What I want is to find all tags of type
<tag:text
and then collect a List with the values from the name attribute:
"fee"
"amount"
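Use a lookbehind and a lookahead around the attribute value. [^"]* (rather than .*) stops at the first closing quote, so two tags on the same line are still matched separately:
(?<=tag:text name=")[^"]*(?=")
Test with grep:
kent$ echo '<dt>
<span>
<tag:text name="fee" />
</span>
</dt>
...
<tag:text name="amount" /> '|grep -Po '(?<=tag:text name=")[^"]*(?=")'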
fee
amount
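Since the question asks for C#, here is a minimal sketch of the same lookaround idea with Regex.Matches, collecting the values into a List<string>. The class and method names (TagNameFinder, FindNames) are made up for illustration, and the pattern uses [^"]* instead of .* so that each match stops at the first closing quote.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagNameFinder
{
    // Hypothetical helper: returns the value of name="..." for every <tag:text name="..."> in the input.
    public static List<string> FindNames(string input)
    {
        // Lookbehind anchors on the tag prefix, lookahead on the closing quote;
        // only the attribute value itself ends up in Match.Value.
        return Regex.Matches(input, "(?<=<tag:text name=\")[^\"]*(?=\")")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        string text = "<dt>\n<span>\n<tag:text name=\"fee\" />\n</span>\n</dt>\n...\n<tag:text name=\"amount\" />";
        foreach (string name in FindNames(text))
        {
            Console.WriteLine(name); // prints fee, then amount
        }
    }
}
```

This assumes name is the first attribute and is separated from the tag name by exactly one space, as in the sample; anything looser needs a more forgiving pattern or a real parser.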
Here's the regex you (EDIT: probably don't) want. It matches a paired opening and closing tag; just make sure to turn off case sensitivity:
<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>
Then just use the System.Text.RegularExpressions.Regex class to perform the regex operation:
MatchCollection mc = Regex.Matches(str, @"<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>", RegexOptions.IgnoreCase);
See here for more info: http://msdn.microsoft.com/en-us/library/b49yw9s8.aspx
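To show what comes back from that call, here is a small sketch that walks the MatchCollection and reads the two capture groups (tag name and inner text). The sample input and the names PairedTagDemo and Describe are invented for the example; they are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PairedTagDemo
{
    // Hypothetical helper: formats each paired-tag match as "tagName: innerText".
    public static List<string> Describe(string str)
    {
        var result = new List<string>();
        MatchCollection mc = Regex.Matches(str, @"<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>", RegexOptions.IgnoreCase);
        foreach (Match m in mc)
        {
            // Groups[1] is the tag name, Groups[2] is the text between the opening and closing tag.
            result.Add(m.Groups[1].Value + ": " + m.Groups[2].Value);
        }
        return result;
    }

    public static void Main()
    {
        // Made-up input with ordinary paired tags.
        foreach (string line in Describe("<b>bold</b> and <i>italic</i>"))
        {
            Console.WriteLine(line); // prints "b: bold", then "i: italic"
        }
    }
}
```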
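Note: You will need to tweak the regex to work with your text. The tag-name group [A-Z][A-Z0-9]* stops at the colon in tag:text, and the pattern requires a matching closing tag, so self-closing tags such as <tag:text name="fee" /> are never matched.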
Note #2: As mentioned above, if this is XML then you should parse it using the appropriate XML classes from the System.Xml namespace.
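For the XML route, a rough sketch using LINQ to XML (System.Xml.Linq) might look like the following. It rests on several assumptions: the fragment is well-formed XML apart from lacking a single root, the tag prefix is not declared anywhere in the fragment, and the namespace URI urn:example:tag is a placeholder because the question never shows the real declaration. The wrapper element root and the names XmlTagNames and FindNames are also invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlTagNames
{
    // Placeholder namespace URI (assumption): substitute whatever the "tag" prefix is really bound to.
    private static readonly XNamespace Tag = "urn:example:tag";

    public static List<string> FindNames(string fragment)
    {
        // Wrap the fragment in a hypothetical root element so the document has
        // one root and the "tag" prefix is declared before it is used.
        string wrapped = "<root xmlns:tag=\"" + Tag.NamespaceName + "\">" + fragment + "</root>";

        return XDocument.Parse(wrapped)
            .Descendants(Tag + "text")                 // every <tag:text> element at any depth
            .Select(e => (string)e.Attribute("name"))  // null when the attribute is missing
            .Where(name => name != null)
            .ToList();
    }

    public static void Main()
    {
        string fragment = "<dt><span><tag:text name=\"fee\" /></span></dt>...<tag:text name=\"amount\" />";
        Console.WriteLine(string.Join(", ", FindNames(fragment))); // prints "fee, amount"
    }
}
```

Unlike the regex, this does not care about attribute order or spacing, but XDocument.Parse throws on input that is not well-formed, which is common with real-world HTML.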