Parsing XML using regex and grabbing the value in between tags

I have a regular expression that I use to grab the data between a pair of tags, for example <CLASSCOD>70</CLASSCOD>. The regular expression I use is (?<=<CLASSCOD>)(?:[^<]|<(?!/CLASSCOD))*, which works in most cases, but when I have a single-character value such as <CLASSCOD>N</CLASSCOD> it says there are no matches.

The whole data string looks like this:

<DATE>0601</DATE>
<YEAR>11</YEAR>
<AGENCY>Department of the Interior</AGENCY>
<OFFICE>Bureau of Indian Affairs</OFFICE>
<LOCATION>BIA - DAPM</LOCATION>
<ZIP>85004</ZIP>
<CLASSCOD>N</CLASSCOD>
<OFFADD>Contracting Office - Western Region 2600 N. Central Avenue, 4th Floor Phoenix AZ 85004</OFFADD>
<SUBJECT>Boiler Replacement</SUBJECT>
<SOLNBR>A11PS00463</SOLNBR>
<RESPDATE>061711</RESPDATE>
<ARCHDATE>05312012</ARCHDATE>
<CONTACT>Geraldine M. Williams Purchasing Agent 6023794087 geraldine.williams@bia.gov;<a href="mailto:EC_helpdesk@NBC.GOV">Point of Contact above, or if none listed, contact the IDEAS EC HELP DESK for assistance</a>
</CONTACT>
<LINK><URL>https://www.fbo.gov/spg/DOI/BIA/RestonVA/A11PS00463/listing.html<LINKDESC>Link To Document</LINK>
<EMAIL></EMAIL>
<EMAIL>
  EC_helpdesk@NBC.GOV
  <EMAILDESC>
    Point of Contact above, or if none listed, contact the IDEAS EC HELP DESK for assistance
  </EMAILDESC>
</EMAIL>
<SETASIDE>Total Small Business</SETASIDE>
<POPCOUNTRY>USA</POPCOUNTRY>
<POPZIP>85634</POPZIP>
<POPADDRESS>BIE Tohono O'odham High School, Sells, AZ</POPADDRESS>

Any suggestions as to the reason?

Thanks


Something simpler should work:

<CLASSCOD>(.+?)</CLASSCOD>

Example:

Match match = Regex.Match(input, @"<CLASSCOD>(.+?)</CLASSCOD>");
if (match.Success) {
    string value = match.Groups[1].Value;
    Console.WriteLine(value);
}
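
Note that .+? needs at least one character and, by default, does not match across line breaks, so it would skip an empty element such as <EMAIL></EMAIL> or a value spread over several lines. A minimal sketch of a more tolerant variant follows; the TagReader class and GetTagValue helper are hypothetical names made up for illustration, and trimming the result is an assumption about what the caller wants:

```csharp
using System.Text.RegularExpressions;

public static class TagReader
{
    // Hypothetical helper: returns the trimmed text between <tag> and </tag>,
    // or null when the tag pair is not present in the input.
    public static string GetTagValue(string input, string tag)
    {
        // Escape the tag name so it is always treated as literal text.
        string name = Regex.Escape(tag);

        // .*? also accepts an empty value; Singleline lets '.' match newlines.
        Match match = Regex.Match(
            input,
            "<" + name + ">(.*?)</" + name + ">",
            RegexOptions.Singleline);

        return match.Success ? match.Groups[1].Value.Trim() : null;
    }
}
```

With the data above, TagReader.GetTagValue(input, "CLASSCOD") would be called the same way for any other tag name.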


If you would like to extract both the tag name and the value between the tags, you may use the following regex:

<([^>]+)>([^<]*)</\1>

For this scenario there is no need to use the lookahead and lookbehind operators.
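
As a rough illustration, the pattern above can be applied with Regex.Matches to collect every simple tag/value pair at once. This is only a sketch: the TagExtractor class and Extract method are hypothetical names, and because [^<]* stops at the first <, elements that contain nested tags (such as <CONTACT> or <LINK> in the sample data) are not captured this way.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagExtractor
{
    // Hypothetical helper: maps each tag name to its text for every
    // simple <TAG>text</TAG> pair found in the input.
    public static Dictionary<string, string> Extract(string input)
    {
        var result = new Dictionary<string, string>();

        // Group 1 is the tag name, group 2 the value; \1 requires the
        // closing tag to repeat the same name as the opening tag.
        foreach (Match m in Regex.Matches(input, @"<([^>]+)>([^<]*)</\1>"))
        {
            // If a tag occurs more than once, the last value wins.
            result[m.Groups[1].Value] = m.Groups[2].Value;
        }

        return result;
    }
}
```

Calling TagExtractor.Extract(input)["CLASSCOD"] on the sample data would then give the class code, provided the tag is present.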
