How can I extract multiple lines with regex in Java?
Suppose I have a block of text, say HTML (though it doesn't have to be):
</TD>
<TD CLASS='statusEven'><TABLE BORDER=0 WIDTH='100%' CELLSPACING=0 CELLPADDING=0><TR><TD ALIGN=LEFT><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
<TR>
<TD ALIGN=LEFT valign=center CLASS='statusEven'><A HREF='extinfo.cgi? type=2&host=localhost&service=Current+Load'>Current Load</A></TD></TR>
</TABLE>
</TD>
<TD ALIGN=RIGHT CLASS='statusEven'>
<TABLE BORDER=0 cellspacing=0 cellpadding=0>
<TR>
</TR>
</TABLE>
</TD>
</TR></TABLE></TD>
<TD CLASS='statusOK'>OK</TD>
<TD CLASS='statusEven' nowrap>08-04-2011 22:07:00</TD>
<TD CLASS='statusEven' nowrap>28d 13h 18m 11s</TD>
<TD CLASS='statusEven'>1/1</TD>
<TD CLASS='statusEven' valign='center'>OK - load average: 0.01, 0.04, 0.05 </TD>
I want to grab everything between two markers, and the result will probably span multiple lines. How would I do that?
Here's what I have so far:
Pattern p = Pattern.compile("extinfo(.*)load average");
Matcher m = p.matcher(this.resultHTML);
if(m.find())
{
return m.group(1);
}
Use the (?s) switch:
Pattern p = Pattern.compile("(?s)extinfo(.*?)load average");
This switch turns on "dot matches newline" mode for the remainder of the regular expression, which essentially means the whole input is treated as one line (newlines are just another character).
Without this switch, the dot does not match line terminators, so .* can't match across a newline boundary.
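To see the difference, here is a minimal sketch. The class name DotallDemo, the helper matches, and the shortened input string are made up for illustration and are not from the question. It also shows Pattern.DOTALL, the compile flag that has the same effect as the inline (?s) switch.

```java
import java.util.regex.Pattern;

public class DotallDemo {
    // Hypothetical helper: true if the pattern finds a match anywhere in the input.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the HTML in the question; the markers sit on different lines.
        String input = "extinfo.cgi\nsecond line\nOK - load average: 0.01";

        // No switch: '.' stops at the newline, so the two markers can't be bridged.
        Pattern plain = Pattern.compile("extinfo(.*)load average");
        // Inline switch, as in the answer.
        Pattern inline = Pattern.compile("(?s)extinfo(.*)load average");
        // The same mode requested through a compile flag instead.
        Pattern flagged = Pattern.compile("extinfo(.*)load average", Pattern.DOTALL);

        System.out.println(matches(plain, input));   // false
        System.out.println(matches(inline, input));  // true
        System.out.println(matches(flagged, input)); // true
    }
}
```

Whether you use the inline switch or the flag is mostly a matter of taste; the inline form keeps the behavior visible in the regex string itself.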
Also, your regex was "greedy", so I added ? to the capture to make it "not greedy" (also called reluctant), which means it captures just enough to make the match, but no more.
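The greedy versus not-greedy difference only shows up when the end marker occurs more than once. A rough sketch, assuming an input with two "load average" markers (the class GreedyDemo, the helper firstGroup, and the input are hypothetical, not from the question):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up input in which the end marker appears twice.
        String input = "extinfo A\nload average 1\nextinfo B\nload average 2";

        // Greedy: runs on to the LAST "load average", swallowing the first one.
        System.out.println(firstGroup("(?s)extinfo(.*)load average", input));
        // Not greedy: stops at the FIRST "load average" after "extinfo".
        System.out.println(firstGroup("(?s)extinfo(.*?)load average", input));
    }
}
```

With the greedy form the captured group is " A\nload average 1\nextinfo B\n"; with the not-greedy form it is just " A\n".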