
How can I extract multiple lines with regex in Java?

Say I have a bunch of text, let's say HTML, but it doesn't have to be:

</TD> 
<TD CLASS='statusEven'><TABLE BORDER=0 WIDTH='100%' CELLSPACING=0 CELLPADDING=0><TR><TD         ALIGN=LEFT><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0> 
<TR> 
<TD ALIGN=LEFT valign=center CLASS='statusEven'><A HREF='extinfo.cgi?    type=2&host=localhost&service=Current+Load'>Current Load</A></TD></TR> 
</TABLE> 
</TD> 
<TD ALIGN=RIGHT CLASS='statusEven'> 
<TABLE BORDER=0 cellspacing=0 cellpadding=0> 
<TR> 
</TR> 
</TABLE> 
</TD> 
</TR></TABLE></TD> 
<TD CLASS='statusOK'>OK</TD> 
<TD CLASS='statusEven' nowrap>08-04-2011 22:07:00</TD> 
<TD CLASS='statusEven' nowrap>28d 13h 18m 11s</TD> 
<TD CLASS='statusEven'>1/1</TD> 
<TD CLASS='statusEven' valign='center'>OK &#45; load average&#58; 0&#46;01&#44; 0&#46;04&#44; 0&#46;05&nbsp;</TD> 

and I want to grab everything between two markers, where the result will probably span multiple lines. How would I do that?

Here's what I have so far....

    Pattern p = Pattern.compile("extinfo(.*)load average");
    Matcher m = p.matcher(this.resultHTML);

    if(m.find())
    {
         return m.group(1);
    }


Use the (?s) switch:

    Pattern p = Pattern.compile("(?s)extinfo(.*?)load average");

This switch turns on "dot matches newline" mode (DOTALL) for the remainder of the regular expression, which essentially means the whole input is treated as one line (newlines are just another character).

Without this switch, the dot does not match line terminators, so .* cannot match across a newline boundary.
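To see the difference, here is a minimal sketch. The class name DotAllDemo, the helper firstGroup, and the two-line input are made up for illustration; Pattern.DOTALL is the compile-time flag that corresponds to the inline (?s) switch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    public static String firstGroup(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up input standing in for the HTML: the two markers sit on different lines.
        String input = "extinfo line one\nline two load average";

        // No switch: the dot stops at the newline, so there is no match.
        System.out.println(firstGroup("extinfo(.*?)load average", 0, input) == null);

        // Inline (?s) switch: the dot also matches the newline.
        String inline = firstGroup("(?s)extinfo(.*?)load average", 0, input);

        // Same effect using the Pattern.DOTALL flag instead of the inline switch.
        String flagged = firstGroup("extinfo(.*?)load average", Pattern.DOTALL, input);

        System.out.println(inline.equals(flagged));
    }
}
```

Both forms should behave the same; the inline switch is handy when you can only supply the pattern string, and the flag keeps the pattern itself a little easier to read.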

Also, your regex was "greedy", so I added ? after the * inside the capture group to make it "not greedy" (reluctant), which means it will capture just enough to make the match, but no more.
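The greedy/reluctant difference only shows up when the end marker occurs more than once. A small sketch, assuming a made-up input in which "load average" appears twice (the class name GreedyDemo and its helpers are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    public static String extract(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Makes newlines visible when printing.
    public static String show(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Made-up input: the end marker "load average" occurs twice.
        String input = "extinfo A\nload average B\nload average C";

        // Greedy: .* runs to the end, then backs up to the LAST "load average".
        System.out.println("[" + show(extract("(?s)extinfo(.*)load average", input)) + "]");
        // prints [ A\nload average B\n]

        // Reluctant: .*? grows one character at a time and stops at the FIRST "load average".
        System.out.println("[" + show(extract("(?s)extinfo(.*?)load average", input)) + "]");
        // prints [ A\n]
    }
}
```

If the end marker appears only once in your input, both versions capture the same text, but the reluctant form is usually the safer choice for "everything between two markers".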
