Parse an HTML page passed in as a String? (String to XML)
I receive the following as a String in my prepareXml method:
<TBODY>
<TR>
<TD colSpan=4>Detail of your Trip</TD></TR>
<TR></TR>
<TR>
<TD colSpan=4>Booking Ref. : XXX</TD></TR>
<TR></TR>
<TR>
<TD>Client</TD>
<TD colSpan=2>Ticket Number</TD>
<TD>FOID</TD></TR>
<TR>
<TD>Person (ADT)</TD>
<TD colSpan=2>000000</TD>
<TD>XXXX</TD></TR>
<TR></TR>
<TR>
<TD>From: Location 1</TD>
<TD>To : Location 2</TD>
<TD colSpan=2>Flight : LLL</TD></TR>
<TR>
<TD colSpan=2></TD>
<TD colSpan=2>Departure : 14Aug, 15:55 Latest check-in time limit : 15:25 </TD></TR>
<TR>
<TD colSpan=2></TD>
<TD colSpan=2>Arrival : 17:25</TD></TR>
<TR>
<TD colSpan=2></TD>
<TD colSpan=2>Class N</TD></TR>
<TR>
<TD>From : Location 2</TD>
<TD>To :Location1</TD>
<TD colSpan=2>Flight : AF2585 Resa : OK</TD></TR>
<TR>
<TD colSpan=2></TD>
<TD colSpan=2>Departure : "Time" Latest check-in time limit : "Time" </TD></TR>
<TR>
<TD colSpan=2></TD>
<TR>
<TD colSpan=2></TD>
Class N
I have this as a String, and I need to parse it and send it on as XML.
I want to get the flight number, the ticket number, and the departure and arrival locations, and also check whether the trip is one-way or a return.
How can I do that? Since the input is really big, what is the best way to parse it?
Any help is appreciated.
You can parse the HTML with, for example, NekoHTML. Neko
is an open-source HTML parser and tag balancer that lets you use regular XML operations to traverse the document and extract information from it. For example:
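import java.io.StringReader;
import org.cyberneko.html.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
String html = ...
DOMParser parser = new DOMParser();
parser.parse(new InputSource(new StringReader(html))); // a Reader avoids the platform-charset guess made by html.getBytes()
Document doc = parser.getDocument(); // standard org.w3c.dom.Document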
At this stage you could also hook it up to an XPath
engine such as Jaxen to extract the desired information more conveniently.
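To give an idea of what the extraction step might look like, here is a minimal sketch. It uses the JDK's built-in javax.xml.xpath instead of Jaxen, purely so that it runs without extra jars. The class name TripCells and the method cellsStartingWith are made up for illustration. The upper-case //TD expression assumes the parser reports element names in upper case, which I believe is Neko's default; adjust it if your configuration differs. The Document in main is a small well-formed stand-in for what parser.getDocument() would return, and counting the "Flight" cells to tell one-way from return is only a guess at how your pages are laid out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of NekoHTML or Jaxen.
class TripCells {

    /** Returns the trimmed text of every TD cell whose text starts with the given label. */
    static List<String> cellsStartingWith(Document doc, String label)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select every table cell; the label filter is done in Java to avoid
        // having to quote the label inside the XPath expression.
        NodeList cells = (NodeList) xpath.evaluate("//TD", doc, XPathConstants.NODESET);
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < cells.getLength(); i++) {
            String text = cells.item(i).getTextContent().trim();
            if (text.startsWith(label)) {
                matches.add(text);
            }
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the Document produced by the HTML parser: a tiny,
        // well-formed fragment parsed with the JDK's own XML parser.
        String xml = "<TABLE>"
                + "<TR><TD>From: Location 1</TD><TD>To : Location 2</TD>"
                + "<TD>Flight : LLL</TD></TR>"
                + "<TR><TD>From : Location 2</TD><TD>To :Location1</TD>"
                + "<TD>Flight : AF2585 Resa : OK</TD></TR>"
                + "</TABLE>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<String> flights = cellsStartingWith(doc, "Flight");
        System.out.println(flights);
        // Assumption: one "Flight" cell per leg, so more than one leg means a return trip.
        System.out.println(flights.size() > 1 ? "return trip" : "one way");
    }
}
```

The same method can be called with "From", "To" or "Departure" to pull out the other cells. Splitting each cell's text on the colon to get the actual value is left out here because the spacing around the colons is not consistent in your sample.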