Parsing an HTML page that contains markup errors
I want to parse the page at http://dizli.com/dizli/db.html using PHP.
But when I wrote this code:
$url = "http://dizli.com/dizli/db.html";
$dom = new DOMDocument();
$html = $dom->loadHTMLFile($url);
$dom->preserveWhiteSpace = false;
$tables = $dom->getElementsByTagName('table');
$tr = $tables->item(2)->getElementsByTagName('tr');
$rows = $tables->item(0)->getElementsByTagName('td');
foreach ($rows as $row) {
    $movie = $row->getElementsByTagName('b');
    echo $movie;
}
I got a bunch of errors:
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and td in http://dizli.com/dizli/db.html, line: 54 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and b in http://dizli.com/dizli/db.html, line: 81 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and b in http://dizli.com/dizli/db.html, line: 106 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: htmlParseEntityRef: no name in http://dizli.com/dizli/db.html, line: 115 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and b in http://dizli.com/dizli/db.html, line: 126 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and font in http://dizli.com/dizli/db.html, line: 126 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and b in http://dizli.com/dizli/db.html, line: 128 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: htmlParseEntityRef: no name in http://dizli.com/dizli/db.html, line: 1575 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Tag blink invalid in http://dizli.com/dizli/db.html, line: 2190 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and b in http://dizli.com/dizli/db.html, line: 2200 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and font in http://dizli.com/dizli/db.html, line: 2200 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: body and center in http://dizli.com/dizli/db.html, line: 2225 in C:\development\app_server\C7\Lib\Tools\News.php on line 93
Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\development\app_server\C7\Lib\Tools\News.php on line 102
Can someone help me parse this page so that I can save the movie names and the directors' names?
Thanks in advance. Zeeshan
To hide the errors and still work with that code, just add @ before $dom, like:
$html = @$dom->loadHTMLFile($url);
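An alternative to @ is to let libxml buffer the parser errors itself, which keeps them available for inspection instead of discarding them. The following is only a minimal sketch: $sampleHtml is a hypothetical inline string standing in for the remote page, and loadHTML() is used instead of loadHTMLFile() so the example runs without network access.

```php
<?php
// Hypothetical malformed markup standing in for the remote page (assumption).
$sampleHtml = '<table><tr><td><font><b>Some Movie</td></tr></table>';

$dom = new DOMDocument();

// Buffer libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);
$loaded = $dom->loadHTML($sampleHtml);

// Fetch whatever the parser recorded (an array of LibXMLError objects),
// then clear the buffer and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $loaded ? "loaded\n" : "failed\n";
echo count($errors) . " parse problem(s) recorded\n";
```

How many problems get recorded depends on the libxml version, so the count is printed rather than assumed.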
The page is written in very old HTML (you can tell by the FONT tags, the uppercase tag names, etc.), so <br> tags, and probably paragraphs and other elements as well, are not closed. In this case I recommend using regular expressions to find the values you need.
Your main problem is the last line:
echo $movie;
$movie is an instance of DOMNodeList, so you can't just echo it. You need to get its elements, for example with $movie->item(0).
You can also just do a var_dump of $movie and see what that gets you.
You can possibly ignore the warnings; that depends on the output you get.
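To illustrate reading elements out of the DOMNodeList rather than echoing it, here is a rough sketch. It assumes each name sits in the first <b> inside a <td>; the $sampleHtml string is made up for the example, and the real page's table layout may well differ.

```php
<?php
// Hypothetical sample markup; the real page's structure is an assumption.
$sampleHtml = '<table><tr>'
    . '<td><b>Movie One</b></td>'
    . '<td><b>Movie Two</b></td>'
    . '<td>no bold here</td>'
    . '</tr></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom->loadHTML($sampleHtml);
libxml_clear_errors();

$names = [];
foreach ($dom->getElementsByTagName('td') as $cell) {
    // getElementsByTagName() returns a DOMNodeList, not a string.
    $bold = $cell->getElementsByTagName('b');
    if ($bold->length > 0) {
        // item(0) is a DOMElement; textContent holds its text.
        $names[] = trim($bold->item(0)->textContent);
    }
}

print_r($names);
```

Checking $bold->length first avoids calling a property on null for cells that contain no <b> element.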