
How To Insert Links Scraped With DOM Into A MySQL Database? (or what am I doing wrong?)

I am putting together a PHP script that pulls HTML using cURL, copies it into new pages and saves the page names. All that works, but I also want to collect the URLs on the page and enter them into a database. From my research, it looks like DOM is the best way to do that. However, I get "Error, insert query failed" when I include DOM in my code. Here is where I am getting the DOM code. I suspect this is a database issue.

DOM, PHP and MySQL are new to me, so any comments, pointers or suggestions would be helpful and appreciated.

Any comments on the overall approach, or suggestions of alternatives, are also quite welcome. I am not entirely convinced that DOM is best for scraping URLs from HTML.

<html>
<body>

<?
$urls=explode("\n", $_POST['url']);
$proxies=explode("\n", $_POST['proxy']);

for ( $counter = 0; $counter <= 6; $counter++) {
for ( $count = 0; $count <= 6; $count++) {

 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL,$urls[$counter]);
 curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
 curl_setopt($ch, CURLOPT_PROXY,$proxies[$count]);
 curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_CUSTOMREQUEST,'GET');
 curl_setopt ($ch, CURLOPT_HEADER, 1); 
curl_exec ($ch); 
$curl_scraped_page = curl_exec($ch); 

$FileName = rand(0,100000000000);
$FileHandle = fopen($FileName, 'w') or die("can't open file");
fwrite($FileHandle, $curl_scraped_page);


$dom = new DOMDocument();
@$dom->loadHTML($curl_scraped_page);
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

$hostname="****";
$username="****";
$password="****";
$dbname="leadturtle";
$usertable="happyturtle";

$con=mysql_connect($hostname,$username, $password) or die ("<html><script language='JavaScript'>alert('Unable to connect to database! Please try again later.'),history.go(-1)</script></html>");
mysql_select_db($dbname ,$con);



function storeLink($url) {
    $query = "INSERT INTO happyturtle (time, ad1, ad2) VALUES ('$FileName','$url', '$gathered_from')";
    mysql_query($query) or die('Error, insert query failed');
}
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    storeLink($url,$target_url);

}


mysql_close($con);

fclose($FileHandle);

curl_close($ch);

echo $FileName; 

echo "<br/>";

}
}

?>

</body>
</html>


You are not escaping the values in your SQL query.

If your string parameters contain a ' it will lead to a syntax error (best case). But it can also lead to SQL injection and a big security hole (http://xkcd.com/327/ :)!
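As a rough illustration of the point above, here is a minimal sketch that stores a link with a prepared statement, so a quote in the URL cannot break the query. It assumes PDO is available, which is a swap from the `mysql_*` functions used in the question. The table and column names are copied from the question. The in-memory SQLite database and the example URL are only there so the sketch runs without a server. The sketch also passes the file name and source as arguments, since variables such as `$FileName` defined outside a PHP function are not visible inside it.

```php
<?php
// Sketch: insert a scraped link with a prepared statement instead of string interpolation.
// Assumptions: PDO is available; table layout (time, ad1, ad2) is taken from the question.

function storeLink(PDO $pdo, string $fileName, string $url, string $gatheredFrom): void
{
    // Placeholders keep the values out of the SQL text, so a ' in $url is harmless.
    $stmt = $pdo->prepare(
        'INSERT INTO happyturtle (time, ad1, ad2) VALUES (:file, :url, :source)'
    );
    $stmt->execute([
        ':file'   => $fileName,
        ':url'    => $url,
        ':source' => $gatheredFrom,
    ]);
}

// Demo database: in-memory SQLite (an assumption for this sketch, not the asker's setup).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE happyturtle (time TEXT, ad1 TEXT, ad2 TEXT)');

// A URL containing a single quote, which would break the interpolated query.
storeLink($pdo, '12345', "http://example.com/?q=it's", 'http://example.com/');

$count = (int) $pdo->query('SELECT COUNT(*) FROM happyturtle')->fetchColumn();
echo "rows stored: $count\n";
```

For MySQL, only the `new PDO(...)` line should need to change, to a `mysql:` DSN with your own host, database name and credentials.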

First check your input.

Please add the error message to your question.
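To get at the real error message, the fixed string in `die('Error, insert query failed')` has to be replaced with whatever the database reports. With the `mysql_*` functions used in the question, that would be `mysql_error()`. The sketch below shows the same idea with PDO exceptions, again against an in-memory SQLite database so it runs without a server. The helper name `runQuery` and the example URL are hypothetical.

```php
<?php
// Sketch: report the database's own error message instead of a fixed string.
// Assumptions: PDO with the SQLite driver; runQuery() is a hypothetical helper.

function runQuery(PDO $pdo, string $sql): ?string
{
    try {
        $pdo->exec($sql);
        return null; // the query succeeded
    } catch (PDOException $e) {
        return $e->getMessage(); // the driver's description of what went wrong
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE happyturtle (time TEXT, ad1 TEXT, ad2 TEXT)');

// Deliberately unsafe: the quote in the URL ends the SQL string early.
$url = "http://example.com/?q=it's";
$error = runQuery(
    $pdo,
    "INSERT INTO happyturtle (time, ad1, ad2) VALUES ('1', '$url', 'x')"
);

echo ($error ?? 'query ok'), "\n";
```

The message printed here points at the place where the SQL stopped parsing, which is the detail the question is missing.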
