Process Involved in PHP Screen Scraping
Can anyone tell me the process involved in PHP screen scraping of an ASPX page using a POST request? I want to download the data from a website and save it to a database.
General steps, regardless of technology used on either side:
- Download the target page over HTTP (typically using libcurl)
- Parse the downloaded document
- Extract the parts that interest you
- Store extracted data in database.
That's pretty much all anyone could tell you based on the information given.
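Since the question mentions an ASPX page and a POST request, one detail is worth spelling out: ASP.NET Web Forms pages usually embed hidden fields such as __VIEWSTATE and __EVENTVALIDATION, and the server generally expects them to be sent back with the POST. Below is a minimal sketch of the two pieces involved, assuming the cURL and DOM extensions are available. The function names extractHiddenFields and fetchPage are hypothetical helpers, not part of any library.

```php
<?php
// Hypothetical helper: collect every <input type="hidden"> as name => value.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; suppress the parser warnings.
    @$doc->loadHTML($html);
    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden'
            && $input->getAttribute('name') !== '') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Hypothetical helper: GET the URL when $postFields is null, otherwise POST it.
// Cookies are kept in $cookieFile so the session survives between requests.
function fetchPage(string $url, ?array $postFields, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    // Returns the response body as a string, or false on failure.
    return curl_exec($ch);
}
```

A typical sequence would be: call fetchPage($url, null, $cookieFile) to get the form, pass the result to extractHiddenFields(), merge in your own form values, and call fetchPage() again with the merged array as $postFields.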
Download the file:
$file = file_get_contents('http://www.google.com');
If it is an XML or JSON file, decode it into an array and then search for the value you want using:
$key = array_search('search term', $array);
This returns the key of the matching element, so the value is $array[$key]. Note that array_search returns false when nothing matches, so compare the result with === false. If it is an HTML page, you can use a function like this to search the downloaded page:
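function extractStringFromString($string, $start, $end) {
    // Locate the start marker; strpos() returns false when it is absent.
    $startPos = strpos($string, $start);
    if ($startPos === false) {
        return false;
    }
    // Skip past the start marker, then look for the end marker after it.
    $contentPos = $startPos + strlen($start);
    $endPos = strpos($string, $end, $contentPos);
    if ($endPos === false) {
        return false;
    }
    $stringBetween = substr($string, $contentPos, $endPos - $contentPos);
    // As in the original: false when nothing lies between the markers.
    return strlen($stringBetween) != 0 ? $stringBetween : false;
}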
You can call this function as $returnString = extractStringFromString($file, $start, $end);, where $start marks the beginning of the text you are looking for and $end marks where the search stops. For example, given <div id="someID">here is some text</div>, setting $start = '<div id="someID">' and $end = '</div>' makes $returnString equal to "here is some text".
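String matching like this stops at the first occurrence of the end marker, so it can cut the text short when the target element contains nested tags of the same kind. As an alternative sketch, assuming PHP's bundled DOM extension is enabled, the same lookup can be done with DOMDocument and DOMXPath. The helper name extractTextById is hypothetical.

```php
<?php
// Hypothetical helper: return the text content of the element with the given id,
// or false when no such element exists.
function extractTextById(string $html, string $id)
{
    $doc = new DOMDocument();
    // Suppress warnings caused by imperfect real-world HTML.
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    // Assumes $id contains no double quote; ids normally do not.
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return false;
    }
    return trim($nodes->item(0)->textContent);
}

$html = '<div id="someID">here is some <b>text</b></div>';
echo extractTextById($html, 'someID'), PHP_EOL; // prints: here is some text
```

The trade-off is that the DOM approach parses the whole page, which is slower than strpos() but tolerates nested markup and attribute reordering.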
For the database, connect to it and then run a statement like:
INSERT INTO table_name (column1, column2, column3,...)
VALUES (value1, value2, value3,...)
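Scraped text often contains quotes and other characters that would break a hand-built query or open an SQL injection hole, so the INSERT is usually run as a prepared statement. Here is a minimal sketch with PDO. It uses an in-memory SQLite database so that it runs on its own, assuming the pdo_sqlite driver is installed. The table name, the column names, and the saveRow helper are hypothetical. For MySQL or another server, only the DSN and credentials passed to the PDO constructor would change.

```php
<?php
// Hypothetical helper: insert one scraped row using a prepared statement.
function saveRow(PDO $pdo, string $title, string $body): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO scraped_pages (title, body) VALUES (:title, :body)'
    );
    // Bound values are sent separately from the SQL text, so quotes are safe.
    $stmt->execute([':title' => $title, ':body' => $body]);
}

// In-memory SQLite keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE scraped_pages (id INTEGER PRIMARY KEY, title TEXT, body TEXT)');

saveRow($pdo, "O'Reilly", 'here is some text');
```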
Let me know if you have any other questions.