
Process Involved in PHP Screen Scraping

Can anyone tell me the process involved in PHP screen scraping of an ASPX page using a POST request? I want to download the data from a website and save it to a database.


General steps, regardless of technology used on either side:

  1. Download the target page over HTTP (typically using libcurl)
  2. Parse the downloaded document
  3. Extract the parts that interest you
  4. Store extracted data in database.

That's pretty much all anyone could tell you based on the information given.
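For an ASPX target specifically, the first two steps above might look like the sketch below. It assumes the page is an ASP.NET Web Forms page whose form carries hidden state fields (such as __VIEWSTATE and __EVENTVALIDATION) that have to be sent back with the POST. The URL, the field name txtSearch, and the helper names extractHiddenFields and fetchPage are all hypothetical.

```php
<?php
// Collect every <input type="hidden"> in the page as name => value.
// Web Forms pages usually keep __VIEWSTATE (and often __EVENTVALIDATION)
// in such fields and expect them back in the POST body.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // real-world HTML is rarely valid; silence warnings
    $xpath = new DOMXPath($doc);

    $fields = [];
    foreach ($xpath->query('//input[@type="hidden"]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

// Fetch a URL with libcurl; sends a POST when $postFields is given.
// $cookieFile lets the GET and the later POST share session cookies.
function fetchPage(string $url, string $cookieFile, ?array $postFields = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_COOKIEFILE     => $cookieFile,
    ]);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (hypothetical URL and form field, not executed here):
// $cookies = tempnam(sys_get_temp_dir(), 'scrape');
// $url     = 'http://example.com/search.aspx';
// $form    = extractHiddenFields(fetchPage($url, $cookies));
// $form['txtSearch'] = 'search term';
// $resultHtml = fetchPage($url, $cookies, $form);
```

The GET comes first so that the hidden fields posted back are the ones the server just issued; whether a given site needs them at all depends on how it was built.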


Download the file:

$file = file_get_contents('http://www.google.com');

If it is an XML or JSON file, decode it into an array and then search for the value you want using:

$key = array_search('search term', $array);
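As a minimal sketch of that step, assuming the download is a flat JSON object (the sample payload below is made up), json_decode with its second argument set to true yields an associative array that array_search can scan:

```php
<?php
// Hypothetical payload standing in for the downloaded $file.
$file = '{"first": "apple", "second": "search term", "third": "pear"}';

// true => decode JSON objects into associative arrays.
$array = json_decode($file, true);
if (!is_array($array)) {
    throw new RuntimeException('Not valid JSON: ' . json_last_error_msg());
}

// array_search returns the key of the first match, or false if none.
// Compare with !== false because a valid key can be 0.
$key = array_search('search term', $array, true);
if ($key !== false) {
    echo $key, ' => ', $array[$key], PHP_EOL; // prints: second => search term
}
```

Note that array_search only looks at the top level of the array, so nested data would need a loop or a recursive walk.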

array_search() returns the key of the first matching element, so the value is $array[$key]; it returns false when nothing matches, so compare the result with ===. If it is an HTML page, you can use a function like this to search the downloaded page:

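function extractStringFromString($string, $start, $end) {
    // Locate the start marker; give up if it is not in the page.
    $startPos = strpos($string, $start);
    if ($startPos === false) {
        return false;
    }

    // The wanted text begins right after the start marker,
    // so look for the end marker from there.
    $contentPos = $startPos + strlen($start);
    $endPos = strpos($string, $end, $contentPos);
    if ($endPos === false) {
        return false;
    }

    $stringBetween = substr($string, $contentPos, $endPos - $contentPos);

    // Return false when nothing lies between the two markers.
    return $stringBetween !== '' ? $stringBetween : false;
}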

You can call this function as $returnString = extractStringFromString($file, $start, $end), where $start is the text that marks the beginning of what you are looking for and $end is the text that ends it. For example, if the page contains <div id="someID">here is some text</div>, set $start = '<div id="someID">' and $end = '</div>'; $returnString will then equal "here is some text".
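Plain string matching gets fragile once tags nest or attributes change order. As an alternative sketch (a different technique from the function above), PHP's bundled DOM extension can pull the same text by id. The function name textById is hypothetical.

```php
<?php
// Return the trimmed text of the first element with the given id,
// or null when no such element exists.
// Assumes $id contains no double quote, since it is spliced into the XPath.
function textById(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // tolerate imperfect markup
    $xpath = new DOMXPath($doc);

    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

$html = '<html><body><div id="someID">here is some text</div></body></html>';
echo textById($html, 'someID'), PHP_EOL; // prints: here is some text
```

Because textContent flattens child elements, markup nested inside the div does not end up in the result.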

For the database, you need to connect to it and then run a statement like:

INSERT INTO table_name (column1, column2, column3,...)
VALUES (value1, value2, value3,...)
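From PHP, one way to run such a statement is a PDO prepared statement. The sketch below uses an in-memory SQLite database only so that it is self-contained; the table name scraped, its columns, and the sample values are hypothetical, and a real setup would use its own DSN and credentials.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; for MySQL the DSN
// would be something like 'mysql:host=localhost;dbname=...' plus credentials.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table for the scraped values.
$pdo->exec('CREATE TABLE scraped (id INTEGER PRIMARY KEY, source_url TEXT, content TEXT)');

// Placeholders keep the scraped text out of the SQL string itself.
$stmt = $pdo->prepare('INSERT INTO scraped (source_url, content) VALUES (:url, :content)');
$stmt->execute([
    ':url'     => 'http://example.com/page.aspx',
    ':content' => 'here is some text',
]);

$count = (int) $pdo->query('SELECT COUNT(*) FROM scraped')->fetchColumn();
echo $count, PHP_EOL; // prints: 1
```

Binding values this way matters for scraped data in particular, since the text comes from a third party and may contain quotes.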

Let me know if you have any other questions.
