PHP: exit and show an alert when a very long function run in a loop stops responding

I have a big function (1300+ lines of code) that takes data from the web and inserts it into a local database. Each run takes about 20 seconds to complete, and I need to run the function about a million times, so I use set_time_limit(0) to remove the PHP time limit and loop over the function a million times, like this:

for ($ID= '01'; $ID < '999999'; $ID++) {
    getDataFromWeb($conn, $ID);
}

So what's the problem? There are a million things that can go wrong, and something always does. Suddenly the code gets stuck at ID 23465, for example, and it just stops getting data, but I don't get any kind of error. It's as if the loop continues without inserting anything into the database, and because I removed the PHP time limit, it never stops.

I want to know how I can detect this kind of problem, stop everything, and show an alert. Suppose I record the time before the function starts and then check it when the function ends, like this:

for ($ID= '01'; $ID < '999999'; $ID++) {
    $time_start = microtime(true); // true returns a float number of seconds
    getDataFromWeb($conn, $ID);
    $time_end = microtime(true);
    // $time_alert: the maximum number of seconds one call may take
    if ($time_end - $time_start > $time_alert) {
        break; // stop if it's taking too long
    }
}

It will not work because if the function never completes then $time_end will never be set, and so on...

So, help please?


I'd try http://www.php.net/manual/en/function.set-time-limit.php#92949.


Side note: The supplied code will not loop 1,000,000 times. The following will:

for( $id=1 ; $id<=1000000 ; $id++ ) {
    getDataFromWeb( $conn , $id );
}

Also, since you need this script to run constantly to load content into a database, I would suggest the following:

  • I presume that you are using an SQL table to hold the URLs to be crawled,
  • Add a timestamp field called 'loadAttempted',
  • Limit the PHP script to performing the action maybe 5 times per run,
  • Record the time the script attempts to crawl each URL in the 'loadAttempted' field,
  • Have each run of the script search for any URLs where 'loadAttempted' is empty, or where it is more than X minutes ago,
  • Add a cron job to trigger the script.

This would mean that, up to every minute, the script will be triggered and will try and load 5 URLs. If a URL takes an abnormally long period of time to load (which would mean that the script timed out whilst trying to crawl it) it will cycle back around and be tried again.
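
The selection step described above could be sketched roughly as follows. This is only an illustration under assumptions: the table name urls, the columns id and url, the helper name claimBatch(), the use of PDO, and the 600-second retry window are all made up for the example. Only the 'loadAttempted' field and the batch size of 5 come from the suggestion itself.

```php
<?php
// Hypothetical schema: urls(id, url, loadAttempted).
// Only 'loadAttempted' comes from the suggestion; the rest is assumed.

/**
 * Pick up to $batchSize URLs that were never attempted, or whose last
 * attempt is older than $retryAfterSeconds, and stamp them as attempted.
 */
function claimBatch(PDO $db, int $now, int $batchSize = 5, int $retryAfterSeconds = 600): array
{
    $select = $db->prepare(
        'SELECT id, url FROM urls
         WHERE loadAttempted IS NULL OR loadAttempted < :cutoff
         ORDER BY id
         LIMIT :lim'
    );
    $select->bindValue(':cutoff', $now - $retryAfterSeconds, PDO::PARAM_INT);
    $select->bindValue(':lim', $batchSize, PDO::PARAM_INT);
    $select->execute();
    $rows = $select->fetchAll(PDO::FETCH_ASSOC);

    // Record the attempt time before crawling, so a run that hangs or
    // times out leaves a stamp behind and the URL is retried later.
    $stamp = $db->prepare('UPDATE urls SET loadAttempted = :now WHERE id = :id');
    foreach ($rows as $row) {
        $stamp->bindValue(':now', $now, PDO::PARAM_INT);
        $stamp->bindValue(':id', (int) $row['id'], PDO::PARAM_INT);
        $stamp->execute();
    }

    return $rows;
}
```

The cron-triggered script would call claimBatch($db, time()) and pass each returned row to getDataFromWeb(). As written, the sketch has no way to mark a URL as finished, so a real table would probably also need a "done" flag or a delete on success; the suggestion above leaves that open.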

You could also use this, or variants on the idea, to get stats for pages which are slower than the rest and/or the average loadtime for the URLs.

Also, if you want to have this running constantly, I would suggest limiting the PHP script to running the getDataFromWeb() function a smaller number of times per run (like 5).


If getDataFromWeb($conn, $ID) uses a library like libcurl or similar, then maybe it's a good idea to set a connection time limit there? Or, for debugging, just echo '.' so you know the function has finished and exited.
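
Assuming the function does use PHP's cURL extension (the question doesn't say), a request with both a connection timeout and an overall timeout might look something like this. fetchWithTimeout() and its default values are hypothetical; CURLOPT_CONNECTTIMEOUT and CURLOPT_TIMEOUT are the standard cURL options for this.

```php
<?php
/**
 * Fetch a URL, giving up after a fixed number of seconds instead of hanging.
 * fetchWithTimeout() is a hypothetical helper, not part of the original code.
 */
function fetchWithTimeout(string $url, int $connectSeconds = 10, int $totalSeconds = 30): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $connectSeconds, // max seconds to establish the connection
        CURLOPT_TIMEOUT        => $totalSeconds,   // max seconds for the whole request
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // Timeouts and connection failures both end up here.
        throw new RuntimeException("Request for $url failed: " . curl_error($ch));
    }

    return $body;
}
```

With something like this in place, a stalled server produces an exception after at most $totalSeconds, which the calling loop can catch, log, and use as its signal to stop.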


Okay - there are several things here that are red flags in my mind.

First - You weren't kidding when you said you were looping this 1 million times. That surprised me.

Second - This loop looks weird to me:

for ($ID= '01'; $ID < '999999'; $ID++)

Why not instead do:

for ($ID = 1; $ID < 999999; $ID++)

I don't see why you're using Strings for Integer counting.

Third - How are you executing this? Is it from a browser or from the CLI?

Lastly - Without seeing the code it's hard to say what's going on, but does the function return a true/false boolean when complete, or are there other triggers in the function, like echo statements (at the minimum), that print debug information so you can track the progress?

You may want to simplify the code in the getDataFromWeb function. It sounds like it's running some kind of cURL request, parsing that data, and placing it into the "$conn" database. It might be easier to both understand and read if you split specific tasks from that function into separate functions (or made a class): one for getting the data, one for "cleaning" the data, and one for entering the data into the database. If a function has too many tasks, then issues like this (debugging) become a nightmare.
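
To illustrate the true/false idea, here is a minimal sketch of a loop that prints progress and stops with an alert as soon as one call fails or runs too long. It assumes the worker returns true on success, which the original getDataFromWeb() may not do; runBatch() and its parameters are made up for the example.

```php
<?php
/**
 * Call $worker for each ID, print progress, and stop at the first failure
 * or at the first call that takes longer than $maxSeconds.
 * Returns the ID that caused the stop, or null if every ID succeeded.
 */
function runBatch(callable $worker, int $firstId, int $lastId, float $maxSeconds): ?int
{
    for ($id = $firstId; $id <= $lastId; $id++) {
        $start = microtime(true);
        $ok = $worker($id);
        $elapsed = microtime(true) - $start;

        if ($ok !== true) {
            echo "ALERT: ID $id failed\n";
            return $id;
        }
        if ($elapsed > $maxSeconds) {
            echo "ALERT: ID $id took too long\n";
            return $id;
        }
        echo '.'; // one dot per completed ID, so progress stays visible
    }

    return null;
}
```

It could be called as runBatch(function ($id) use ($conn) { return getDataFromWeb($conn, $id); }, 1, 999999, 60.0). Note that the timing check only fires after the worker returns, so a call that hangs forever still needs timeouts inside the worker itself.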


Do you have any mysql_error()/mysql_errno() functions in your getDataFromWeb() function? Such as

if(mysql_errno($conn))
{ 
  echo mysql_errno($conn) . ": " . mysql_error($conn);
}

From http://php.net/manual/en/function.mysql-error.php

To stop the script at that point, replace the echo with die.
