
PHP cURL 'Fatal error: Allowed memory size' for large data sets

I know about the option to raise the memory limit:

ini_set("memory_limit","30M");

But I wanted to know if there is a better approach to querying the data.

I have a WHILE LOOP that checks whether I need to query for another 1000 records. Using the offset as the starting record number and the limit as the number of records returned, I fetch all records matching my data request. I hit about 100K records before I get the error.

During testing I found that I get the 'Fatal error: Allowed memory size...' error. I've read that setting the above ini_set() allows for the increase in memory, but I wanted to know if I could just code it better.

Each time I execute the code below in the WHILE LOOP, the memory usage grows considerably, even if I unset($curl). I think it could be reduced if I could unset the $result and $curl variables after I have parsed out the results, before the next cURL query (a rough sketch of what I mean follows the memory numbers below).

function getRequest($url,$user,$pwd) {

    $curl = curl_init();

    curl_setopt($curl, CURLOPT_VERBOSE, 1);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 2);
    curl_setopt($curl, CURLOPT_HEADER, 0);
    curl_setopt($curl, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_USERPWD, "$user:$pwd");
    curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
    curl_setopt($curl, CURLOPT_ENCODING, '');
    curl_setopt($curl, CURLOPT_URL, $url);

    $result = curl_exec($curl);

    $httpResponseCode = (int)curl_getinfo($curl, CURLINFO_HTTP_CODE);

    switch ($httpResponseCode) {
        case 500:
            // Send problem email
            break;
        case 200:
            // GET was good
            break;
        default:
            // Send problem email
            break;
    }
    curl_close($curl);
    return $result;
} 

WHILE LOOP (Slim version)

while($queryFlag) { // $queryFlag is TRUE

    // Check if we have more records to query; if not, set $queryFlag to FALSE

    // Build cURL URL

    echo "Before Call Memory Usage: ".memory_get_usage()."\n";
    $resultXML = getRequest($query,$user,$pass);
    echo "After Call Memory Usage: ".memory_get_usage()."\n";

    $results = new ParseXMLConfig((string)$resultXML); // This is basically a class for $this->xml = simplexml_load_string($xml);

    // Loop through results and keep what I'm looking for
    foreach($results as $resultsKey => $resultsData) {
        if(preg_match('|^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$|i', $resultsData)) {
            $resultsArr["$resultsData"] = $resultsData;
        }
    }

}

Some memory numbers

  • Before Call Memory Usage: 1819736
  • After Call Memory Usage: 2285344
  • keep data I need
  • dump data I don't need
  • Next LOOP Iteration
  • Before Call Memory Usage: 2084128
  • After Call Memory Usage: 2574952
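
For reference, here is roughly what I mean, applied to the loop-level variables (an untested sketch; gc_collect_cycles() needs PHP 5.3+):

while($queryFlag) {

    // ... check if more records are needed and build the cURL URL as before ...

    $resultXML = getRequest($query,$user,$pass);
    $results   = new ParseXMLConfig((string)$resultXML);

    foreach($results as $resultsKey => $resultsData) {
        if(preg_match('|^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$|i', $resultsData)) {
            $resultsArr["$resultsData"] = $resultsData;
        }
    }

    // Drop the large per-iteration buffers before the next request
    unset($resultXML, $results);
    gc_collect_cycles(); // ask PHP to collect any cyclic references now
}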


I guess you are using an incorrect key for $resultsArr. You are using the same string as both the key and the value.

Try changing

$resultsArr["$resultsData"] = $resultsData

to

$resultsArr[$resultsKey] = $resultsData


Settled for:

ini_set("memory_limit","30M");


This is in response to the OP's question, "can I just code it better?" You're asking the right question! Good on ya.

I had an issue similar to this, where something inside a loop was chewing up memory. What I discovered was that, I think, it was because I was saving the data I wanted to 'keep' onto an object (stdClass); that's what was slowing things down. I changed it so that instead of something like:

$payloadObject->someNewAttribute = $data_i_want_to_keep;

I used an indexed array:

$payload_array[] = $data_i_want_to_keep;

That worked out for me.

I saw a test somewhere (and tried it out myself, with results that agreed with what I saw) comparing getting/setting on a stdClass object, an associative array, and an indexed array. I forget whether it measured speed or memory, but setting an attribute on a stdClass uses more memory / is slower than setting an associative array key/value, which in turn is slower / uses more memory than just appending a new indexed array item. I.e.:

//  slowest / most memory intensive
$some_stdClass_object->some_new_attribute = $data_I_want_to_keep;

//  slower / still memory intensive
$some_array["some_new_key"] = $data_I_want_to_keep;

//  optimal / fastest
$some_array[] = $data_I_want_to_keep;
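
If you want to see the difference yourself, here's a minimal sketch of how you could compare the memory cost of the three approaches (this is not the original test I saw, and the numbers will vary by PHP version and data):

// Rough memory comparison of the three approaches above.
$n = 100000;

$before = memory_get_usage();
$obj = new stdClass();
for ($i = 0; $i < $n; $i++) {
    $obj->{"attr$i"} = "some data $i"; // dynamic stdClass attributes
}
echo "stdClass:          ".(memory_get_usage() - $before)." bytes\n";
unset($obj);

$before = memory_get_usage();
$assoc = array();
for ($i = 0; $i < $n; $i++) {
    $assoc["key$i"] = "some data $i"; // associative array
}
echo "associative array: ".(memory_get_usage() - $before)." bytes\n";
unset($assoc);

$before = memory_get_usage();
$indexed = array();
for ($i = 0; $i < $n; $i++) {
    $indexed[] = "some data $i"; // indexed array
}
echo "indexed array:     ".(memory_get_usage() - $before)." bytes\n";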

I noticed this line after looking over your code again (comment moved for readability):

// This is basically a class for $this->xml = simplexml_load_string($xml);
$results        = new ParseXMLConfig((string)$resultXML);

It looks like you're setting the XML as an attribute on a class (whatever class $this is). That could very well be what's eating your memory. Maybe look at that class (ParseXMLConfig) and see if you can change it from saving that data on the object to saving it in an indexed array, or an associative array if you need keys.
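
I don't know what ParseXMLConfig actually does internally, but as a rough, purely hypothetical sketch (assuming all you need are the text values), it could extract what it needs into a plain array and let the SimpleXMLElement go:

// Hypothetical sketch -- I don't know the real ParseXMLConfig internals.
// Instead of keeping $this->xml = simplexml_load_string($xml) around,
// pull the values you need into a plain array and discard the XML object.
class ParseXMLConfig
{
    public $values = array();

    public function __construct($xml)
    {
        $doc = simplexml_load_string($xml);
        foreach ($doc->children() as $child) {
            $this->values[] = (string)$child; // keep plain strings, not XML nodes
        }
        unset($doc); // the parsed document can be freed here
    }
}

Your foreach would then loop over $results->values instead of $results itself.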

