Advanced PHP: Configure an onBefore and/or onAfter callback for a cURL handle?
I'm working with the cURL implementation in PHP and leveraging curl_multi_init() and curl_multi_exec() to execute batches of requests in parallel. I've been doing this for a while, and understand this piece of it.
However, the request bodies contain a signature that is calculated with a timestamp. From the moment that signature is generated, I have a limited window in which the server will accept the request. Most of the time this is fine. However, in some cases I need to do large uploads (5+ GB).
If I batch requests into a pool of 100, 200, 1000, 20000, or anything in between, and I'm uploading large amounts of data to the server, the first requests to execute complete successfully. The later requests, however, won't have started until after the timestamp in their signatures has expired, so the server rejects those requests out of hand.
The current flow I'm using goes something like this (a rough code sketch follows the list):
- Do any processing ahead of time.
- Add the not-yet-executed cURL handles to the batch.
- Let cURL handle executing all of the requests.
- Look at the data that came back and parse it all.
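Condensed, that flow looks something like this (a rough sketch only; $jobs and buildSignedBody() stand in for my actual data and signing code):
<?php
// Rough sketch of the current batch flow. $jobs and buildSignedBody() are
// placeholders for my real data and signature/body generation.
$mh        = curl_multi_init();
$handles   = array();
$responses = array();

foreach ($jobs as $i => $job) {
    $ch = curl_init($job['url']);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => buildSignedBody($job), // signature baked in here, ahead of time
    ));
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Let cURL execute everything in parallel.
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

// Collect the responses and parse them afterwards.
foreach ($handles as $i => $ch) {
    $responses[$i] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);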
I'm interested in finding a way to execute a callback function that can generate a signature on-demand and update the request body at the moment that PHP/cURL goes to execute that particular request. I know that you can bind a callback function to a cURL handle that will execute repeatedly while the request is happening, and you have access to the cURL handle all along the way.
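(For reference, the in-flight callback I'm referring to is the progress callback, roughly like this; the exact callback signature differs slightly between PHP versions:)
<?php
// Sketch of the callback that fires repeatedly while the transfer runs.
// On PHP 5.5+ the handle is passed as the first argument; older versions omit it.
$ch = curl_init('http://example.com/upload');
curl_setopt($ch, CURLOPT_NOPROGRESS, false);
curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, function ($handle, $dlTotal, $dlNow, $ulTotal, $ulNow) {
    // Runs many times during the transfer -- but only after the request has
    // already started, which is too late to recalculate the signature.
    return 0; // a non-zero return would abort the transfer
});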
So my question is this: Is there any way to configure an onBefore and/or onAfter callback for a cURL handle? Something that can execute immediately before the cURL executes the request, and then something that can execute immediately after the response comes back so that the response data can be parsed.
I'd like to do something a bit more event-oriented, like so:
- Add a not-yet-executed cURL handle to the batch, assigning a callback function to execute when cURL (not myself) executes the request (both before and after).
- Take the results of the batch request and do whatever I want with the data.
No, this isn't possible with the built-in functions of cURL. However, it would be trivial to implement a wrapper around the native functions to do what you want.
For instance, vaguely implementing the Observer pattern:
<?php

class CurlWrapper {
    private $ch;
    private $listeners = array();

    public function __construct($url) {
        $this->ch = curl_init($url);
        $this->setopt(CURLOPT_RETURNTRANSFER, true);
    }

    // Set a single option, notifying listeners first.
    public function setopt($opt, $value) {
        $this->notify('setopt', array('option' => $opt, 'value' => $value));
        curl_setopt($this->ch, $opt, $value);
    }

    // Set many options at once, notifying listeners first.
    public function setopt_array($opts) {
        $this->notify('setopt_array', array('options' => $opts));
        curl_setopt_array($this->ch, $opts);
    }

    // Execute the request, firing beforeExec/afterExec around the call.
    public function exec() {
        $this->notify('beforeExec', array());
        $ret = curl_exec($this->ch);
        $this->notify('afterExec', array('result' => $ret));
        return $ret;
    }

    // Register a callable to be invoked whenever $event fires.
    public function attachListener($event, $fn) {
        if (is_callable($fn)) {
            $this->listeners[$event][] = $fn;
        }
    }

    // Invoke every listener registered for $event, passing the wrapper and event data.
    private function notify($event, $data) {
        if (isset($this->listeners[$event])) {
            foreach ($this->listeners[$event] as $listener) {
                $listener($this, $data);
            }
        }
    }
}
$c = new CurlWrapper('http://stackoverflow.com');
$c->setopt(CURLOPT_HTTPGET, true);
$c->attachListener('beforeExec', function($handle, $data) {
echo "before exec\n";
});
$result = $c->exec();
echo strlen($result), "\n";
You can add event listeners (which must be callables) to the object with attachListener(), and they will automatically be called at the relevant moment.
Obviously you would need to do some more work to this to make it fit your requirements, but it isn't a bad start, I think.
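For example (sign(), the payload fields and the URL below are placeholders rather than a real API), a beforeExec listener could regenerate the timestamped signature at the last possible moment and update the request body just before the transfer starts:
<?php
// Placeholder signing routine -- substitute whatever your API actually requires.
function sign(array $fields, $secret) {
    return hash_hmac('sha256', http_build_query($fields), $secret);
}

$secret  = 'YOUR-SECRET';                          // placeholder
$payload = array('data' => '...large upload...');  // placeholder request body

$c = new CurlWrapper('http://api.example.com/upload');
$c->attachListener('beforeExec', function (CurlWrapper $wrapper, $data) use ($payload, $secret) {
    // Re-sign with a fresh timestamp immediately before curl_exec() runs.
    $payload['timestamp'] = time();
    $payload['signature'] = sign($payload, $secret);
    $wrapper->setopt(CURLOPT_POSTFIELDS, http_build_query($payload));
});
$result = $c->exec();
In a curl_multi setup you would still have to decide where the equivalent of exec() lives (the hook helps most when each transfer is started individually), but the listener mechanism itself stays the same.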
Anything to do with cURL is not advanced PHP. It's "advanced mucking about".
If you have these huge volumes of data going through cURL, I would recommend not using cURL at all (actually, I would always recommend not using cURL).
I'd look into a socket implementation. Good ones aren't easy to find, but they're not that hard to write yourself.
OK, so you say that the requests are parallelized. I'm not sure exactly what that means here, but it's not too important.
As an aside, I'll explain what I mean by asynchronous. If you open a raw TCP socket, you can put the connection into non-blocking mode with stream_set_blocking($fp, false) (socket_set_blocking() is an alias of the same function), which means that read/write operations don't block. You can take several of these connections and write a small amount of data to each of them in a loop; this way you are sending your requests "at once".
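A bare-bones illustration of that idea (the hosts and request data are placeholders, and all error handling is omitted):
<?php
// Minimal sketch: several TCP connections in non-blocking mode, with a small
// chunk of data written to each one per pass of the loop.
$hosts = array('example.com', 'example.org'); // placeholder endpoints
$conns = array();

foreach ($hosts as $host) {
    $fp = stream_socket_client("tcp://$host:80", $errno, $errstr, 5);
    stream_set_blocking($fp, false); // reads and writes no longer block
    $conns[] = array(
        'fp'   => $fp,
        'data' => "GET / HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n",
    );
}

// Round-robin: push up to 1 KB into each connection until everything is sent.
while ($conns) {
    foreach (array_keys($conns) as $i) {
        $written = fwrite($conns[$i]['fp'], substr($conns[$i]['data'], 0, 1024));
        $conns[$i]['data'] = (string) substr($conns[$i]['data'], (int) $written);
        if ($conns[$i]['data'] === '') {
            unset($conns[$i]); // this request has been sent in full
        }
    }
}
// Reading the responses back (also non-blocking) is left out here.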
The reason I asked whether you have to wait until the whole message is consumed before the endpoint validates the signature is that, even if cURL is sending the requests "all at once", there's always a possibility that the time the upload takes will mean the validation fails. Presumably it's slower to upload 2000 requests at once than to upload 5, so you'd expect more failures in the former case. Similarly, if your requests are processed synchronously (i.e. one at a time), you'll see the same error for the same reason, although in that case it's the later requests that are expected to fail. Maybe you need to think about the data rate required to upload a message of a particular size within a particular time frame, and then try to calculate an optimum multi-payload size. Perhaps the best approach is the simplest: upload one at a time and calculate the signature just before each upload.
A better approach might be to put the signature in a message header; that way the signature can be read earlier in the upload process.
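A hedged sketch of those last two suggestions combined (one upload at a time, signature computed immediately before the request, carried in hypothetical X-Timestamp / X-Signature headers; the endpoint and signing scheme are placeholders too):
<?php
// Sketch only: sign at the last possible moment so the validity window starts
// when the upload does, and send the signature in headers the server can read early.
$secret = 'YOUR-SECRET';
$files  = array('/path/to/a.bin', '/path/to/b.bin');

foreach ($files as $file) {
    $timestamp = time();
    $signature = hash_hmac('sha256', $timestamp . '|' . basename($file), $secret);

    $ch = curl_init('http://api.example.com/upload');
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_PUT            => true,
        CURLOPT_INFILE         => fopen($file, 'rb'),
        CURLOPT_INFILESIZE     => filesize($file),
        CURLOPT_HTTPHEADER     => array(
            "X-Timestamp: $timestamp",
            "X-Signature: $signature",
        ),
    ));
    $response = curl_exec($ch); // the signature is only seconds old when this starts
    curl_close($ch);
    // ...parse/validate $response here...
}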