Using zlib filter with a socket pair
For some reason, the zlib.deflate filter doesn't seem to work with socket pairs generated by stream_socket_pair(). All that can be read from the second socket is the two-byte zlib header; every read after that returns an empty string.
Example:
<?php
list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
STREAM_SOCK_STREAM,
STREAM_IPPROTO_IP);
$params = array('level' => 6, 'window' => 15, 'memory' => 9);
stream_filter_append($in, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($in, 0);
stream_set_blocking($out, 0);
fwrite($in, 'Some big long string.');
$compressed = fread($out, 1024);
var_dump($compressed);
fwrite($in, 'Some big long string, take two.');
$compressed = fread($out, 1024);
var_dump($compressed);
fwrite($in, 'Some big long string - third time is the charm?');
$compressed = fread($out, 1024);
var_dump($compressed);
Output:
string(2) "x�"
string(0) ""
string(0) ""
If I comment out the call to stream_filter_append(), writing to and reading from the stream works correctly, with the data being dumped in its entirety all three times. If I direct the zlib-filtered stream into a file instead of through the socket pair, the compressed data is also written correctly. So both parts work separately, but not together. Is this a PHP bug that I should report, or an error on my part?
This question is branched from a solution to this related question.
I worked through the PHP source code and found a fix.
To understand what happens, I traced the code during a
....
for ($i = 0 ; $i < 3 ; $i++) {
fwrite($s[0], ...);
fread($s[1], ...);
fflush($s[0]);
fread($s[1], ...);
}
loop, and I found that the deflate function is never called with the Z_SYNC_FLUSH flag set, because no new data is present in the buckets_in brigade.
My fix handles the case where the PSFS_FLAG_FLUSH_INC flag is set AND no iterations of the deflate loop were performed, by extending the
if (flags & PSFS_FLAG_FLUSH_CLOSE) {
check so that it covers FLUSH_INC too:
if (flags & PSFS_FLAG_FLUSH_CLOSE || (flags & PSFS_FLAG_FLUSH_INC && to_be_flushed)) {
This downloadable patch is for the Debian squeeze version of PHP, but the current git version of the file is close to it, so I suppose porting the fix is simple (a few lines).
If any side effects arise, please contact me.
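For anyone who wants to check whether their PHP build already behaves as if it were patched, here is a runnable version of the trace loop above. It is only a sketch: the helper name trace_deflate_flush() and the sample messages are made up for illustration, and the byte counts it prints depend on the PHP build, so no particular output is promised.

```php
<?php
// Hypothetical helper (not part of PHP): runs the write/read/flush/read loop
// and records how many bytes each fread() delivered in every iteration.
function trace_deflate_flush(array $messages): array
{
    $s = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    stream_filter_append($s[0], 'zlib.deflate', STREAM_FILTER_WRITE,
        array('level' => 6, 'window' => 15, 'memory' => 9));
    stream_set_blocking($s[1], false); // an empty read must not block

    $trace = array();
    foreach ($messages as $message) {
        fwrite($s[0], $message);
        $afterWrite = strlen((string) fread($s[1], 1024));
        fflush($s[0]); // reaches the filter as an incremental flush
        $afterFlush = strlen((string) fread($s[1], 1024));
        $trace[] = array('after_write' => $afterWrite, 'after_flush' => $afterFlush);
    }
    fclose($s[0]);
    fclose($s[1]);
    return $trace;
}

$trace = trace_deflate_flush(array(
    'Some big long string.',
    'Some big long string, take two.',
    'Some big long string - third time is the charm?',
));
print_r($trace);
```

If after_flush stays at 0 in every iteration, the flush is not reaching deflate(), which is the behaviour described above.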
Looking through the C source code, the problem is that the filter always lets zlib's deflate() function decide how much data to accumulate before producing compressed output. The deflate filter does not create a new data bucket to pass on unless deflate() outputs some data (see line 235) or the PSFS_FLAG_FLUSH_CLOSE flag bit is set (line 250). That's why you only see the header bytes until you close $in; the first call to deflate() outputs the two header bytes, so data->strm.avail_out is 2 and a new bucket is created for these two bytes to pass on.
Note that fflush() does not work because of a known issue with the zlib filter. See: Bug #48725 Support for flushing in zlib stream.
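The difference between letting deflate() accumulate input and forcing a sync flush can be seen outside the stream filter. The sketch below assumes PHP 7.0 or later, where the incremental deflate_init()/deflate_add() functions exist; it is not what the zlib.deflate filter does internally, only an illustration of the two flush modes.

```php
<?php
// Assumes PHP >= 7.0 (deflate_init/deflate_add/inflate_init/inflate_add).
$ctx = deflate_init(ZLIB_ENCODING_DEFLATE, array('level' => 6));

// No flush: zlib is free to keep the input buffered internally.
$buffered = deflate_add($ctx, 'Some big long string.', ZLIB_NO_FLUSH);

// Sync flush: everything consumed so far is pushed to the output.
$flushed = deflate_add($ctx, 'Some big long string, take two.', ZLIB_SYNC_FLUSH);

printf("no flush: %d bytes, sync flush: %d bytes\n", strlen($buffered), strlen($flushed));

// The receiving side can already decode both messages,
// even though the deflate stream has not been finished.
$inflate = inflate_init(ZLIB_ENCODING_DEFLATE);
$plain = inflate_add($inflate, $buffered . $flushed, ZLIB_SYNC_FLUSH);
echo $plain, "\n";
```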
Unfortunately, there does not appear to be a nice work-around for this. I started writing a filter in PHP by extending php_user_filter, but quickly ran into the problem that php_user_filter does not expose the flag bits, only whether flags & PSFS_FLAG_FLUSH_CLOSE is set (the fourth parameter to the filter() method, a boolean argument commonly named $closing). You would need to modify the C sources yourself to fix Bug #48725. Alternatively, re-write it.
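Since a user filter cannot see the incremental-flush flag, one possible compromise is a filter that sync-flushes after every bucket it is given. The following is a sketch only, assuming PHP 7.0 or later for deflate_init()/deflate_add(); the class name SyncDeflateFilter and the filter name sync.deflate are made up for illustration, and flushing on every write costs some compression ratio.

```php
<?php
// Hypothetical user filter: compresses each incoming bucket and sync-flushes
// it, so every fwrite() yields decodable output right away.
class SyncDeflateFilter extends php_user_filter
{
    private $ctx;

    public function onCreate(): bool
    {
        $this->ctx = deflate_init(ZLIB_ENCODING_DEFLATE, array('level' => 6));
        return $this->ctx !== false;
    }

    public function filter($in, $out, &$consumed, $closing): int
    {
        $produced = false;
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            $bucket->data = deflate_add($this->ctx, $bucket->data, ZLIB_SYNC_FLUSH);
            stream_bucket_append($out, $bucket);
            $produced = true;
        }
        if ($closing) {
            // Finish the deflate stream when the PHP stream is closed.
            $tail = deflate_add($this->ctx, '', ZLIB_FINISH);
            if (is_string($tail) && $tail !== '') {
                stream_bucket_append($out, stream_bucket_new($this->stream, $tail));
                $produced = true;
            }
        }
        return $produced ? PSFS_PASS_ON : PSFS_FEED_ME;
    }
}

stream_filter_register('sync.deflate', 'SyncDeflateFilter');

list($in, $out) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_filter_append($in, 'sync.deflate', STREAM_FILTER_WRITE);

fwrite($in, 'Some big long string.');
$compressed = fread($out, 1024);

// Decode on the reading side to confirm that the whole message arrived.
$inflate = inflate_init(ZLIB_ENCODING_DEFLATE);
$plain = inflate_add($inflate, $compressed, ZLIB_SYNC_FLUSH);
var_dump($plain);

fclose($in);
fclose($out);
```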
Personally I would consider re-writing it because there seem to be a few eyebrow-raising issues with the code:
- status = deflate(&(data->strm), flags & PSFS_FLAG_FLUSH_CLOSE ? Z_FULL_FLUSH : (flags & PSFS_FLAG_FLUSH_INC ? Z_SYNC_FLUSH : Z_NO_FLUSH));
seems odd because when writing, I don't know why flags would be anything other than PSFS_FLAG_NORMAL. Is it possible to write & flush at the same time? In any case, handling the flags should be done outside of the while loop through the "in" bucket brigade, like how PSFS_FLAG_FLUSH_CLOSE is handled outside of this loop.
- Line 221, the memcpy to data->strm.next_in seems to ignore the fact that data->strm.avail_in may be non-zero, so the compressed output might skip some data of a write. See, for example, the following text from the zlib manual: "If not all input can be processed (because there is not enough room in the output buffer), next_in and avail_in are updated and processing will resume at this point for the next call of deflate()." In other words, it is possible that avail_in is non-zero.
- The if statement on line 235, if (data->strm.avail_out < data->outbuf_len), should probably be if (data->strm.avail_out) or perhaps if (data->strm.avail_out > 2).
- I'm not sure why *bytes_consumed = consumed; isn't *bytes_consumed += consumed;. The example streams at http://www.php.net/manual/en/function.stream-filter-register.php all use += to update $consumed.
EDIT: *bytes_consumed = consumed; is correct. The standard filter implementations all use = rather than += to update the size_t value pointed to by the fifth parameter. Also, even though $consumed += ... on the PHP side effectively translates to += on the size_t (see lines 206 and 231 of ext/standard/user_filters.c), the native filter function is called with either a NULL pointer or a pointer to a size_t set to 0 for the fifth argument (see lines 361 and 452 of main/streams/filter.c).
You need to close the stream after the write to flush it before the data will come in from the read.
list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
STREAM_SOCK_STREAM,
STREAM_IPPROTO_IP);
$params = array('level' => 6, 'window' => 15, 'memory' => 9);
stream_filter_append($out, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($out, 0);
stream_set_blocking($in, 0);
fwrite($out, 'Some big long string.');
fclose($out);
$compressed = fread($in, 1024);
echo "Compressed:" . bin2hex($compressed) . "<br>\n";
list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
STREAM_SOCK_STREAM,
STREAM_IPPROTO_IP);
$params = array('level' => 6, 'window' => 15, 'memory' => 9);
stream_filter_append($out, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($out, 0);
stream_set_blocking($in, 0);
fwrite($out, 'Some big long string, take two.');
fclose($out);
$compressed = fread($in, 1024);
echo "Compressed:" . bin2hex($compressed) . "<br>\n";
list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
STREAM_SOCK_STREAM,
STREAM_IPPROTO_IP);
$params = array('level' => 6, 'window' => 15, 'memory' => 9);
stream_filter_append($out, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($out, 0);
stream_set_blocking($in, 0);
fwrite($out, 'Some big long string - third time is the charm?');
fclose($out);
$compressed = fread($in, 1024);
echo "Compressed:" . bin2hex($compressed) . "<br>\n";
That produces:
Compressed:789c0bcecf4d5548ca4c57c8c9cf4b57282e29cacc4bd70300532b079c
Compressed:789c0bcecf4d5548ca4c57c8c9cf4b57282e29cacc4bd7512849cc4e552829cfd70300b1b50b07
Compressed:789c0bcecf4d5548ca4c57c8c9cf4b57282e29ca0452ba0a25199945290a259940c9cc62202f55213923b128d71e008e4c108c
Also I switched the $in and $out because writing to $in confused me.
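To double-check that what comes out of the socket is a complete zlib stream, the first hex dump above can be fed back through gzuncompress(). This is just a quick sanity check, using the hex string copied from the output above.

```php
<?php
// First "Compressed:" value from the output above.
$hex = '789c0bcecf4d5548ca4c57c8c9cf4b57282e29cacc4bd70300532b079c';

// gzuncompress() expects the zlib format (two-byte header plus Adler-32 trailer).
$plain = gzuncompress(hex2bin($hex));
var_dump($plain);
```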