Fastest way to calculate the size of a file opened inside the code (PHP)
I know there are quite a few built-in functions in PHP to get the size of a file, such as filesize, stat, ftell, etc.
My question is about ftell, which is quite interesting: it returns the current position of the file pointer in the file as an integer.
Is it possible to get the size of the file using the ftell function? If yes, how?
Scenario:
- System (code) opens an existing file with mode "a" to append the contents.
- File pointer points to the end of the file.
- System updates the content in the file.
- System uses ftell to calculate the size of the file.
fstat determines the file size without any acrobatics:
$f = fopen('file', 'r+');
$stat = fstat($f);
$size = $stat['size'];
ftell cannot be used when the file has been opened with the append ("a") flag, because its result is undefined for append-only streams. Also, you have to seek to the end of the file with fseek($f, 0, SEEK_END) first.
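A minimal sketch of the scenario from the question under that advice, assuming the file is opened in read/write mode ("r+") rather than "a". The temporary file and the strings written are placeholders for illustration only. After seeking to the end and writing, the pointer sits at the end of the file, so ftell should report the same value as fstat:

```php
<?php
// Placeholder file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'size');
file_put_contents($path, 'This is the contents');

// Open for reading and writing instead of append-only mode.
$f = fopen($path, 'r+');

// Move to the end explicitly, then append the new content.
fseek($f, 0, SEEK_END);
fwrite($f, ' plus more');

// The pointer is now at the end of the file, so its offset is the size.
$sizeFromTell = ftell($f);

// Cross-check against fstat on the same handle.
$sizeFromStat = fstat($f)['size'];

fclose($f);
unlink($path);

echo "ftell: $sizeFromTell, fstat: $sizeFromStat\n";
```

If the pointer may have been moved since the last write, seek to the end again before calling ftell.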
ftell() can tell you how many bytes are supposed to be in the file, but not how many are actually stored. Sparse files take up less space on disk than the value you get by seeking to the end and calling ftell().
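The difference between the logical size and the space actually allocated can be sketched as follows. This assumes a filesystem that reports allocated blocks; on systems that do not (for example Windows), fstat returns -1 for 'blocks', which the sketch maps to null. Whether the extended file is really stored sparsely depends on the filesystem, so no particular allocated value is assumed:

```php
<?php
// Placeholder file; tempnam() creates it empty.
$path = tempnam(sys_get_temp_dir(), 'sparse');
$f = fopen($path, 'r+');

// Extend the file to 1 MiB without writing any data.
ftruncate($f, 1024 * 1024);

// Logical size as seen by seeking to the end and telling.
fseek($f, 0, SEEK_END);
$logicalSize = ftell($f);

// Allocated space: 'blocks' counts 512-byte blocks, or is -1 if unsupported.
$stat = fstat($f);
$allocated = $stat['blocks'] >= 0 ? $stat['blocks'] * 512 : null;

fclose($f);
unlink($path);

echo "logical size: $logicalSize bytes\n";
echo 'allocated: ' . ($allocated === null ? 'not reported' : "$allocated bytes") . "\n";
```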
I wrote a benchmark to add to this topic. To rule out arguments about some kind of PHP-side caching, the files are unique and are created in another process.
This is a new benchmark, written to leave no doubt.
The tests ignore the fopen and fclose time, since the question asks for the fastest way to calculate the size of an already opened file. Each test is run with 200 files.
The code that creates the files in a separate process is in the first comment on this post.
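That helper script is not reproduced here. As a purely hypothetical stand-in, the sketch below shows what a script with the same contract might look like. The contract is inferred only from how the benchmark calls it: the first argument is a base64-encoded JSON map of filename to size, the files are expected in the script's own directory, and the script prints ok on success. The function name createFilesFromPayload and the error strings are assumptions, not the author's code:

```php
<?php
// Hypothetical stand-in for createFileByNames.php (the original is not shown).
function createFilesFromPayload(string $payload, string $dir): string
{
    // Payload format assumed from the benchmark: base64(json({name: size})).
    $files = json_decode((string) base64_decode($payload, true), true);
    if (!is_array($files)) {
        return 'invalid payload';
    }
    foreach ($files as $name => $size) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        // Write real bytes (not ftruncate) so the file is not sparse.
        $written = file_put_contents($path, str_repeat('x', (int) $size));
        if ($written !== (int) $size) {
            return 'write failed';
        }
    }
    return 'ok';
}

// When invoked as: php createFileByNames.php <payload>
if (isset($argv[1])) {
    echo createFilesFromPayload($argv[1], __DIR__);
}
```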
<?php
// Minimal timer. new() records a timestamp in nanoseconds, stop() closes the
// current interval with a -1 sentinel, and dif() sums all recorded intervals.
class timeIt
{
    static private $times = [];
    static function new()
    {
        self::$times[] = hrtime(true);
    }
    static function stop()
    {
        self::$times[] = -1;
    }
    static function dif()
    {
        $dif = 0;
        $sum = 0;
        $i = count(self::$times) - 1;
        // Drop a trailing sentinel so the loop starts on a timestamp.
        if (self::$times[$i] === -1)
            unset(self::$times[$i--]);
        for ($i = count(self::$times) - 1; $i > 0; --$i) {
            // A sentinel marks the boundary between two measured intervals.
            if (self::$times[$i - 1] === -1) {
                $sum += $dif;
                $dif = 0;
                --$i;
                continue;
            }
            $dif += self::$times[$i] - self::$times[$i - 1];
        }
        return $sum + $dif;
    }
    static function printNReset()
    {
        echo "diffTime:" . self::dif() . "\n\n";
        self::reset();
    }
    static function reset()
    {
        self::$times = [];
    }
}
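// Size via fseek + ftell, restoring the pointer to where it was before.
function fseek_size_from_current($handle)
{
    $current = ftell($handle);
    fseek($handle, 0, SEEK_END);
    $size = ftell($handle);
    fseek($handle, $current);
    return $size;
}
// Size via fseek + ftell, rewinding to the start of the file afterwards.
function fseek_size_from_start($handle)
{
    fseek($handle, 0, SEEK_END);
    $size = ftell($handle);
    fseek($handle, 0);
    return $size;
}
// The current high-resolution time doubles as a unique file name.
function uniqueProcessId()
{
    return (string) hrtime(true);
}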
// Asks a separate PHP process to create $quantity files of $size bytes each
// and returns their full paths.
function getUniqueForeignProcessFiles($quantity, $size)
{
    $returnedFilenames = $filenames = [];
    while ($quantity--) {
        $filename = uniqueProcessId();
        $filenames[$filename] = $size;
        $returnedFilenames[] = __DIR__ . DIRECTORY_SEPARATOR . $filename;
    }
    $data = base64_encode(json_encode($filenames));
    $foreignCgi = __DIR__ . DIRECTORY_SEPARATOR . "createFileByNames.php";
    // Quote both arguments so paths containing spaces do not break the command.
    $command = "php " . escapeshellarg($foreignCgi) . " " . escapeshellarg($data);
    if (shell_exec($command) !== 'ok') {
        die("An error occurred");
    }
    return $returnedFilenames;
}
const FILESIZE = 20 * 1024 * 1024;

// fstat on the open handle
foreach (getUniqueForeignProcessFiles(200, FILESIZE) as $filename) {
    $handle = fopen($filename, 'r');
    timeIt::new();
    $size = fstat($handle)['size'];
    timeIt::new();
    timeIt::stop();
    fclose($handle);
    unlink($filename);
}
echo "**fstat**\n";
timeIt::printNReset();

// fseek + ftell, rewinding to offset 0
foreach (getUniqueForeignProcessFiles(200, FILESIZE) as $filename) {
    $handle = fopen($filename, 'r');
    timeIt::new();
    $size = fseek_size_from_start($handle);
    timeIt::new();
    timeIt::stop();
    fclose($handle);
    unlink($filename);
}
echo "**fseek with static/defined**\n";
timeIt::printNReset();

// fseek + ftell, restoring the previous offset
foreach (getUniqueForeignProcessFiles(200, FILESIZE) as $filename) {
    $handle = fopen($filename, 'r');
    timeIt::new();
    $size = fseek_size_from_current($handle);
    timeIt::new();
    timeIt::stop();
    fclose($handle);
    unlink($filename);
}
echo "**fseek with current offset**\n";
timeIt::printNReset();

// filesize by name while the file is open
foreach (getUniqueForeignProcessFiles(200, FILESIZE) as $filename) {
    $handle = fopen($filename, 'r');
    timeIt::new();
    $size = filesize($filename);
    timeIt::new();
    timeIt::stop();
    fclose($handle);
    unlink($filename);
}
echo "**filesize after fopen**\n";
timeIt::printNReset();

// filesize by name without opening the file
foreach (getUniqueForeignProcessFiles(200, FILESIZE) as $filename) {
    timeIt::new();
    $size = filesize($filename);
    timeIt::new();
    timeIt::stop();
    unlink($filename);
}
echo "**filesize no fopen**\n";
timeIt::printNReset();
Results with 20MB files (total over 200 files, times in nanoseconds):
fstat diffTime:2745700
fseek with static/defined diffTime:1267400
fseek with current offset diffTime:983500
filesize after fopen diffTime:283052500
filesize no fopen diffTime:4259203800
Results with 1MB files (total over 200 files, times in nanoseconds):
fstat diffTime:1490400
fseek with static/defined diffTime:706800
fseek with current offset diffTime:837900
filesize after fopen diffTime:22763300
filesize no fopen diffTime:216512800
An earlier version of this answer contained a different benchmark, which I removed to keep the answer cleaner. That benchmark used files created by the same process, and its assumption was:
ftell + fseek takes about half the time of fstat['size'], even when wrapped in another function and with each function called twice. fstat is slower because it returns a lot more information than just the file size, so if your code also needs that other information, for example to check for changes, just stick with fstat.
The current benchmark shows that assumption to be valid: **fseek + ftell** is roughly 1.8-2.8x faster than fstat for files of 1-20MB.
Feel free to run your own benchmarks and share your results.
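The case where fstat is still the better fit, needing more than the size from one call, can be sketched like this. The helper name snapshotHandle and the choice of fields are assumptions for illustration. Comparing two snapshots is one simple way to check whether a file changed:

```php
<?php
// Hypothetical helper: a single fstat() call yields size and mtime together.
function snapshotHandle($handle): array
{
    $stat = fstat($handle);
    return [
        'size'  => $stat['size'],   // size in bytes
        'mtime' => $stat['mtime'],  // last modification time (Unix timestamp)
    ];
}

// Placeholder file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'stat');
$handle = fopen($path, 'r+');

fwrite($handle, 'hello');
$before = snapshotHandle($handle);

fwrite($handle, ' world');
$after = snapshotHandle($handle);

// The file counts as changed if either the size or the mtime differs.
$changed = $before['size'] !== $after['size']
    || $before['mtime'] !== $after['mtime'];

fclose($handle);
unlink($path);

echo $changed ? "file changed\n" : "file unchanged\n";
```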
Thanks @Phihag, with your info on fseek along with ftell I am able to calculate the size in a much better way. See the code here: http://pastebin.com/7XCqu0WR
<?php
$fp = fopen("/tmp/temp.rock", "a+");
fwrite($fp, "This is the contents");
echo "Time taken to calculate the size by filesize function: ";
$t = microtime(true);
$ts1 = filesize("/tmp/temp.rock");
echo (microtime(true) - $t) . "\n";
echo "Time taken to calculate the size by fstat function: ";
$t = microtime(true);
// fstat() returns an array, so read the 'size' element directly.
$stat = fstat($fp);
$size = $stat["size"];
echo (microtime(true) - $t) . "\n";
echo "Time taken to calculate the size by fseek and ftell function: ";
$t = microtime(true);
fseek($fp, 0, SEEK_END);
$ts2 = ftell($fp);
echo (microtime(true) - $t) . "\n";
fclose($fp);
/**
OUTPUT:
Time taken to calculate the size by filesize function:2.4080276489258E-5
Time taken to calculate the size by fstat function:2.9802322387695E-5
Time taken to calculate the size by fseek and ftell function:1.2874603271484E-5
*/
?>