PHP file writing optimization
EDIT: Optimization results at end of this question!
Hi, I have the following code, which first scans the files in a specific folder, then reads every file line by line and, after numerous "if...else if" checks, writes the modified file to another folder under the same name it had when opened.
The problem is that writing a file line by line seems to be awfully slow. The default 60-second limit is only enough for 25 or so files. File sizes vary from 10k to 350k.
Is there any way to optimize the code to make it run faster? Is it better to read line by line, put every line into an array, and then write that whole array to a new text file (vs. line-by-line reading/writing)? If so, how is that done in practice?
Thanks in advance. The code follows:
<?php
function scandir_recursive($path) {
...
...
}
$fileselection = scandir_recursive('HH_new');
foreach ($fileselection as $extractedArray) {
$tableName = basename($extractedArray); // Table name
$fileLines=file($extractedArray);
foreach ($fileLines as $line) {
if(preg_match('/\(all-in\)/i' , $line)) {
$line = stristr($line, ' (all-in)', true) .', and is all in';
$allin = ', and is all in';
}
else {
$allin = '';
}
if(preg_match('/posts the small blind of \$[\d\.]+/i' , $line)) {
$player = stristr($line, ' posts ', true);
$betValue = substr(stristr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue;
}
else if(preg_match('/posts the big blind of \$[\d\.]+/i' , $line)) {
$player = stristr($line, ' posts ', true);
$betValue = substr(stristr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue;
}
else if(preg_match('/\S+ raises /i' , $line)) {
$player = stristr($line, ' raises ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
}
else if(preg_match('/\S+ bets /i' , $line)) {
$player = stristr($line, ' bets ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
}
else if(preg_match('/\S+ calls /i' , $line)) {
$player = stristr($line, ' calls ', true);
$betValue = substr(stristr($line, '$'), 1);
$callValue = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount called
$bettingMatrix[$player]['betTotal'] = $betValue;
$line = stristr($line, '$', true)."\$".$callValue.$allin;
$allin = '';
}
else if(preg_match('/(\*\*\* (Flop|Turn|River))|(Full Tilt Poker)/i' , $line)) {
unset($bettingMatrix); //zero $betValue
}
else if(preg_match('/\*\*\* FLOP \*\*\*/i' , $line)) {
$flop = substr(stristr($line, '['), 0, -2);
$line = '*** FLOP *** '. $flop;
}
else if(preg_match('/\*\*\* TURN \*\*\*/i' , $line)) {
$turn = substr(stristr($line, '['), 0, -2);
$line = '*** TURN *** '. $flop .' '. $turn;
}
else if(preg_match('/\*\*\* RIVER \*\*\*/i' , $line)) {
$river = substr(stristr($line, '['), 0, -2);
$line = '*** RIVER *** '. substr($flop, 0, -1) .' '. substr($turn, 1) .' '. $river;
}
else {
}
$ourFileHandle = fopen("HH_newest/".$tableName.".txt", 'a') or die("can't open file");
fwrite($ourFileHandle, $line);
fclose($ourFileHandle);
}
}
?>
EDIT: Here are some VERY interesting results after rewriting the code based on the tips everyone here gave me.
Test set: 60 text files, 5.8MB total
After all optimizations (changed preg -> strpos/strstr and opened $handle before the loop): 4 sec.
As above BUT changed strpos/strstr -> stripos/stristr: 8 sec.
As above BUT changed stripos/stristr -> preg: 12 sec.
As above BUT moved fopen back inside the loop: only 45 of 60 files done when the 180 sec run limit was hit
Here's the complete script:
$fileselection = scandir_recursive('HH_new');
foreach ($fileselection as $extractedArray) {
$tableName = basename($extractedArray); // Table name
$handle = fopen($extractedArray, 'r');
$ourFileHandle = fopen("HH_newest/".$tableName.".txt", 'a') or die("can't open file");
while (($line = fgets($handle)) !== false) { // strict check: a final line of just "0" is falsy
if (FALSE !== strpos($line, '(all-in)')) {
$line = strstr($line, ' (all-in)', true) .", and is all in\r\n";
$allin = ', and is all in';
} else {
$allin = '';
}
if (FALSE !== strpos($line, ' posts the small blind of $')) {
$player = strstr($line, ' posts ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue;
}
else if (FALSE !== strpos($line, ' posts the big blind of $')) {
$player = strstr($line, ' posts ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue;
}
else if (FALSE !== strpos($line, ' posts $')) {
$player = strstr($line, ' posts ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] += $betValue;
}
else if (FALSE !== strpos($line, ' raises to $')) {
$player = strstr($line, ' raises ', true);
$betValue = substr(strstr($line, '$'), 1);
$betMade = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount raised by
$bettingMatrix[$player]['betTotal'] = $betValue; //$line contains total bet this hand (shortcut)
}
else if (FALSE !== strpos($line, ' bets $')) {
$player = strstr($line, ' bets ', true);
$betValue = substr(strstr($line, '$'), 1);
$betMade = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount raised by
$bettingMatrix[$player]['betTotal'] = $betValue; //$line contains total bet this hand (shortcut)
}
else if (FALSE !== strpos($line, ' calls $')) {
$player = strstr($line, ' calls ', true);
$betValue = substr(strstr($line, '$'), 1);
$callValue = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount called
$bettingMatrix[$player]['betTotal'] = $betValue;
$line = strstr($line, '$', true)."\$".$callValue.$allin. "\r\n";
$allin = '';
}
else if (FALSE !== strpos($line, '*** FLOP ***')) {
$flop = substr(strstr($line, '['), 0, -2);
unset($bettingMatrix); //zero $betValue
}
else if (FALSE !== strpos($line, '*** TURN ***')) {
$turn = substr(strstr($line, '['), 0, -2);
$line = '*** TURN *** '.$flop.' '.$turn."\r\n";
unset($bettingMatrix); //zero $betValue
}
else if (FALSE !== strpos($line, '*** RIVER ***')) {
$river = substr(strstr($line, '['), 0, -2);
$line = '*** RIVER *** '. substr($flop, 0, -1) .' '. substr($turn, 1) .' '. $river."\r\n";
unset($bettingMatrix); //zero $betValue
}
else if (FALSE !== strpos($line, 'Full Tilt Poker')) {
unset($bettingMatrix); //zero $betValue
}
else {
}
fwrite($ourFileHandle, $line);
}
fclose($handle);
fclose($ourFileHandle);
}
I think this is because you're opening and closing the output file within the loop. Try moving fopen() before the inner foreach (the one over the lines) and fclose() after it.
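A minimal sketch of that change, not the asker's actual script: the helper name write_lines_once and the temp-file path are made up for illustration. The output file is opened once, every line is written through the same handle, and the handle is closed after the loop.

```php
<?php
// Hypothetical helper (not from the question): appends all $lines to $outPath
// using a single fopen()/fclose() pair instead of one pair per line.
function write_lines_once(array $lines, string $outPath): int
{
    $out = fopen($outPath, 'a');
    if ($out === false) {
        throw new RuntimeException("can't open file: $outPath");
    }
    $written = 0;
    foreach ($lines as $line) {
        fwrite($out, $line); // only the write happens per line
        $written++;
    }
    fclose($out);
    return $written;
}

// Usage with a throwaway temp file (assumed path, for demonstration only).
$path = tempnam(sys_get_temp_dir(), 'hh_');
$count = write_lines_once(["Seat 1 bets \$2\n", "Seat 2 calls \$2\n"], $path);
```

In the question's code the equivalent would be one fopen() per input file, right after $tableName is known.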
I doubt the file writing is the performance issue here. You're running ten regular expressions on everything!
Using string methods like strpos to find the sub-strings might speed things up.
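If you want to check this on your own data rather than take it on faith, here is a rough timing sketch. The helper time_matcher, the sample line and the round count are all assumptions for illustration; absolute numbers will differ by machine and PHP version, so no expected output is shown.

```php
<?php
// Hypothetical helper: runs $matcher over every line $rounds times
// and returns the elapsed wall-clock time in seconds.
function time_matcher(callable $matcher, array $lines, int $rounds): float
{
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        foreach ($lines as $line) {
            $matcher($line);
        }
    }
    return microtime(true) - $start;
}

// Made-up sample data: 1000 copies of one hand-history-like line.
$lines = array_fill(0, 1000, "Player1 calls \$4, and is (all-in)\n");

$timings = [
    'strpos'     => time_matcher(function ($l) { return strpos($l, '(all-in)') !== false; }, $lines, 50),
    'stripos'    => time_matcher(function ($l) { return stripos($l, '(all-in)') !== false; }, $lines, 50),
    'preg_match' => time_matcher(function ($l) { return preg_match('/\(all-in\)/i', $l) === 1; }, $lines, 50),
];

foreach ($timings as $name => $seconds) {
    printf("%-10s %.4f s\n", $name, $seconds);
}
```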
Doing away with the regular expressions would give you the biggest performance increase. If you can change them to strpos() or similar (stripos() for case-insensitive matching), you should notice a speed increase.
The test needs to be '!== false', since the found string may be at position 0. For example, your first test case could be:
if(stripos($line, '(all-in)') !== false) {
//generate output
}
You may also find that using fgets() instead of reading the whole file at once gives you some performance increase (though that's more of a memory issue). And as mentioned by others, only write to the file inside the loop; don't open and close it there.
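A small sketch of both points together, assuming a made-up helper count_matching_lines and a temp file as input: lines are streamed with fgets(), and both the loop condition and the strpos() check compare strictly against false. The first sample line has the needle at position 0, which a loose check would miss.

```php
<?php
// Hypothetical helper: streams $path line by line and counts the lines
// that contain $needle (case-sensitive).
function count_matching_lines(string $path, string $needle): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("can't open file: $path");
    }
    $count = 0;
    // Strict comparison: a last line consisting of just "0" is falsy
    // and would end a plain `while ($line = fgets(...))` loop early.
    while (($line = fgets($handle)) !== false) {
        // strpos() returns 0 for a match at the start, so compare with !== false.
        if (strpos($line, $needle) !== false) {
            $count++;
        }
    }
    fclose($handle);
    return $count;
}

// Usage with made-up sample data.
$path = tempnam(sys_get_temp_dir(), 'hh_');
file_put_contents($path, "(all-in) Hero raises\nVillain folds\nHero bets (all-in)\n");
$matches = count_matching_lines($path, '(all-in)');
```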
Here's your code with a few tiny changes that should help quite a bit
- Switched from file() to fgets(). This will load only a single line at a time into memory instead of every line from the file.
- Changed your calls to preg_match() to stripos() where applicable. Should be a tiny bit faster.
- Moved the opening/closing of $ourFileHandle into the outer loop. This will significantly reduce the number of stat calls to the filesystem and should speed it up greatly.
There are probably a lot of other optimizations that can be made in that monstrous if..else, but I'll leave those up to another SOer (or you).
$fileselection = scandir_recursive('HH_new');
foreach ($fileselection as $extractedArray)
{
$tableName = basename( $extractedArray ); // Table name
$handle = fopen( $extractedArray, 'r' );
$ourFileHandle = fopen("HH_newest/".$tableName.".txt", 'a') or die("can't open file");
while ( ( $line = fgets( $handle ) ) !== false ) // strict check: a final line of just "0" is falsy
{
if ( false !== stripos( $line, '(all-in)' ) )
{
$line = stristr($line, ' (all-in)', true) .', and is all in';
$allin = ', and is all in';
} else {
$allin = '';
}
if ( preg_match('/posts the small blind of \$[\d\.]+/i' , $line ) )
{
$player = stristr($line, ' posts ', true);
$betValue = substr(stristr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue;
}
else if(preg_match('/posts the big blind of \$[\d\.]+/i' , $line)) {
$player = stristr($line, ' posts ', true);
$betValue = substr(stristr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue;
}
else if(preg_match('/\S+ raises /i' , $line)) {
$player = stristr($line, ' raises ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
}
else if(preg_match('/\S+ bets /i' , $line)) {
$player = stristr($line, ' bets ', true);
$betValue = substr(strstr($line, '$'), 1);
$bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
}
else if(preg_match('/\S+ calls /i' , $line)) {
$player = stristr($line, ' calls ', true);
$betValue = substr(stristr($line, '$'), 1);
$callValue = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount called
$bettingMatrix[$player]['betTotal'] = $betValue;
$line = stristr($line, '$', true)."\$".$callValue.$allin;
$allin = '';
}
else if(preg_match('/(\*\*\* (Flop|Turn|River))|(Full Tilt Poker)/i' , $line)) {
unset($bettingMatrix); //zero $betValue
}
else if ( FALSE !== stripos( $line, '*** FLOP ***' ) )
{
$flop = substr(stristr($line, '['), 0, -2);
$line = '*** FLOP *** '. $flop;
}
else if ( FALSE !== stripos( $line, '*** TURN ***' ) )
{
$turn = substr(stristr($line, '['), 0, -2);
$line = '*** TURN *** '. $flop .' '. $turn;
}
else if ( FALSE !== stripos( $line, '*** RIVER ***' ) )
{
$river = substr(stristr($line, '['), 0, -2);
$line = '*** RIVER *** '. substr($flop, 0, -1) .' '. substr($turn, 1) .' '. $river;
}
else {
}
fwrite($ourFileHandle, $line);
}
fclose( $handle );
fclose( $ourFileHandle );
}
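One possible direction for trimming the if..else chain, offered only as a sketch: the bets/raises/calls branches all pull out a player name and a dollar amount in the same way, so that part could live in one function. parse_action and $betTotals are hypothetical names, the sample lines are made up, and this has not been measured to be faster; it mainly removes duplication.

```php
<?php
// Hypothetical helper: for a line like "Hero bets $4" and the verb "bets",
// returns [player, amount]; returns null when the verb or '$' is missing.
function parse_action(string $line, string $verb): ?array
{
    $verbPos = strpos($line, ' ' . $verb . ' ');
    $dollarPos = strpos($line, '$');
    if ($verbPos === false || $dollarPos === false) {
        return null;
    }
    $player = substr($line, 0, $verbPos);
    // The float cast stops at the first non-numeric character (newline, comma, ...).
    $amount = (float) substr($line, $dollarPos + 1);
    return [$player, $amount];
}

// Usage with made-up sample lines.
$betTotals = [];
foreach (["Hero bets \$4\n", "Villain raises to \$10\n"] as $line) {
    foreach (['bets', 'raises'] as $verb) {
        $parsed = parse_action($line, $verb);
        if ($parsed !== null) {
            [$player, $amount] = $parsed;
            $betTotals[$player] = $amount;
            break; // first matching verb wins, like the else-if chain
        }
    }
}
```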