String replace in a large file with PHP

I am trying to do a string replace for an entire file in PHP. My file is over 100MB, so I have to go line by line and cannot use file_get_contents(). Is there a good solution to this?


If you aren't required to use PHP, I would highly recommend doing this kind of task from the command line. It's by far the best tool for the job, and much easier to use.

In any case, the sed (Stream Editor) command is what you are looking for:

sed 's/search/replace/g' oldfilename > newfilename

If you need case-insensitivity (the i flag is a GNU sed extension):

sed 's/search/replace/gi' oldfilename > newfilename

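If you need this to run dynamically from within PHP, you can use passthru(). It prints the command's output directly and reports the exit status through its second argument, so its return value carries no output. Note that $search is still interpreted as a regular expression, and neither $search nor $replace may contain an unescaped /:

passthru('sed ' . escapeshellarg("s/$search/$replace/g") . ' ' . escapeshellarg($oldfilename) . ' > ' . escapeshellarg($newfilename), $status);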


Here you go:

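function replace_file($path, $string, $replace)
{
    set_time_limit(0);

    if (is_file($path) !== true)
    {
        return false;
    }

    $file = fopen($path, 'r');
    // Create the temporary file next to the original so rename() stays on one filesystem.
    $temp = tempnam(dirname($path), 'tmp');
    $out = ($temp !== false) ? fopen($temp, 'w') : false;

    if (is_resource($file) !== true || is_resource($out) !== true)
    {
        return false;
    }

    // Replace line by line; a match that spans a line break is not found.
    while (($line = fgets($file)) !== false)
    {
        fwrite($out, str_replace($string, $replace, $line));
    }

    fclose($file);
    fclose($out);

    return rename($temp, $path);
}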

Call it like this:

replace_file('/path/to/fruits.txt', 'apples', 'oranges');


If you can't use sed directly from the command line because the task is dynamic and you need to call it from PHP, getting the syntax right is difficult: the following characters must be escaped, and the rules differ between the search string and the replacement string:

' / $ . * [ ] \ ^ &

The following function searches for a string in a file and replaces it without interpreting the search string as a regular expression. So if you wanted, you could search for the string ".*" and replace it with "$".

/**
 * str_replace_with_sed($search, $replace, $file_in, $file_out=null)
 * 
 * Search for the fixed string `$search` inside the file `$file_in`
 * and replace it with `$replace`. The replace occurs in-place unless
 * `$file_out` is defined: in that case the resulting file is written
 * into `$file_out`
 *
 * Return: sed return status (0 means success, any other integer failure)
 */
function str_replace_with_sed($search, $replace, $file_in, $file_out=null)
{
    $cmd_opts = '';
    if (! $file_out) 
    {
        // replace inline in $file_in
        $cmd_opts .= ' -i';
    }

    // We will use Basic Regular Expressions (BRE). This means that in the 
    // search pattern we must escape
    // $.*[\]^
    //
    // The replacement string must have these characters escaped
    // \ & 
    //
    // In both cases we must escape the separator character too ( usually / )
    // 
    // Since we run the command through the shell, we must shell-escape the
    // string too. It is delimited with single quotes ('), and any single quote
    // inside it is written as '\'' (close string, write a single quote, reopen string)

    // Replace all the backslashes as first thing. If we do it in the following
    // batch replace we would end up with bogus results
    $search_pattern = str_replace('\\', '\\\\', $search);

    $search_pattern = str_replace(array('$', '.', '*', '[', ']', '^'),
                                  array('\\$', '\\.', '\\*', '\\[', '\\]', '\\^'),
                                  $search_pattern);

    $replace_string = str_replace(array('\\', '&'),
                                  array('\\\\', '\\&'),
                                  $replace);

    // Escape the regexp separator, then let escapeshellarg() quote the script
    // and the file names for the shell.
    $script = sprintf('s/%s/%s/g',
                      str_replace('/', '\\/', $search_pattern),
                      str_replace('/', '\\/', $replace_string));

    $cmd = 'sed' . $cmd_opts . ' -e ' . escapeshellarg($script) . ' ' . escapeshellarg($file_in);
    if ($file_out)
    {
        $cmd .= ' > ' . escapeshellarg($file_out);
    }

    passthru($cmd, $status);

    return $status;
}


I would have used sed in a more explicit way, so that you are less dependent on your system.

passthru("sed -e " . escapeshellarg("s/$search/$replace/g") . " " . escapeshellarg($oldfilename) . " > " . escapeshellarg($newfilename), $status);


Read it a chunk at a time: process the chunk, discard the variable, then read the next chunk.

$fh = fopen("bigfile.txt", "r");
$length = 300; // bytes to read per chunk

// fread() advances the file position itself, so no fseek() is needed
while (!feof($fh))
{
     $contents = fread($fh, $length);
     // .. do stuff ...
}

fclose($fh);

You are going to want to make sure that is correct (I haven't tested it). See the filesystem functions in the PHP documentation.

The tricky part is going to be writing back to the file. The first idea that comes to mind is to do the string replace, write the new content to another file, and then at the end delete the old file and replace it with the new one.
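
One caveat with fixed-size chunks is that the search string can straddle a chunk boundary, in which case a plain per-chunk str_replace() would miss it. Below is a minimal sketch of one way to handle that, assuming a literal (non-regex) search string: the last strlen($search) - 1 bytes of each chunk are carried over to the next read. The function name replace_in_chunks() and the default chunk size are made up for illustration, and the result is written to a temporary file that then replaces the original.

```php
<?php
// Hypothetical helper: streaming literal replace with a carry-over between chunks.
function replace_in_chunks($path, $search, $replace, $chunkSize = 65536)
{
    $n = strlen($search);
    if ($n === 0 || !is_file($path)) {
        return false;
    }

    $in   = fopen($path, 'rb');
    $temp = tempnam(dirname($path), 'tmp'); // same directory, so rename() stays on one filesystem
    $out  = ($temp !== false) ? fopen($temp, 'wb') : false;
    if ($in === false || $out === false) {
        return false;
    }

    $buffer = '';
    while (!feof($in)) {
        $buffer .= (string) fread($in, $chunkSize);
        $atEnd   = feof($in);

        // Replace every complete match currently in the buffer, left to right.
        $result = '';
        $pos    = 0;
        while (($hit = strpos($buffer, $search, $pos)) !== false) {
            $result .= substr($buffer, $pos, $hit - $pos) . $replace;
            $pos     = $hit + $n;
        }

        // The unmatched tail may end with the start of a match that continues
        // in the next chunk, so hold back up to $n - 1 bytes unless at end of file.
        $rest    = (string) substr($buffer, $pos);
        $keep    = $atEnd ? 0 : min(strlen($rest), $n - 1);
        $result .= (string) substr($rest, 0, strlen($rest) - $keep);
        $buffer  = $keep > 0 ? substr($rest, -$keep) : '';

        fwrite($out, $result);
    }

    fclose($in);
    fclose($out);

    return rename($temp, $path);
}
```

It would be called the same way as a line-based version, for example replace_in_chunks('/path/to/fruits.txt', 'apples', 'oranges'). Memory use stays bounded by roughly the chunk size plus the length of the search string, regardless of how long the lines in the file are.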


Something like this?

$infile="file";
$outfile="temp";
$f = fopen($infile,"r");
$o = fopen($outfile,"w");
$pattern="pattern";
$replace="replace";
if($f && $o){
     while( ($line = fgets($f)) !== false ){
        // str_replace() leaves lines without a match unchanged, so no strpos() check is needed
        fwrite($o,str_replace($pattern,$replace,$line));
     }
     fclose($f);
     fclose($o);
     rename($outfile,$infile);
}