Split a text file in PHP
How can I split a large text file into separate files by character count using PHP? So a 10,000 character file split every 1000 characters would be split into 10 files. Further, can I split only after a full stop is found?
Thanks.
UPDATE 1: I like zombat's code. I removed some errors and came up with the following, but does anyone know how to split only after a full stop?
$i = 1;
$fp = fopen("test.txt", "r");
while(! feof($fp)) {
$contents = fread($fp,1000);
file_put_contents('new_file_'.$i.'.txt', $contents);
$i++;
}
UPDATE 2: I took zombat's suggestion and modified the code to the version below, and it seems to work:
$i = 1;
$fp = fopen("test.txt", "r");
while(! feof($fp)) {
$contents = fread($fp,20000);
$contents .= stream_get_line($fp,1000,".");
$contents .=".";
file_put_contents("Split/".$tname."/"."new_file_".$i.".txt", $contents);
$i++;
}
You should be able to accomplish this easily with a basic fread(). You can specify how many bytes you want to read, so it's trivial to read in an exact amount and output it to a new file.
Try something like this:
$i = 1;
$fp = fopen("test.txt",'r');
while(! feof($fp)) {
$contents = fread($fp,1000);
file_put_contents('new_file_'.$i.'.txt',$contents);
$i++;
}
EDIT
If you wish to stop after a certain length OR on a certain character, then you could use stream_get_line() instead of fread(). It's almost identical, except it allows you to specify any ending delimiter you wish. Note that it does not return the delimiter as part of the read.
$contents = stream_get_line($fp,1000,".");
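Because stream_get_line() drops the delimiter and does not say whether it was found, appending "." unconditionally can add a stray full stop at the end of the file. A minimal sketch of one way around that follows; it swaps stream_get_line() for a plain fgetc() loop so the full stop is kept as read. The function name splitAtFullStop() and its parameters are made up for illustration, and the output names follow the new_file_N.txt pattern from the question.

```php
<?php
/**
 * Split $source into files of about $chunkSize bytes each, extending every
 * chunk to the next full stop. Hypothetical helper, not part of PHP.
 *
 * @return int number of files written
 */
function splitAtFullStop(string $source, string $outDir, int $chunkSize = 1000): int
{
    $fp = fopen($source, 'r');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $source");
    }

    $count = 0;
    while (!feof($fp)) {
        $contents = fread($fp, $chunkSize);
        if ($contents === false || $contents === '') {
            break; // nothing left to read
        }

        // If the chunk does not already end on a full stop, keep reading
        // one byte at a time until we pass one or reach the end of the file.
        if (substr($contents, -1) !== '.') {
            while (($ch = fgetc($fp)) !== false) {
                $contents .= $ch;
                if ($ch === '.') {
                    break;
                }
            }
        }

        $count++;
        file_put_contents($outDir . '/new_file_' . $count . '.txt', $contents);
    }

    fclose($fp);
    return $count;
}

// Example call (assumes test.txt and the Split directory already exist):
// splitAtFullStop('test.txt', 'Split', 1000);
```

Sizes here are bytes, not characters, and a chunk with no full stop after it simply runs to the end of the file, so a cap on the extra read may be worth adding for text with very long sentences.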
There is a bug in the run function: the variable $split is not defined.
The easiest way is to read the contents of the file, split the content, then save the pieces to separate files. If your files are more than a few gigabytes, you're going to have a problem doing it in PHP due to integer size limitations.
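A minimal sketch of that read, split, save approach, assuming the whole file fits comfortably in memory. The function name splitInMemory() and the $prefix parameter are hypothetical; file_get_contents(), str_split() and file_put_contents() are standard PHP functions.

```php
<?php
/**
 * Read the whole file, cut it into $chunkSize-byte pieces and write each
 * piece to "{$prefix}N.txt". Hypothetical helper for illustration only.
 *
 * @return int number of files written
 */
function splitInMemory(string $source, string $prefix, int $chunkSize = 1000): int
{
    $text = file_get_contents($source);
    if ($text === false) {
        throw new RuntimeException("Cannot read $source");
    }
    if ($text === '') {
        return 0; // nothing to split; avoids writing an empty first file
    }

    // str_split() cuts by bytes; the last piece may be shorter.
    $parts = str_split($text, $chunkSize);
    foreach ($parts as $n => $part) {
        file_put_contents($prefix . ($n + 1) . '.txt', $part);
    }

    return count($parts);
}

// Example call (assumes test.txt exists in the current directory):
// splitInMemory('test.txt', 'new_file_', 1000);
```

With this, a 10,000-byte file and a chunk size of 1000 gives 10 files. For multibyte text, mb_str_split() (PHP 7.4+) counts characters rather than bytes.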
You could also write a class to do this for you.
<?php
/**
* filesplit class : Split big text files in multiple files
*
* @package
* @author Ben Yacoub Hatem <hatem@php.net>
* @copyright Copyright (c) 2004
* @version $Id$ - 29/05/2004 09:02:10 - filesplit.class.php
* @access public
**/
class filesplit{
/**
* Constructor
* @access protected
*/
function filesplit(){
}
/**
* File to split
* @access private
* @var string
**/
var $_source = 'logs.txt';
/**
*
* @access public
* @return string
**/
function Getsource(){
return $this->_source;
}
/**
*
* @access public
* @return void
**/
function Setsource($newValue){
$this->_source = $newValue;
}
/**
* how much lines per file
* @access private
* @var integer
**/
var $_lines = 1000;
/**
*
* @access public
* @return integer
**/
function Getlines(){
return $this->_lines;
}
/**
*
* @access public
* @return void
**/
function Setlines($newValue){
$this->_lines = $newValue;
}
/**
* Folder to create splitted files with trail slash at end
* @access private
* @var string
**/
var $_path = 'logs/';
/**
*
* @access public
* @return string
**/
function Getpath(){
return $this->_path;
}
/**
*
* @access public
* @return void
**/
function Setpath($newValue){
$this->_path = $newValue;
}
/**
* Configure the class
* @access public
* @return void
**/
function configure($source = "",$path = "",$lines = ""){
if ($source != "") {
$this->Setsource($source);
}
if ($path!="") {
$this->Setpath($path);
}
if ($lines!="") {
$this->Setlines($lines);
}
}
/**
*
* @access public
* @return void
**/
function run(){
$i=0;
$j=1;
$date = date("m-d-y");
unset($buffer);
$handle = @fopen ($this->Getsource(), "r");
while (!feof ($handle)) {
$buffer .= @fgets($handle, 4096);
$i++;
if ($i >= $split) {
$fname = $this->Getpath()."part.$date.$j.txt";
if (!$fhandle = @fopen($fname, 'w')) {
print "Cannot open file ($fname)";
exit;
}
if (!@fwrite($fhandle, $buffer)) {
print "Cannot write to file ($fname)";
exit;
}
fclose($fhandle);
$j++;
unset($buffer,$i);
}
}
fclose ($handle);
}
}
?>
Usage Example
<?php
/**
* Sample usage of the filesplit class
*
* @package filesplit
* @author Ben Yacoub Hatem <hatem@php.net>
* @copyright Copyright (c) 2004
* @version $Id$ - 29/05/2004 09:14:06 - usage.php
* @access public
**/
require_once("filesplit.class.php");
$s = new filesplit;
/*
$s->Setsource("logs.txt");
$s->Setpath("logs/");
$s->Setlines(100); //number of lines that each new file will have after the split.
*/
$s->configure("logs.txt", "logs/", 2000);
$s->run();
?>
Source http://www.weberdev.com/get_example-3894.html
I fixed the class, and it now works perfectly with .txt files.
<?php
/**
* filesplit class : Split big text files in multiple files
*
* @package
* @author Ben Yacoub Hatem <hatem@php.net>
* @copyright Copyright (c) 2004
* @version $Id$ - 29/05/2004 09:02:10 - filesplit.class.php
* @access public
**/
class filesplit{
/**
* Constructor
* @access protected
*/
function filesplit(){
}
/**
* File to split
* @access private
* @var string
**/
var $_source = 'logs.txt';
/**
*
* @access public
* @return string
**/
function Getsource(){
return $this->_source;
}
/**
*
* @access public
* @return void
**/
function Setsource($newValue){
$this->_source = $newValue;
}
/**
* how much lines per file
* @access private
* @var integer
**/
var $_lines = 1000;
/**
*
* @access public
* @return integer
**/
function Getlines(){
return $this->_lines;
}
/**
*
* @access public
* @return void
**/
function Setlines($newValue){
$this->_lines = $newValue;
}
/**
* Folder to create splitted files with trail slash at end
* @access private
* @var string
**/
var $_path = 'logs/';
/**
*
* @access public
* @return string
**/
function Getpath(){
return $this->_path;
}
/**
*
* @access public
* @return void
**/
function Setpath($newValue){
$this->_path = $newValue;
}
/**
* Configure the class
* @access public
* @return void
**/
function configure($source = "",$path = "",$lines = ""){
if ($source != "") {
$this->Setsource($source);
}
if ($path!="") {
$this->Setpath($path);
}
if ($lines!="") {
$this->Setlines($lines);
}
}
/**
*
* @access public
* @return void
**/
function run(){
$buffer = '';
$i=0;
$j=1;
$date = date("m-d-y");
$handle = @fopen ($this->Getsource(), "r");
while (!feof ($handle)) {
$buffer .= @fgets($handle, 4096);
$i++;
if ($i >= $this->getLines()) {
// set your filename pattern here.
$fname = $this->Getpath()."split_{$j}.txt";
if (!$fhandle = @fopen($fname, 'w')) {
print "Cannot open file ($fname)";
exit;
}
if (!@fwrite($fhandle, $buffer)) {
print "Cannot write to file ($fname)";
exit;
}
fclose($fhandle);
$j++;
$buffer = ''; $i = 0; // reset for the next part; unset() would leave both undefined on the next loop pass
}
}
if ( !empty($buffer) && !empty($i) ) {
$fname = $this->Getpath()."split_{$j}.txt";
if (!$fhandle = @fopen($fname, 'w')) {
print "Cannot open file ($fname)";
exit;
}
if (!@fwrite($fhandle, $buffer)) {
print "Cannot write to file ($fname)";
exit;
}
fclose($fhandle);
unset($buffer,$i);
}
fclose ($handle);
}
}
?>
Usage Example
<?php
require_once("filesplit.class.php");
$s = new filesplit;
$s->Setsource("logs.txt");
$s->Setpath("logs/");
$s->Setlines(100); //number of lines that each new file will have after the split.
//$s->configure("logs.txt", "logs/", 2000);
$s->run();
?>
In the run function, I made the following adjustments, to fix the "split is not defined" warning.
function run(){
$buffer='';
$i=0;
$j=1;
$date = date("m-d-y");
$handle = @fopen ($this->Getsource(), "r");
while (!feof ($handle)) {
$buffer .= @fgets($handle, 4096);
$i++;
if ($i >= $this->getLines()) { // $split was here, empty value..
// set your filename pattern here.
$fname = $this->Getpath()."dma_map_$date.$j.csv";
if (!$fhandle = @fopen($fname, 'w')) {
print "Cannot open file ($fname)";
exit;
}
if (!@fwrite($fhandle, $buffer)) {
print "Cannot write to file ($fname)";
exit;
}
fclose($fhandle);
$j++;
$buffer = ''; $i = 0; // reset for the next part; unset() would leave both undefined on the next loop pass
}
}
fclose ($handle);
}
I used this class to split a 500,000-line CSV file into 10 files, so phpMyAdmin can consume it without timeout issues. Worked like a charm.
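One thing to watch with the run() version just above: it writes a part only when the line counter reaches the limit, so if the total line count is not an exact multiple, the last few lines appear to be left unwritten. Below is a function-style sketch that streams lines straight into the current part and closes the last one even when it is short. The name splitByLines() and the part_N.txt pattern are made up for illustration.

```php
<?php
/**
 * Split $source into files of at most $linesPerFile lines each, written as
 * "$outDir/part_N.txt". Hypothetical helper, not part of the class above.
 *
 * @return int number of files written
 */
function splitByLines(string $source, string $outDir, int $linesPerFile = 1000): int
{
    $in = fopen($source, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $source");
    }

    $part  = 0;    // index of the current output file
    $lines = 0;    // lines written to the current output file
    $out   = null; // handle of the current output file, if any

    while (($line = fgets($in)) !== false) {
        if ($out === null) {
            $part++;
            $out = fopen($outDir . '/part_' . $part . '.txt', 'w');
            if ($out === false) {
                fclose($in);
                throw new RuntimeException("Cannot open part $part for writing");
            }
        }

        fwrite($out, $line);

        if (++$lines >= $linesPerFile) {
            fclose($out); // this part is full; the next line starts a new one
            $out   = null;
            $lines = 0;
        }
    }

    if ($out !== null) {
        fclose($out); // close the final, partially filled part
    }
    fclose($in);

    return $part;
}

// Example call (assumes logs.txt and the logs/ directory exist):
// splitByLines('logs.txt', 'logs', 100);
```

Writing each line as it is read also keeps memory use flat, since no buffer of a whole part is built up first.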