PHP fgetcsv() not reading all lines
I have a PHP script that reads a remote CSV file and adds products to a database based on its contents. At present there are about 2800 lines (products), but the script keeps stopping at line 1388.
The code I used is as follows:
while(($data = fgetcsv($fopen, 0, ",")) !== false):
//stuff is done here...
endwhile;
I have set the php memory limit to 64M and even tried 128M. I also set the max_execution_time to 60mins. I have also tried altering the code as follows:
while(($data = fgetcsv($fopen, 1000, ",", '\r')) !== false):
//stuff is done here...
endwhile;
That DID result in more lines being parsed, BUT the data was then incorrect, i.e. image columns were becoming description columns etc. I assume that has to do with adding \r as my line ending. I tried \n, no luck. Lastly, I also added the auto_detect_line_endings as true in the ini.
Can anyone suggest reasons as to why my data is being cut short?
Regards, Simon
EDIT
I have noticed something interesting. I have a MySQL insert on each line that is looped over in the above code. Now, the last record in my database is the FIRST row in the CSV file, does this mean the file is being parsed from the last line up??
These seem to be the rows at or near the break:
W-3066, I Love Love Cheap And Chic, Moschino, 3.4 oz,EDT Spray,Women,,"Introduced by the design house of Moschino, I love love has a blend of grapefruit, orange, lemon, red currant, tea rose, cinnamon leaves, musk, cedar and tonka wood. It is recommended for daytime wear.",http://www.perfume-worldwide.com/products/Women/Final/W-3066large.jpg,0,0,0,8011003991457
W-3070, Adidas Floral Dream, Adidas, 1.7 oz,EDT Spray,Women,,"Introduced in 2008, the notes are bergamot, lily, rose, tonka bean and vanilla.",http://www.perfume-worldwide.com/products/Women/Final/W-3070large.jpg,0,0,0,3412244310024
W-3071, Adidas Fruity Rhythm, Adidas, 1.7 oz,EDT Spray,Women,,"Introduced in 2008, the notes are black currant, raspberry, cyclamen, freesia and musk.",http://www.perfume-worldwide.com/products/Women/Final/W-3071large.jpg,0,0,0,3412244510004
SOLUTION
As it turns out, it worked out a lot better for me to copy the file to my server, and work off the copy. The steps I followed are as follows:
- I read the contents of the remote file using file_get_contents()
- I then used the iconv() function to re-encode the data to UTF-8
- I made a temp file using the fopen(), fwrite() and fclose() functions; the contents of the file were the encoded data above
- I set the permissions of the file to 0750 using the chmod() function
- I then applied the fgetcsv() function to my temp file
- Did all that needed to be done
- Deleted the temp file once done, using the unlink() function
That did the trick. So, I suspect half the issue was actually the remote server timing out, and the other half encoding issues.
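For anyone landing here later, the steps above could be sketched roughly as follows. This is only an illustration, not the exact script: the function name readCsvViaLocalCopy, the $source argument and the ISO-8859-1 default for the source encoding are all assumptions, and tempnam() is swapped in to pick the temp file name.

```php
<?php
// Hypothetical helper: the name, parameters and default encoding are assumptions.
function readCsvViaLocalCopy(string $source, string $sourceEncoding = 'ISO-8859-1'): array
{
    // 1. Fetch the whole remote file in one request.
    $raw = file_get_contents($source);
    if ($raw === false) {
        throw new RuntimeException("Could not read $source");
    }

    // 2. Re-encode the data to UTF-8.
    $utf8 = iconv($sourceEncoding, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('Re-encoding to UTF-8 failed');
    }

    // 3. Write the converted data to a temp file and set its permissions.
    $tmp = tempnam(sys_get_temp_dir(), 'csv');
    $out = fopen($tmp, 'w');
    fwrite($out, $utf8);
    fclose($out);
    chmod($tmp, 0750);

    // 4. Parse the local copy instead of the remote stream.
    $rows = [];
    $in = fopen($tmp, 'r');
    while (($data = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data; // the real script would do its database insert here
    }
    fclose($in);

    // 5. Delete the temp file once done.
    unlink($tmp);

    return $rows;
}
```

Collecting the rows in an array is only for demonstration; with a large file it is probably better to process each row inside the loop.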
Thank you to everyone for all the nudges in the right direction
Firstly, I have some questions for you:
- What is on lines 1388 and 1389?
- Are any errors being output?
- When you reach the final line, do you get ($data[0] === null)?
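A rough way to answer those questions is to count what fgetcsv() returns and check feof() once the loop ends, since fgetcsv() returns false both at the end of the file and on a read error. The sketch below is only illustrative; the function name inspectCsvRead and the keys of the returned array are made up for the example.

```php
<?php
// Hypothetical diagnostic helper: name and return shape are assumptions.
function inspectCsvRead(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }

    $rows = 0;
    $blank = 0;
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows++;
        // A blank line comes back as an array holding a single null field.
        if ($data === [null]) {
            $blank++;
        }
    }

    // If the loop ended but we are not at end-of-file, the read stopped early.
    $reachedEof = feof($handle);
    fclose($handle);

    return ['rows' => $rows, 'blank' => $blank, 'reachedEof' => $reachedEof];
}
```

If reachedEof comes back false, the loop stopped because of a read problem (for example a remote stream timing out) rather than because the file ended.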
The memory limit is probably not what is causing this: fgetcsv reads a single line per iteration, so there is only ever one line's worth of data in memory at a time.
If, within your loop, you keep placing data into an array or concatenating strings together, memory use will grow with every row and could hit the limit, but you would have to show more of your code for us to tell.
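To illustrate the point about memory, here is a minimal sketch that hands rows to the caller one at a time with a generator instead of building up an array. The function name csvRows is an assumption for the example, not something from your script.

```php
<?php
// Hypothetical helper: yields one parsed row at a time so memory use stays flat.
function csvRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }

    try {
        while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            yield $data; // only the current row is held in memory
        }
    } finally {
        // Runs when iteration finishes or the generator is destroyed.
        fclose($handle);
    }
}
```

You would then write something like foreach (csvRows($path) as $row) { ... } and do the insert for each row inside the loop, without keeping earlier rows around.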
A CSV file has to be fairly well structured for fgetcsv to be able to parse it correctly. Some rules to remember when using CSV files:
- The first line should hold the column names (this is a convention, not something fgetcsv itself requires)
- All other lines are the data lines:
  - Each element should be separated by a ,
  - If an element contains a space, a comma, '\n', '\r' or '\r\n', it should be wrapped in double quotes
A valid CSV file should look like this:
id, firstname, lastname, age, profile_description
0, Robert, Pitt, 22, "this string has spaces, and has a comma"
You should validate that the structure is correct. If it is not, fix it until the parser is able to read the data correctly; you can then cleanly write the data into a new CSV file, taking care of all the little structural mistakes.
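One simple check, sketched below, is to compare the number of fields in every record with the number of columns in the header and report the records that differ. The function name findMalformedRows is made up for this example, and it assumes the first line of the file is the header.

```php
<?php
// Hypothetical helper: maps record number => field count for every record
// whose field count differs from the header's. Assumes line 1 is the header.
function findMalformedRows(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }

    $header = fgetcsv($handle, 0, ',', '"', '\\');
    $expected = is_array($header) ? count($header) : 0;

    $bad = [];
    $record = 1; // counts parsed records, which can span several physical lines
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $record++;
        if (count($data) !== $expected) {
            $bad[$record] = count($data);
        }
    }
    fclose($handle);

    return $bad;
}
```

An empty result means every record has as many fields as the header; anything else points you at the records worth opening in a text editor.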
Is the file correctly formatted? Have you tried opening it in a CSV reader that lets you specify the delimiter and line endings? Judging by this:
That DID result in more lines being parsed, BUT the data was then incorrect, i.e. image columns were becoming description columns etc
I would assume that the data may be corrupted (i.e. some description contains a comma, a line break, etc.). That happens when the data is generated dynamically and not formatted correctly.
Open the file in a text editor as well (e.g. Notepad++) and see how it looks.