Re-encode an entire CSV file before parsing - using simple PHP?
I have a script that reads the contents of a remote CSV file, iterates over the lines, and adds the data items to a database. This file has on average about 3000 lines, and therefore 3000 products.
To make a few things clear:
- I DO NOT have control over the data in the CSV file beforehand
- I DO NOT have access to / control over the manner in which this CSV file is created
- The CSV file is dynamically generated once a day, from data in a MySQL database
The problem:
My script only iterates over about 1300 lines then stops, no errors, nothing. All text is enclosed in double quotes, and generally the CSV file seems correctly formatted. The weird thing is this: If I download the CSV file, open it in Notepad++ and change the encoding to UTF-8 WITHOUT BOM, upload that to a test server and run my script on THAT file, I get the FULL 3000 items and all is fine.
So, I am assuming that the people generating this file need to insert the data as UTF-8? Because I cannot control that process, I would like to know if there is a fairly simple manner in which I can apply the UTF-8 WITHOUT BOM encoding to that file, or at least read the file contents into a variable and re-encode that?
Many thanks
You can use iconv to change the encoding directly from PHP before you process your file.
Edit: PHP's iconv function can be used to re-encode the data once it has been read into a string. If you want to re-encode the file itself before importing it, you would have to use the Linux iconv command (assuming a LAMP server), for example via exec.
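As a rough sketch of the in-PHP route: the helper below takes the raw file contents, strips a UTF-8 BOM if one is present, and otherwise converts the text to UTF-8 with iconv. The function name `to_utf8_without_bom` is made up for illustration, and the default source encoding of ISO-8859-1 is only a guess, since the question does not say what encoding the feed actually uses.

```php
<?php
// Hypothetical helper: return $raw as UTF-8 text without a byte order mark.
// ASSUMPTION: $sourceEncoding defaults to ISO-8859-1; the real encoding of
// the remote file is unknown and must be checked against the actual feed.
function to_utf8_without_bom(string $raw, string $sourceEncoding = 'ISO-8859-1'): string
{
    // A leading EF BB BF means the data is already UTF-8 with a BOM:
    // drop the three BOM bytes and leave the rest untouched.
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        return substr($raw, 3);
    }

    // Otherwise convert from the assumed source encoding to UTF-8.
    $converted = iconv($sourceEncoding, 'UTF-8', $raw);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $sourceEncoding to UTF-8");
    }

    return $converted;
}
```

Usage would be along the lines of `$csv = to_utf8_without_bom(file_get_contents($url));`, where `$url` stands for the location of the remote file. If the feed turns out to be in a different encoding (Windows-1252, for example), pass that name as the second argument.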
It sounds like you are trying to do this directly against the other server. Why don't you fetch the entire file and save it to your own server, do any manipulation on that copy, and then do your processing?
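A minimal sketch of that approach, assuming the source is readable with `file_get_contents` and is encoded as ISO-8859-1 (both are assumptions, and the function name `fetch_and_parse_csv` is hypothetical): it downloads the whole file, re-encodes it to UTF-8, saves a local copy, and only then parses that copy line by line.

```php
<?php
// Hypothetical helper: fetch a CSV, store a UTF-8 copy locally, then parse it.
// ASSUMPTION: the source encoding is ISO-8859-1; adjust to the real feed.
function fetch_and_parse_csv(string $source, string $sourceEncoding = 'ISO-8859-1'): array
{
    // 1. Pull the entire file in one go instead of streaming it remotely.
    $raw = file_get_contents($source);
    if ($raw === false) {
        throw new RuntimeException("Could not read $source");
    }

    // 2. Re-encode to UTF-8 before any parsing happens.
    $utf8 = iconv($sourceEncoding, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Could not convert from $sourceEncoding to UTF-8");
    }

    // 3. Save the converted copy on the local server.
    $localPath = tempnam(sys_get_temp_dir(), 'csv');
    file_put_contents($localPath, $utf8);

    // 4. Parse the local copy.
    $rows = [];
    $handle = fopen($localPath, 'r');
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // fgetcsv returns [null] for a blank line; skip those.
        if ($row === [null]) {
            continue;
        }
        $rows[] = $row;
    }
    fclose($handle);
    unlink($localPath);

    return $rows;
}
```

Each element of the returned array is one CSV line as an array of fields, ready to be inserted into the database. Comparing `count($rows)` with the expected number of products is a quick way to confirm that nothing was dropped.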