开发者

Merge CSVs Appending Headers

I have a number of CSV files with hundreds of columns and about 50,000 rows (when opened in Excel). The column headers are almost identical however some column headers may vary from one CSV file to the next, as an example below:-

CSV1

Name Surname DOB

John Smith 31/01/1989

CSV2

Name Age Surname Address DOB

Paul 29 Jones 123 Smith St 30/12/1981

CSV3

Name Surname Address Telephone

Mick Jones 123 Paul St 0123456

Is there any way I can merge all of these into one big CSV file, appending the headers so that in the one main CSV, I would have the headers "Name, Surname, DOB, Age, Address, Telephone" for example and then the respective entries from each CSV falling within their respective column heading. The reason I want to do this is to then populate the information into a big MySql / Sql Server DB table and so it appears easier t开发者_JS百科o do it all initially as one big CSV before importing.

Any suggestions?


Import them into three temporary tables and then merge them into one table using joins on name surname and DOB. Otherwise the data will get all mixed up.


Manual method (bear with me, just giving an idea of the algorithm):

  1. Generate a final list of columns that includes all possible headers in all CSV's.
  2. Open each spreadsheet, one at a time. For each spreadsheet:
  3. Click and drag the headers and insert missing columns so they all match your list from #1
  4. Save the file, and repeat back to #2
  5. Combine all the spreadsheets into a single spreadsheet.
  6. Import.

If you are going to automate this, you will take roughly the same steps. You need a way to determine what all columns are possible, then put the CSV's in the right format and combine them, either in spreadsheet/CSV format, or import them as a bunch of temp tables, and INSERT...SELECT to re-arrange the columns where they belong.

What languages/technologies do you have available to you for the automation? .NET? Java? PHP? How often will this process occur, and how automated does it have to be? Is it a daily process, or weekly, or only going to happen once? How many spreadsheets roughly?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜