.csv file issue with split lines in Unix, not Windows
Got a problem I'm sure someone somewhere has encountered before. We'd been FTPing a customer's .csv files down to our laptops, then SQL*Loading them into our Oracle DBs, but the network made it a slow process. So I set up a shell script to LFTP those files down to the Solaris DB box and sqlload them there - much faster. There were some character issues, but after altering NLS_LANG I now see the same characters in the DB as when we go the Windows route.

2 of these 7 files have issues: of 500,000 records, a few thousand are written to a .bad file because lines are split. Curiously, this doesn't happen in the Windows environment. I'm not sure if this is an FTP vs. LFTP thing, or a charset translation that happens on the way into Unix (MSWIN -> WE8ISO). I thought maybe there's a set variable that could make LFTP behave more like FTP in this regard... any ideas there?
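One difference I want to rule out first: I understand the stock Windows ftp client transfers in ASCII mode by default, while lftp defaults to binary, so the copy that lands on the Solaris box may still carry DOS line endings (note the ^M at the end of the sample record below). Something along these lines should force an ASCII transfer from within the script - the host, login and paths here are made up:

lftp -u csvuser,csvpass ftp.customer.example <<'EOF'
# windows-1252 / iso-8859-1 roughly match the MSWIN -> WE8ISO guess above
set ftp:charset windows-1252
set file:charset iso-8859-1
cd /outbound
# -a forces an ASCII-mode transfer; lftp's default is binary
get -a accounts.csv -o /export/home/loader/accounts.csv
bye
EOF

If ASCII mode alone cures the splits, that would point at line endings rather than the charset translation.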
My band-aid alternative, if I can't figure out the real problem above, is to reload the 2 .bad files after splicing each split line back onto the end of the previous line (rough sketch of that below, after the sample). Here's an example of a split record in the .bad file. They always seem to split at this address field, often where there should have been a dot or a comma - see how the line breaks right after '215 St':
"","","1-1000035","","","1-1000035","SIS STRATEGIC INFORMATION SYSTEMS","SIS STRATEGIC INFORMATION SYSTEMS","","RESELLER","Active","N","Y","","","","","","$"
,"","","","","","","","80","","","","","","","","","","","","","(403) 281-4252","(780) 701-4050","North America","","","11432 215 St
Summerbarn Rd","","","Edmonton","AB","T2S3Y5","Canada","","","","","","1-1000035","","","","","",开发者_JS百科"","","","","","","",
"","","","","",,,,"",,0,"UPSERT",10,"Y","Inserted By Widget",2009-10-23 15:08:03.387000000,2009-10-23 15:08:03.387000000,"",,"",,"","","1-1000035"^M
Could it be the difference between Unix and Windows line endings (\n vs. \r\n)?
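If it is, a quick way to confirm on the Solaris box, and to strip the carriage returns before sqlldr runs (the file name is just an example):

head -1 accounts.csv | od -c                      # a DOS-terminated record will end in \r \n
tr -d '\015' < accounts.csv > accounts_unix.csv   # \015 = CR; dos2unix-style cleanup

Just be aware that if the quoted address field legitimately contains a CR+LF pair, stripping the CRs still leaves the embedded LF, so the record could still split there.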