
.csv file issue with split lines in Unix, not Windows

I've got a problem I'm sure someone somewhere has encountered before. We had been FTPing a customer's .csv files down to our laptops, then SQL*Loading them into our Oracle DBs, but the network made it a slow process. I set up a shell script to LFTP those files down to the Solaris DB box and sqlload them there, which is much faster. There were some character issues, but after altering NLS_LANG I now see the same characters in the DB as when we go the Windows route. However, 2 of these 7 files have issues: of 500,000 records, a few thousand are written to a .bad file because their lines are split. Curiously, this doesn't happen in the Windows environment. I'm not sure whether this is an FTP vs. LFTP thing, or a charset conversion that occurs when the files come into Unix (MSWIN -> WE8ISO). I thought maybe there is a variable that can be set to make LFTP behave more like FTP in this regard. Any ideas?

My band-aid alternative, if I can't figure out the real problem above, is to reload the 2 .bad files after moving each split line back up onto the end of the previous line. Here's an example of a split record in the .bad file. The records always seem to split in this address field, oftentimes where there should have been a dot or a comma. See '215 St' below, where the line breaks:

"","","1-1000035","","","1-1000035","SIS STRATEGIC INFORMATION SYSTEMS","SIS STRATEGIC INFORMATION SYSTEMS","","RESELLER","Active","N","Y","","","","","","$"
,"","","","","","","","80","","","","","","","","","","","","","(403) 281-4252","(780) 701-4050","North America","","","11432 215 St
Summerbarn Rd","","","Edmonton","AB","T2S3Y5","Canada","","","","","","1-1000035","","","","","","","","","","","","",
"","","","","",,,,"",,0,"UPSERT",10,"Y","Inserted By Widget",2009-10-23 15:08:03.387000000,2009-10-23 15:08:03.387000000,"",,"",,"","","1-1000035"^M
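The band-aid described above (moving a split line back onto the end of the previous one) could be sketched roughly as follows. This is only an illustration under an assumption: that a break inside a record always falls inside a double-quoted field, so a physical line with an odd number of double quotes is incomplete. The function name `rejoin_split_lines` and the `joiner` parameter are hypothetical, and the single space used to join the pieces is a guess at what the missing dot or comma should become.

```python
def rejoin_split_lines(lines, joiner=" "):
    """Merge physical lines until the double quotes are balanced.

    Assumption: a record is only ever split inside a quoted field,
    so an odd running quote count means the record continues on the
    next physical line. Escaped quotes ("") count twice and therefore
    do not disturb the parity.
    """
    records = []   # completed logical records
    parts = []     # physical lines of the record being assembled
    quotes = 0     # running count of double quotes in `parts`
    for line in lines:
        text = line.rstrip("\r\n")   # drop LF and any trailing CR (^M)
        parts.append(text)
        quotes += text.count('"')
        if quotes % 2 == 0:          # quotes balanced: record is complete
            records.append(joiner.join(parts))
            parts, quotes = [], 0
    if parts:
        # Unbalanced leftover at end of input; kept so it can be inspected.
        records.append(joiner.join(parts))
    return records


if __name__ == "__main__":
    sample = [
        '"a","11432 215 St\n',
        'Summerbarn Rd","Edmonton"\r\n',
        '"b","c"\r\n',
    ]
    for record in rejoin_split_lines(sample):
        print(record)
```

Breaks that fall between fields rather than inside a quoted one would not be caught by this check, so the repaired file is worth spot-checking before reloading.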


Could it be the difference between Unix and Windows line endings (\n vs. \r\n)?
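One way to test this guess is to count the kinds of line endings in one of the problem files. The sketch below is only a diagnostic, and the function name `count_line_endings` is made up for illustration. If a file turns out to have mostly \r\n endings plus a few thousand bare \n characters, that would be consistent with line feeds embedded in the address field: a loader that breaks records on \r\n would keep such a record whole, while one that breaks on \n would split it. The ^M at the end of the sample record hints at this, but it is not proof.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LFs, and bare CRs in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "bare_lf": data.count(b"\n") - crlf,   # LF not preceded by CR
        "bare_cr": data.count(b"\r") - crlf,   # CR not followed by LF
    }


if __name__ == "__main__":
    # Tiny stand-in for a real file: one record with an embedded LF.
    sample = b'"a","11432 215 St\nSummerbarn Rd"\r\n"b"\r\n'
    print(count_line_endings(sample))
```

For a real file, read it in binary mode (for example `open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.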
