Parsing a CSV file in UNIX, but also handling data within " "
I am trying to parse a CSV file in UNIX using AWK or shell scripting, but I am facing an issue. If a field is enclosed in double quotes ("), I want to replace each comma (,) inside it with a blank space and remove the quotes. Such fields might occur multiple times in a single record.
For example, consider this input:
20,Manchester,"Barclays,League",xyz,123,"95,some,data",
The output should be as follows:
20,Manchester,Barclays League,xyz,123,95 some data,
How can this be done with basic UNIX commands or scripting? Please help me with this.
<input.csv python -c \
'import csv,sys;f=csv.reader(sys.stdin);print '\
'("\n".join(",".join(entry.replace(",", " ") for entry in line) for line in f))'
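For readability, the same idea can be written out as a small script. This is only a sketch, assuming Python 3; the function name flatten_quoted_commas and the sample string are made up for illustration and are not part of the one-liner above. It relies on the csv module to strip the quotes and split the fields, then replaces any comma left inside a field with a space.

```python
import csv
import io


def flatten_quoted_commas(text):
    """Return CSV text with commas inside quoted fields replaced by spaces.

    Hypothetical helper name; csv.reader removes the surrounding quotes,
    so any comma still present in a field came from inside a quoted value.
    """
    rows = csv.reader(io.StringIO(text))
    return "\n".join(
        ",".join(field.replace(",", " ") for field in row) for row in rows
    )


# Sample record taken from the question.
sample = '20,Manchester,"Barclays,League",xyz,123,"95,some,data",\n'
print(flatten_quoted_commas(sample))
```

To run it on a real file, pass sys.stdin.read() (after importing sys) instead of the sample string and redirect input.csv into the script. Note that the fields are joined without re-quoting, so a field that contains a double quote or a newline would not round-trip as valid CSV.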
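Here's how you do it using sed in the shell. The -i.orig option edits file.csv in place and keeps a backup as file.csv.orig; written without a space, it is accepted by both GNU and BSD sed. The loop strips the quotes from a quoted field once it has no commas left, and otherwise replaces one comma inside the first quoted field with a space, repeating until nothing changes:
sed -i.orig -e ':a' -e 's/^\([^"]*\)"\([^,"]*\)"\(.*\)$/\1\2\3/g' \
-e 's/^\([^"]*\)"\([^,"]*\),\([^"]*\)"\(.*\)$/\1"\2 \3"\4/;ta' file.csv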