(sed/awk) How to convert a field-delimited (like a csv) file into a txt with fixed-sized tab-delimited columns?
I have something like a csv file in which the field delimiter is "@".
ID@Name@Surname@Age@Profession@Address
1254343123@John@Smith@24@Engineer@Washington 23@Alexander@Kristofferson-Brown@Economic Advisor@Kent ...
I want to convert it into something more human-readable as in:
1254343123 John Smith 24 Engineer Washington 23 Alexander Kristofferson-Brown开发者_开发问答 35 Economic Advisor Kent
...where each column stars at a specific character.
I have tried something that takes advantage of the TAB-size value and adds several TABs to the field:
sed -e "{/@[^@]\{32,\}@/s/@\([^@]*\)@/\t\1\t/g};{/@[^@]\{24,31\}@/s/@\([^@]*\)@/\t\1\t/g};{/@[^@]\{16,23\}@/s/@\([^@]*\)@/\t\1\t\t/g};{/@[^@]\{8,15\}@/s/@\([^@]*\)@/\t\1\t\t/g};{/@[^@]\{2,7\}@/s/@\([^@]*\)@/\t\1\t\t\t/g}"
...which does not work in all cases.
Could somebody give me a hint as how to proceed?
PS: My intention is to use mainly sed (a one-liner) or if sed just doesn't cut it, awk is OK.
BSD, Mac OS X and Linux have the column
command for this:
column -t -s@
It produces spaces though, not tabs (and it should, because tabs are obsolete). Output:
1254343123 John Smith 24 Engineer Washington
23 Alexander Kristofferson-Brown 35 Economic Advisor Kent
awk -F@ '{for(i=1;i<=NF;i++){printf "%-20s", $i};printf "\n"}' input.csv
Input
$ cat input.csv
1254343123@John@Smith@24@Engineer@Washington
23@Alexander@Kristofferson-Brown@35@Economic Advisor@Kent
Output
$ awk -F@ '{for(i=1;i<=NF;i++){printf "%-20s", $i};printf "\n"}' input.csv
1254343123 John Smith 24 Engineer Washington
23 Alexander Kristofferson-Brown 35 Economic Advisor Kent
If you want to make the field width (20 in the code above) a shell variable that can be passed in you do something like this:
#!/bin/bash
fldwth=20
awk -v fw=$fldwth -F@ '{for(i=1;i<=NF;i++){printf "%-*s", fw,$i};printf "\n"}' input.csv
My solution for converting the excel default csv (comma delimited, text enclosed in double quotes) is the following awk script:
#!/bin/nawk -f
# Q&D to transform csv (with commas imbedded in quotes) to pipe (|)
# Usage: cma2pipe.awk <in.csv> > <out.csv>
# Note: Assumes that <in.csv> contains no ~ or |
{#MAIN
s=$0;c=0;f=0; #reset varibles for a line
while (c<length(s)){ #loop thru line
c++; #char counter
a=substr(s,c,1); #get current character
if (a=="\"")f++; #flag quote
if (f%2==1&&a==",")#if inside pair of quotes, look for ","
s= repl("~",c,s);#replace commas with ~
}#end while c
gsub(",","|",s); #replace remaining , with |
gsub("~",",",s); #put commas back
gsub("\"","",s); #get rid of quotes
print s
}#end MAIN
function repl(r,n,t){ #replace single character in string
s1=substr(t,1,n-1); #get first part of string
s2=substr(t,n+1); #get last part of string
return(s1 r s2); #return changed string
}#end repl()
awk -F@ '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6}' file.csv > readable.txt
Try this one using delimiter option for awk and printing columns by putting \t
between each.
精彩评论