Deleting Column Selection in AWK
I'd like to delete a selection of columns from a list of CSV files. The awk call is in-line as it is used in a shell script. I don't know beforehand how many columns the files have, only that the columns that I want gone are included in each file of the list.
Let's say I want the first 4 columns removed. Blanking out the column values will leave the separators, which I also want gone.
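To make the separator problem concrete, here is a minimal sketch of what blanking does. The function name `blank_first_four` and the sample row are made up for illustration and are not part of the original script.

```shell
# Hypothetical helper: blanks fields 1-4 but leaves their separators behind.
blank_first_four() {
  awk 'BEGIN { FS = OFS = "," } { $1 = $2 = $3 = $4 = ""; print }'
}

printf 'a,b,c,d,e,f\n' | blank_first_four   # prints ",,,,e,f"
```

Assigning to a field makes awk rebuild the record with OFS between every field, empty or not, which is why the four leading commas survive.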
I thought the following would work: create an array of column numbers to drop, and recreate the corresponding row without those columns.
The value of length(row) below is as expected, but the final loop still iterates over the original column count, not the actual length(row) value.
head $f | awk 'BEGIN{FS=",";split("1,2,3,4",dropers,",")}{split($0,row,FS);for(i in dropers) delete row[i]; print NF "," length(row) "<<<";out=""; print NF "," length(row) ">>>";for(i=1;i<=length(row);i++){print row[i] "lulu"; out = out "," row[i]}; sub(/[ \t]*$/,"",out);print out}' > $g
Here's the output for two files: six columns go in and two are left after I delete columns 1 through 4, yet the loop iterates over all six columns rather than the expected two. Thank you for any advice.
Aust.
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000009lulu
461474lulu
,,,,,0000009,461474
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000010lulu
94942lulu
,,,,,0000010,94942
Edit (Belisarius)
Formatted code follows:
BEGIN {
    FS = ","
    split("1,2,3,4", dropers, ",")
}
{
    split($0, row, FS)
    for (i in dropers) delete row[i]
    print NF "," length(row) "<<<"
    out = ""
    print NF "," length(row) ">>>"
    for (i = 1; i <= length(row); i++) {
        print row[i] "lulu"
        out = out "," row[i]
    }
    sub(/[ \t]*$/, "", out)
    print out
}
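A likely explanation for the six iterations: in awk, merely reading an array element that does not exist creates it with an empty value. Each `row[i]` lookup for a deleted index therefore puts that element back, and `length(row)` grows while the loop runs. The sketch below reproduces this in isolation. The names `show_growth` and `count` are hypothetical, and `count` is used as a stand-in for `length(row)` on the assumption that some older awk versions do not accept an array argument to `length`.

```shell
# show_growth is a hypothetical name used only for this demonstration.
show_growth() {
  awk '
    # Stand-in for length(row): counts the elements that currently exist.
    function count(arr,   k, n) { n = 0; for (k in arr) n++; return n }
    BEGIN {
      split("a,b,c,d,e,f", row, ",")
      for (i = 1; i <= 4; i++) delete row[i]
      print "before loop: " count(row)
      iterations = 0
      for (i = 1; i <= count(row); i++) {
        x = row[i]        # reading a missing element creates it
        iterations++
      }
      print "iterations: " iterations
      print "after loop: " count(row)
    }'
}

show_growth
# before loop: 2
# iterations: 6
# after loop: 6
```

The four re-created elements are empty, which matches the four bare "lulu" lines and the run of leading commas in the output shown in the question.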
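Looping with for (i in row) visits only the elements that still exist:
BEGIN {
    FS = ","
    split("1,2,3,4", dropers, ",")
}
{
    split($0, row, FS)
    for (i in dropers) delete row[i]
    print NF "," length(row) "<<<"
    out = ""
    print NF "," length(row) ">>>"
    # for-in never touches missing indices, so nothing is re-created;
    # note that awk does not guarantee the traversal order
    for (i in row) {
        print row[i] "lulu"
        out = out "," row[i]
    }
    out = substr(out, 2)        # drop the leading comma
    sub(/[ \t]*$/, "", out)     # trim trailing blanks
    print out
}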
with input:
a,b,c,d,e,f,g
prints:
7,3<<<
7,3>>>
elulu
flulu
glulu
e,f,g
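Two caveats may be worth keeping in mind. The order in which `for (i in row)` visits indices is not specified by awk, so the columns are not guaranteed to come out in their original order on every implementation. Also, `for (i in dropers) delete row[i]` uses the indices of `dropers` rather than its values, which only happens to work because the list is "1,2,3,4". A possible variant that avoids both is sketched below. The function name `drop_cols` and the `skip` and `list` arrays are hypothetical, and the sketch assumes plain comma-separated data with no quoted fields containing commas.

```shell
# drop_cols is a hypothetical helper; $1 is a comma-separated list of column numbers.
drop_cols() {
  awk -v drop="$1" '
    BEGIN {
      FS = ","
      n = split(drop, list, ",")
      for (k = 1; k <= n; k++) skip[list[k]] = 1   # use the values, not the indices
    }
    {
      out = ""; sep = ""
      for (i = 1; i <= NF; i++) {
        if (i in skip) continue    # "in" tests membership without creating skip[i]
        out = out sep $i           # fields are appended in their original order
        sep = ","
      }
      print out
    }'
}

printf 'a,b,c,d,e,f,g\n' | drop_cols "1,2,3,4"   # prints "e,f,g"
```

In the script from the question this could be called as `drop_cols "1,2,3,4" < "$f" > "$g"` for each file in the list.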