CSV file filtering
I have a .csv file with a header row like so:
headerA,headerB,headerC
bill,jones,p
mike,smith,f
sally,silly,p
I'd like to filter out any records with the f value in the headerC column.
Can I do that with sed or awk?
If the name of the third column in the header is not just f, then:
sed '/,f$/d' FILE
will do. It deletes every input line that ends with ,f.
If it is, I'd go with:
sed -n -e '1p;/,[^f]$/p' FILE
This prints nothing by default (-n), always prints the first line (1p), and prints every line that ends with a comma followed by a single character other than f. Note: this will not work if the third column contains more than one character.
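For third-column values longer than one character, one possible variant is to exempt the first line from the delete command instead of printing selectively. This is only a sketch: the file name sample_f.csv and its contents are made up for illustration, and it assumes the third column is the last one.

```shell
# Hypothetical sample: the third header is literally "f" and the third
# column holds values longer than one character.
printf '%s\n' 'a,b,f' 'bill,jones,pass' 'mike,smith,f' 'sally,silly,pf' > sample_f.csv

# 1!{...} applies the braces to every line except the first, so the header
# survives; inside the braces, lines ending in ",f" are deleted.
sed '1!{/,f$/d;}' sample_f.csv
```

Here the header and the rows ending in pass and pf are kept, and only the mike row is dropped.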
And an awk one:
awk -F, 'NR == 1 ; NR > 1 && $3 != "f"' FILE
This always prints the first line: NR == 1 is true there, so the default action (print $0) runs. The second condition checks that we are past the first line and that the third field is not f, and then the default action prints the line again.
HTH
Well, if you know that headerC is always in the third column, the following sed command would work (-r is the GNU sed flag for extended regular expressions):
sed -r '/^[^,]+,[^,]+,f(,|$)/d' < file.csv > filefiltered.csv
And the following awk command does the same:
awk 'BEGIN {FS=","} {if($3 != "f") print}' file.csv
Does that work for you? If you don't know that headerC is always in the same column, it gets a little trickier.
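One way to handle an unknown column position is to let awk read the header row first and look up the field number by name. The following is a minimal sketch, assuming the header name is unique and no field contains a quoted comma; the helper name drop_rows and the file sample.csv are illustrative only.

```shell
# Hypothetical sample data, same rows as in the question.
printf '%s\n' 'headerA,headerB,headerC' 'bill,jones,p' 'mike,smith,f' 'sally,silly,p' > sample.csv

# drop_rows COLUMN VALUE FILE (illustrative helper, not a standard tool):
# print the header, then every row whose COLUMN is not exactly VALUE.
drop_rows() {
    awk -F, -v col="$1" -v val="$2" '
        NR == 1 {
            # Header row: remember the index of the named column.
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            print
            next
        }
        # Keep the row unless the named column equals the value.
        # If the column was never found (c is 0), every row is kept.
        c == 0 || $c != val
    ' "$3"
}

drop_rows headerC f sample.csv
```

Because the column is located by name, the same call keeps working if headerC later moves to a different position.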
The question is a bit unclear; is this what you are asking for?
$ awk -F, '{ if($3 == "f")print}' input
mike,smith,f
With the header included, and formatted using column:
$ awk -F, '{ if (NR == 1)print}{if($3 == "f")print}' input | column -t -s,
headerA  headerB  headerC
mike     smith    f
grep works too; look at this example:
grep ",.*,.*f" << EOF
headerA,headerB,headerC
bill,josef,p
mike,smith,f
sally,silly,p
EOF
outputs:
mike,smith,f
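If the goal is instead to drop the f records, as the question asks, grep's -v option inverts the match. A small sketch, assuming the third column is the last one on each line and that sample.csv is a hypothetical copy of the data above:

```shell
# Same sample rows as in the question, written to a hypothetical file.
printf '%s\n' 'headerA,headerB,headerC' 'bill,jones,p' 'mike,smith,f' 'sally,silly,p' > sample.csv

# -v inverts the match: print every line that does NOT end in ",f".
grep -v ',f$' sample.csv
```

The header is kept here only because headerC does not end in ,f.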
No need for sed or awk; this can be done with simpler commands like cut and grep piped together, like this:
cut -d"," -f 3 file.csv | grep -i f
I am assuming the delimiter is a comma and column C is the third one; if it is not, change the values above appropriately. I have used grep with the -i option so that it ignores case. If you want to match only lowercase f or only uppercase F, remove the -i option and change the pattern accordingly. Note that this prints only the third-column values, not the whole records.