linux, Comma Separated Cells to Rows Preserve/Aggregate Column

There was a similar question here, but for Excel/VBA: Excel Macro - Comma Separated Cells to Rows Preserve/Aggregate Column. Because I have a big file (>300 MB), that is not an option, so I am struggling to get it to work in bash.

Based on this data

 1   Cat1                 a,b,c
 2   Cat2                 d
 3   Cat3                 e
 4   Cat4                 f,g

I would like to convert it to:

 1   Cat1                 a
 1   Cat1                 b
 1   Cat1                 c
 2   Cat2                 d
 3   Cat3                 e
 4   Cat4                 f
 4   Cat4                 g


cat > data << EOF
1   Cat1                 a,b,c
2   Cat2                 d
3   Cat3                 e
4   Cat4                 f,g
EOF

# Note: the loop below expects the columns in "data" to be tab-separated.
set -f                               # turn off globbing
IFS=,                                # prepare for comma-separated data
while IFS=$'\t' read -r C1 C2 C3; do # split columns at tabs; -r keeps backslashes literal
    for X in $C3; do                 # split C3 at commas (due to IFS)
        printf '%s\t%s\t%s\n' "$C1" "$C2" "$X"
    done
done < data


This looks like a job for awk or perl.

awk 'BEGIN { FS = OFS = "\t" }
     { n = split($3, a, ",");
       # iterate by index: the order of for (i in a) is unspecified in awk
       for (i = 1; i <= n; i++) {$3 = a[i]; print} }'

perl -F'\t' -alne 'foreach (split ",", $F[2]) {
                       $F[2] = $_; print join("\t", @F)
                   }'

Both programs are based on the same algorithm: split the third column at commas, and iterate over the components, printing the original line with each component in the third column in turn.
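
As a quick way to try the algorithm described above, here is a minimal sketch that wraps the awk approach in a shell function and feeds it a small sample. The function name split_rows and the array name parts are made up for illustration; the sketch assumes tab-separated input with the comma list in the third column, and it loops by index so that the pieces come out in their original order.

```shell
# split_rows is a hypothetical helper name, not part of the original answers.
# Assumption: input is tab-separated and column 3 holds the comma list.
split_rows() {
    awk 'BEGIN { FS = OFS = "\t" }
         { n = split($3, parts, ",")        # n = number of comma-separated pieces
           for (i = 1; i <= n; i++) {       # index loop keeps the input order
               $3 = parts[i]                # put one piece into column 3
               print                        # print the rebuilt line
           } }' "$@"
}

# Build a small tab-separated sample and pass it through the helper.
printf '1\tCat1\ta,b,c\n2\tCat2\td\n4\tCat4\tf,g\n' | split_rows
```

Called with a file name (split_rows data > out), it streams line by line, so it should not need to hold a large file in memory. Lines whose third column is empty produce no output in this sketch.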
