How to align 3 files based on first column value

2023-01-29 12:55

I have 3 text files: c.dat, n.dat, and h.dat. Their contents are similar and follow this format:

c.dat    n.dat    h.dat
1 0.ccc  3 1.nnn  1 2.hhh
2 0.ccc  4 1.nnn  2 2.hhh
4 0.ccc  5 1.nnn  5 2.hhh

Desired output:

1 0.ccc Inf 2.hhh
2 0.ccc Inf 2.hhh
3 Inf 1.nnn Inf
4 0.ccc 1.nnn Inf
5 Inf 1.nnn 2.hhh
6 Inf Inf Inf
7 ....

Each file has ~100 rows, but they don't always start from 1, and the indices aren't always consecutive.

I need to align the 3 files by the first column, so that when a file has no row for a given index, its column is filled in with something like NA, or NaN, or Inf... anything.

Thanks!

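awk '
{
        if (FNR == 1) { f++ }              # a new input file has started
        a[$1, f] = $2                      # value for index $1 in file number f
        if ($1 + 0 > max) { max = $1 + 0 } # highest index seen, compared numerically
}

END {
        for (j = 1; j <= max; j++) {
          printf("%d", j)
          for (i = 1; i <= f; i++) {
            # test membership so that a value of 0 is not mistaken for a missing one
            if ((j, i) in a) { printf("\t%s", a[j, i]) }
            else             { printf("\tInf") }
          }
          printf("\n")
        }
}' ./c.dat ./n.dat ./h.dat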

Output

$ ./awk.dat
1       0.ccc   Inf     2.hhh
2       0.ccc   Inf     2.hhh
3       Inf     1.nnn   Inf
4       0.ccc   1.nnn   Inf
5       Inf     1.nnn   2.hhh

Kind of a dirty solution: I created a Perl script that loops from 1 to 100, runs a grep against each of the 3 files, pipes the result to awk, and prints Inf if that returns blank.

Still looking for a more elegant solution.
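The loop-and-grep idea described above can be sketched in plain shell. This is not the Perl script from the answer (which was not posted), only a minimal illustration of the same steps; the function names lookup and align_by_grep are hypothetical, and the upper bound is passed in rather than fixed at 100.

```shell
# Hypothetical helper: print the value stored under index $1 in file $2,
# or Inf when the file has no row for that index.
lookup() {
    # grep keeps rows whose first field is exactly the index;
    # awk prints the second field of the first such row, or Inf if none arrived.
    grep -E "^$1[[:space:]]" "$2" |
        awk '{ print $2; found = 1; exit } END { if (!found) print "Inf" }'
}

# Hypothetical driver: usage: align_by_grep MAX FILE...
align_by_grep() {
    max=$1; shift
    i=1
    while [ "$i" -le "$max" ]; do
        line=$i
        for f in "$@"; do
            line="$line $(lookup "$i" "$f")"   # one grep+awk per index per file
        done
        echo "$line"
        i=$((i + 1))
    done
}
```

It would be called as align_by_grep 100 c.dat n.dat h.dat. Every index costs one grep and one awk process per file, which is tolerable for ~100 rows but is the reason the approach feels dirty compared with a single awk pass.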

Pure Bash.

maxindex=0

while read -r idx val ; do                      # build array from c.dat
    c[idx]=$val
    (( idx > maxindex )) && maxindex=$idx
done < 'c.dat'

while read -r idx val ; do                      # build array from n.dat
    n[idx]=$val
    (( idx > maxindex )) && maxindex=$idx
done < 'n.dat'

while read -r idx val ; do                      # build array from h.dat
    h[idx]=$val
    (( idx > maxindex )) && maxindex=$idx
done < 'h.dat'

for (( idx=1; idx<=maxindex; idx++ )); do       # unset entries fall back to INF
    echo "$idx ${c[idx]:-INF} ${n[idx]:-INF} ${h[idx]:-INF}"
done

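man paste should give you the answer: paste -d ' ' file1 file2 file3 (this joins lines by position, so the columns only line up if every file has a row for every index).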
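Since the sample files skip indices, one way to make paste usable here is to give each file exactly one line per index before pasting. A minimal sketch under that assumption follows; pad and align_by_paste are hypothetical names, and the maximum index is supplied by the caller.

```shell
# Hypothetical helper: usage: pad MAX FILE
# Emit one value per index 1..MAX from FILE, printing Inf where the index is absent.
pad() {
    awk -v max="$1" '
        { v[$1] = $2 }
        END { for (k = 1; k <= max; k++) print ((k in v) ? v[k] : "Inf") }
    ' "$2"
}

# Hypothetical driver: usage: align_by_paste MAX FILE...
align_by_paste() {
    max=$1; shift
    tmp=$(mktemp -d) || return 1
    # Column 0 holds the indices 1..MAX themselves.
    awk -v max="$max" 'BEGIN { for (k = 1; k <= max; k++) print k }' > "$tmp/0"
    n=0
    for f in "$@"; do
        n=$((n + 1))
        pad "$max" "$f" > "$tmp/$n"
    done
    # Rebuild the argument list as the padded files, in their original order.
    set -- "$tmp/0"
    i=1
    while [ "$i" -le "$n" ]; do
        set -- "$@" "$tmp/$i"
        i=$((i + 1))
    done
    paste -d ' ' "$@"      # every padded file now has MAX lines, so rows line up
    rm -rf "$tmp"
}
```

It would be called as align_by_paste 5 c.dat n.dat h.dat. The padding step does the real alignment; paste only glues the already aligned columns together.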
