AWK/BASH: How to remove duplicate rows from file with known field range?

2023-01-20 13:44

I was wondering if there is a way to use bash/awk to remove duplicate rows based on a known field range. For example:

Easy Going                  USA:22 May 1926
Easy Going Gordon               USA:6 August 1925   
Easy Life                   USA:20 May 1944
Easy Listening                  USA:14 January 2002 
Easy Listening                  USA:10 October 2002 
Easy Listening                  USA:27 January 2004 
Easy Living                     USA:7 July 1937 
Easy Living                     USA:16 July 1937
Easy Living                     USA:4 September 2009

I would like to remove duplicate movie titles. The movie title always spans $1 through $(NF-3). Ideally I would like to keep the first occurrence (the earliest date), but if that's not possible then it doesn't matter which one is kept.

Thanks,

Tomek

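#!/bin/bash

# Requires gawk: mktime() is a gawk extension.
awk 'BEGIN{
   # Map month names to zero-padded month numbers.
   m=split("January|February|March|April|May|June|July|August|September|October|November|December",d,"|")
   for(o=1;o<=m;o++){
      months[d[o]]=sprintf("%02d",o)
   }
}
{
   sub(/.*:/,"",$(NF-2))   # strip the country prefix, leaving only the day
   t=mktime($(NF)" "months[$(NF-1)]" "$(NF-2)" 0 0 0")
   time[t]=$(NF-2) FS $(NF-1) FS $(NF)   # printable date for this timestamp
   $(NF-2)=$(NF-1)=$(NF)=""
   gsub(/ +$/,"")   # $0 now holds the title only
   # Keep the earliest timestamp seen for each title.
   if (!($0 in array)){array[$0]=99999999999999}
   if ( t <= array[$0] ){ array[$0]=t }
}
END{
  for(i in array){ print "->",i,time[array[i]]  }
} ' file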

output

$ ./shell.sh
-> Easy Living 7 July 1937
-> Easy Going Gordon 6 August 1925
-> Easy Listening 14 January 2002
-> Easy Going 22 May 1926
-> Easy Life 20 May 1944
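
A possible variation on the script above, for awk implementations that lack mktime() (it is a gawk extension, not part of POSIX awk): compare the dates as zero-padded YYYYMMDD strings instead of timestamps. This is only a sketch. The function name earliest_per_title is made up for illustration, and it assumes every line ends with COUNTRY:DAY MONTH YEAR as in the sample, with English month names. Unlike the script above, it prints the original line unchanged and keeps the titles in first-seen order.

```shell
#!/bin/bash

# Hypothetical helper: print, for each title, the line with the earliest date.
earliest_per_title() {
    awk '
        BEGIN {
            # Map month names to month numbers.
            n = split("January February March April May June July August September October November December", names, " ")
            for (i = 1; i <= n; i++) month[names[i]] = i
        }
        NF > 3 {
            line = $0
            day = $(NF-2)
            sub(/.*:/, "", day)                 # drop the country prefix, e.g. "USA:"
            # Fixed-width YYYYMMDD key, so string comparison matches date order.
            key = sprintf("%04d%02d%02d", $NF, month[$(NF-1)], day)
            $(NF-2) = $(NF-1) = $NF = ""        # $0 is now the title only
            title = $0
            if (!(title in best)) order[++count] = title   # remember first-seen order
            if (!(title in best) || key < best[title]) {
                best[title] = key
                keep[title] = line
            }
        }
        END {
            for (i = 1; i <= count; i++) print keep[order[i]]
        }
    ' "$@"
}

# Demo on a few sample lines, deliberately out of date order.
earliest_per_title <<'EOF'
Easy Living                     USA:16 July 1937
Easy Going                  USA:22 May 1926
Easy Living                     USA:7 July 1937
EOF
```

Call it as earliest_per_title file to read from a file instead of standard input.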

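awk '
    {
        line = $0                        # keep the untouched line for printing
        $(NF-2) = $(NF-1) = $NF = ""     # blank the date fields; $0 is now the title
        if ( ! ($0 in movies))           # store only the first line seen per title
            movies[$0] = line
    }
    END {
        for (m in movies) print movies[m]
    }
' movies.txt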

That does not preserve the original line ordering, because for (m in movies) visits the array in an unspecified order. You might want to sort the output.
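
If the input order matters, one way around the sorting step is to print each line the first time its title is seen, rather than collecting everything for the END block. A minimal sketch, assuming the same line layout as the sample; the function name dedupe_in_order is hypothetical, and the NF > 3 guard (not in the answer above) simply skips blank or short lines:

```shell
#!/bin/bash

# Hypothetical helper: keep the first line for each title, in input order.
dedupe_in_order() {
    awk '
        NF > 3 {
            line = $0
            # Blank the last three fields so that $0 holds only the title.
            $(NF-2) = $(NF-1) = $NF = ""
            # seen[$0]++ evaluates to 0 only the first time a title appears.
            if (!seen[$0]++) print line
        }
    ' "$@"
}

# Demo on a few sample lines.
dedupe_in_order <<'EOF'
Easy Listening                  USA:14 January 2002
Easy Listening                  USA:10 October 2002
Easy Living                     USA:7 July 1937
EOF
```

With a file, the call would be dedupe_in_order movies.txt. Since it keeps the first occurrence, it returns the earliest date only when the input is already in date order within each title, as in the sample.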

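This could be a quick answer. It works only when duplicate titles are padded identically before the colon, since the whole text before ':' (title, spacing and country code) is the sort key: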

sort -t':' -k1,1 -u your-file
