Unix uniq utility: What is wrong with this code?

What I want to accomplish: print duplicated lines

This is what the uniq man page says:

SYNOPSIS

uniq [OPTION]... [INPUT [OUTPUT]]

DESCRIPTION

Discard all but one of successive identical lines from INPUT (or standard
input), writing to OUTPUT (or standard output).

...

-d, --repeated
  only print duplicate lines

This is what I try to execute:

root@laptop:/var/www# cat file.tmp 
Foo
Bar
Foo
Baz
Qux
root@laptop:/var/www# cat file.tmp | uniq --repeated
root@laptop:/var/www# 

So I was expecting Foo in this example, but it returns nothing. What is wrong with this snippet?


uniq only checks consecutive lines against each other. So you can only expect to see something printed if there are two or more Foo lines in a row, for example.

If you want to get around that, sort the file first with sort.

$ sort file.tmp | uniq -d
Foo
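To see the adjacency rule in isolation, here is a small sketch that feeds uniq -d three inputs built with printf. The sample lines and the variable names (adjacent, separated, sorted) are made up for illustration and are not from the question's file.

```shell
# Two Foo lines in a row: uniq -d reports the repeated line.
adjacent=$(printf 'Foo\nFoo\nBar\n' | uniq -d)

# Same lines, but the two Foo lines are separated by Bar: nothing is reported.
separated=$(printf 'Foo\nBar\nFoo\n' | uniq -d)

# Sorting first makes the identical lines adjacent again.
sorted=$(printf 'Foo\nBar\nFoo\n' | sort | uniq -d)

printf 'adjacent:  [%s]\n' "$adjacent"
printf 'separated: [%s]\n' "$separated"
printf 'sorted:    [%s]\n' "$sorted"
```

Only the second case comes back empty, which matches what the question observed.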

If you really need to have all the non-consecutive duplicate lines printed in the order they occur in the file, you can use awk for that:

$ awk '{ if ($0 in lines) print $0; lines[$0]=1; }' file.tmp

but for a large file, that may be less efficient than sort and uniq. (Maybe; I haven't tried.)
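Note that the awk command above prints a line every time it reappears, so a line that occurs three times is printed twice. If each duplicated line should be printed only once, one possible variant is to count occurrences and print only on the second one. This is a sketch with invented sample input; the array name seen and the variable dups_once are arbitrary.

```shell
# seen[$0]++ evaluates to the number of earlier occurrences of the line,
# so the condition is true exactly when the line is seen for the second time.
dups_once=$(printf 'Foo\nBar\nFoo\nBaz\nFoo\nBar\n' | awk 'seen[$0]++ == 1')

printf '%s\n' "$dups_once"
```

The duplicates come out in the order in which they first repeat, not in sorted order. Like the original awk command, this keeps every distinct line in memory.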


cat file.tmp | sort | uniq --repeated

or

sort file.tmp | uniq --repeated


cat file.tmp | sort | uniq --repeated

The lines need to be sorted first.


uniq operates on adjacent lines. What you want is

cat file.tmp | sort | uniq --repeated

On OS X, I would actually use

sort file.tmp | uniq -d


I've never tried this myself, but I think the word "successive" is the key.

This would probably work if you sorted the input before running uniq over it.

Something like

sort file.tmp | uniq -d