Unix uniq utility: What is wrong with this code?
What I want to accomplish: print duplicated lines
This is what the uniq man page says:
SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]
DESCRIPTION
Discard all but one of successive identical lines from INPUT (or stan-
dard input), writing to OUTPUT (or standard output).
...
-d, --repeated
only print duplicate lines
This is what I try to execute:
root@laptop:/var/www# cat file.tmp
Foo
Bar
Foo
Baz
Qux
root@laptop:/var/www# cat file.tmp | uniq --repeated
root@laptop:/var/www#
So I was expecting Foo in this example, but it returns nothing.
What is wrong with this snippet?
uniq only checks consecutive lines against each other. So you can only expect to see something printed if there are two or more Foo lines in a row, for example.
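A small sketch of that adjacency rule, using made-up input fed through printf rather than the asker's file. The variable names adjacent and separated are only for illustration:

```shell
# Hypothetical sample data: the two Foo lines sit next to each other.
adjacent=$(printf 'Foo\nFoo\nBar\nBaz\nQux\n' | uniq -d)

# Same lines as in the question: the two Foo lines are separated by Bar.
separated=$(printf 'Foo\nBar\nFoo\nBaz\nQux\n' | uniq -d)

printf 'adjacent:  [%s]\n' "$adjacent"    # prints: adjacent:  [Foo]
printf 'separated: [%s]\n' "$separated"   # prints: separated: []
```

Only the first input has two identical lines in a row, so only that one produces output.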
If you want to get around that, sort the file first with sort.
$ sort file.tmp | uniq -d
Foo
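To see why sorting helps, here is a rough, self-contained sketch that recreates the question's data in a temporary file and shows the intermediate sorted output. The mktemp file and the LC_ALL=C setting are assumptions added here to keep the example reproducible; they are not part of the original commands:

```shell
# Recreate the question's data in a temporary file (mktemp picks the path).
tmp=$(mktemp)
printf 'Foo\nBar\nFoo\nBaz\nQux\n' > "$tmp"

# LC_ALL=C is an assumption to make the sort order predictable across systems.
sorted=$(LC_ALL=C sort "$tmp")
dups=$(LC_ALL=C sort "$tmp" | uniq -d)

printf '%s\n' "$sorted"            # Bar, Baz, Foo, Foo, Qux: the Foo lines are now adjacent
printf 'duplicates: %s\n' "$dups"  # prints: duplicates: Foo

rm -f "$tmp"
```

After sorting, identical lines end up next to each other, which is the only arrangement uniq can detect.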
If you really need to have all the non-consecutive duplicate lines printed in the order they occur in the file, you can use awk for that:
$ awk '{ if ($0 in lines) print $0; lines[$0]=1; }' file.tmp
For a large file, though, that may be less efficient than sort and uniq, since awk keeps every distinct line in memory. (I haven't tried.)
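Note that the awk one-liner above prints a line every time it reappears, so a line that occurs three times is printed twice. If each duplicated line should appear only once, a possible variant is sketched below. The input is made up, and the array name seen and the shell variable names are arbitrary:

```shell
# Hypothetical input: Foo appears three times, never on adjacent lines.
input=$(printf 'Foo\nBar\nFoo\nBaz\nFoo\n')

# The one-liner from above: prints every repeat occurrence.
every_repeat=$(printf '%s\n' "$input" | awk '{ if ($0 in lines) print $0; lines[$0]=1; }')

# Variant: print a line only when it is seen for the second time.
once_each=$(printf '%s\n' "$input" | awk 'seen[$0]++ == 1')

printf 'every repeat:\n%s\n' "$every_repeat"   # Foo printed twice
printf 'once each:\n%s\n' "$once_each"         # Foo printed once
```

Both versions keep the order in which the duplicates show up in the file, which sort followed by uniq -d does not.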
cat file.tmp | sort | uniq --repeated
or
sort file.tmp | uniq --repeated
cat file.tmp | sort | uniq --repeated
The lines need to be sorted.
uniq operates on adjacent lines. What you want is
cat file.tmp | sort | uniq --repeated
On OS X, I would actually use
sort file.tmp | uniq -d
I've never tried this myself, but I think the word "successive" is the key.
This would probably work if you sorted the input before running uniq over it.
Something like
sort file.tmp | uniq -d