(shell) How to remove strings from one file which can be found in another file?
开发者_开发知识库file1.txt
aaaa
bbbb
cccc
dddd
eeee
file2.txt
DDDD
cccc
aaaa
result
bbbb
eeee
If it could be case insensitive it would be even more great!
Thank you!
grep
can match patterns read from a file, and print out all lines NOT matching that pattern. Can match case insensitively too, like
grep -vi -f file2.txt file1.txt
Excerpts from the man pages:
SYNOPSIS
grep [OPTIONS] PATTERN [FILE...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero
patterns, and therefore matches nothing. (-f is specified by POSIX.)ns zero
patterns, and therefore matches nothing. (-f is specified by POSIX.)
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files. (-i is
specified by POSIX.)ions in both the PATTERN and the input files. (-i is
specified by POSIX.)
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v is
specified by POSIX.)of matching, to select non-matching lines. (-v is
specified by POSIX.)
From the top of my head, use grep -Fiv -f file2.txt < file1.txt
.
-F no regexps (fast)
-i case-insensitive
-v invert results
-f <pattern file> get patterns from file
$ grep -iv -f file2 file1
bbbb
eeee
or you can use awk
awk 'FNR==NR{ a[tolower($1)]=$1; next }
{
s=tolower($1)
f=0
for(i in a){if(i==s){f=1}}
if(!f) { print s }
} ' file2 file1
ghostdog74's awk example can be simplified:
awk '
FNR == NR { omit[tolower($0)]++; next }
tolower($0) in omit {next}
{print}
' file2 file1
For various set operations on files see: http://www.pixelbeat.org/cmdline.html#sets
In your case the inputs are not sorted, so you want the difference like:
sort -f file1 file1 file2 | uniq -iu
精彩评论