grep -f alternative: sed? awk?
file1 = 95,000 entries
file2 = 4,500,000 entries
I want to filter out file1 entries from file2.
egrep -f file1 file2
takes ages to complete. Is there an alternative? sed? awk?
Thanks
Sure, you can use awk. Put file2's entries into an array, then iterate over file1, checking each line against the array:
awk 'FNR==NR{a[$0];next}($0 in a)' file2 file1      # prints the file1 lines that also appear in file2
Play around with these variations to get what you want:
awk 'FNR==NR{a[$0];next}(!($0 in a))' file2 file1   # file1 lines that are NOT in file2
awk 'FNR==NR{a[$0];next}(!($0 in a))' file1 file2   # file2 lines that are NOT in file1
awk 'FNR==NR{a[$0];next}($0 in a)' file1 file2      # file2 lines that are also in file1
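For the question as asked (filtering file1 entries out of file2), the second variation, which reads file1 first, is the one to use; it also keeps only the smaller file (95,000 entries) in memory while file2 streams through. A minimal invocation, where file2.filtered is just an example name for the output:
awk 'FNR==NR{a[$0];next}(!($0 in a))' file1 file2 > file2.filtered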
I don't think grep -f is really meant to work with a filter file of that size, so some sort of database-backed solution might be your best bet.
You could load both files line by line into an SQLite database and then run a simple bit of SQL, something like this:
SELECT line FROM file2
EXCEPT
SELECT line FROM file1
and dump them back out. You could do all of that straight from the command line with SQLite.
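A minimal sketch of that, assuming the sqlite3 command-line shell is available and that neither file contains '|' characters (the shell's default .import separator); diff.db and result.txt are just example names:
# create two one-column tables, import the files, write the difference to result.txt
sqlite3 diff.db <<'EOF'
CREATE TABLE file1(line TEXT);
CREATE TABLE file2(line TEXT);
.import file1 file1
.import file2 file2
.output result.txt
SELECT line FROM file2 EXCEPT SELECT line FROM file1;
EOF
Note that EXCEPT returns distinct rows, so duplicate lines in file2 are collapsed and the original line order isn't preserved, whereas the awk approach keeps both.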