
grep -f alternative: sed? awk?

file1 = 95,000 entries
file2 = 4,500,000 entries

I want to filter out file1 entries from file2.

egrep -f file1 file2

takes ages to complete. Is there a faster alternative? sed? awk?

Thanks


Sure, you can use awk. Read the first file's lines into an array as keys, then for each line of the second file test membership in that array:

awk 'FNR==NR{a[$0];next}($0 in a)' file2 file1

Play around with these variants (swap the file order or negate the test) to get what you want:

awk 'FNR==NR{a[$0];next} !($0 in a)' file2 file1   # file1 lines not in file2
awk 'FNR==NR{a[$0];next} !($0 in a)' file1 file2   # file2 lines not in file1 (what you asked for)
awk 'FNR==NR{a[$0];next}  ($0 in a)' file1 file2   # file2 lines that are also in file1
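A minimal sketch of the filtering variant, using two tiny hypothetical sample files (not the asker's data) so you can see which lines survive:

```shell
# Hypothetical sample data: file1 holds the entries to remove,
# file2 holds the entries to be filtered.
printf 'a\nb\nc\n' > file1
printf 'a\nb\nc\nd\ne\n' > file2

# FNR==NR is true only while reading the first file (file1):
# each of its lines becomes a key in array a.
# For the second file (file2), print only lines NOT present in a.
awk 'FNR==NR{a[$0];next} !($0 in a)' file1 file2
# prints:
# d
# e
```

Because the lookup `$0 in a` is a hash-table probe, this is a single pass over each file, which is why it is so much faster than `grep -f` with a 95,000-line pattern file.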


I don't think grep -f is really meant to work with a filter file of that size, so some sort of database-backed solution might be your best bet.

You could load both files line by line into an SQLite database and then run a simple bit of SQL, something like this:

SELECT line FROM file2
EXCEPT
SELECT line FROM file1

and dump the result back out. You can do all of that straight from the command line with the sqlite3 shell.
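A hedged sketch of that workflow with the sqlite3 shell. The database name `filter.db`, the table names `f1`/`f2`, and the sample files are all assumptions for illustration; it also assumes the entries contain no `|` characters, since `.import` in the default list mode splits on that separator:

```shell
# Hypothetical sample data (replace with your real file1/file2).
printf 'a\nb\n' > file1
printf 'a\nb\nc\n' > file2

# Load each file into a one-column table, then print f2 minus f1.
# Dot-commands must start at the beginning of a line.
sqlite3 filter.db <<'EOF'
CREATE TABLE f1(line TEXT);
CREATE TABLE f2(line TEXT);
.import file1 f1
.import file2 f2
SELECT line FROM f2 EXCEPT SELECT line FROM f1;
EOF
# prints:
# c
```

Redirect stdout to a file to dump the filtered entries back out. Note that EXCEPT also deduplicates the result, which may or may not be what you want.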
