
grep -f alternative: sed? awk?

file1 = 95,000 entries
file2 = 4,500,000 entries

I want to filter out file1 entries from file2.

egrep -f file1 file2

takes ages to complete. Is there a faster alternative? sed? awk?

Thanks


Sure, you can use awk. Read the first file's lines into an array as keys, then for each line of the second file test membership in that array:

awk 'FNR==NR{a[$0];next}($0 in a)' file2 file1

Play around with these variants (swap the file order or negate the test) to get what you want:

awk 'FNR==NR{a[$0];next} !($0 in a)' file2 file1   # file1 lines not in file2
awk 'FNR==NR{a[$0];next} !($0 in a)' file1 file2   # file2 lines not in file1 (what you asked for)
awk 'FNR==NR{a[$0];next}  ($0 in a)' file1 file2   # file2 lines that are also in file1
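A minimal sketch of the filtering variant, using two tiny hypothetical sample files (not the asker's data) so you can see which lines survive:

```shell
# Hypothetical sample data: file1 holds the entries to remove,
# file2 holds the entries to be filtered.
printf 'a\nb\nc\n' > file1
printf 'a\nb\nc\nd\ne\n' > file2

# FNR==NR is true only while reading the first file (file1):
# each of its lines becomes a key in array a.
# For the second file (file2), print only lines NOT present in a.
awk 'FNR==NR{a[$0];next} !($0 in a)' file1 file2
# prints:
# d
# e
```

Because the lookup `$0 in a` is a hash-table probe, this is a single pass over each file, which is why it is so much faster than `grep -f` with a 95,000-line pattern file.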


I don't think grep -f is really meant to work with a filter file of that size, so some sort of database-backed solution might be your best bet.

You could load both files line by line into an SQLite database and then run a simple bit of SQL, something like this:

SELECT line FROM file2
EXCEPT
SELECT line FROM file1

and dump the result back out. You can do all of that straight from the command line with the sqlite3 shell.
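A hedged sketch of that workflow with the sqlite3 shell. The database name `filter.db`, the table names `f1`/`f2`, and the sample files are all assumptions for illustration; it also assumes the entries contain no `|` characters, since `.import` in the default list mode splits on that separator:

```shell
# Hypothetical sample data (replace with your real file1/file2).
printf 'a\nb\n' > file1
printf 'a\nb\nc\n' > file2

# Load each file into a one-column table, then print f2 minus f1.
# Dot-commands must start at the beginning of a line.
sqlite3 filter.db <<'EOF'
CREATE TABLE f1(line TEXT);
CREATE TABLE f2(line TEXT);
.import file1 f1
.import file2 f2
SELECT line FROM f2 EXCEPT SELECT line FROM f1;
EOF
# prints:
# c
```

Redirect stdout to a file to dump the filtered entries back out. Note that EXCEPT also deduplicates the result, which may or may not be what you want.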
