How can I find lines in one file but not the other using bash scripting?

Posted: 2023-03-25 12:22

Imagine file 1:

#include "first.h"
#include "second.h"
#include "third.h"

// more code here
...

Imagine file 2:

#include "fifth.h"
#include "second.h"
#include "eigth.h"

// more code here
...

I want to get the headers that are included in file 2 but not in file 1, and only those lines. So, when run, a diff of file 1 and file 2 should produce:

#include "fifth.h"
#include "eigth.h"

I know how to do it in Perl/Python/Ruby, but I'd like to accomplish this without using a different programming language.

This is a one-liner, but does not preserve the order:

comm -13 <(grep '#include' file1 | sort) <(grep '#include' file2 | sort)
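As a rough illustration, the same comm approach can be tried on the question's sample data. This sketch swaps the bash-only <(...) process substitution for temporary files so it also runs under plain sh; the temp directory and the names inc1/inc2 are assumptions made for the demo.

```shell
#!/bin/sh
# Use one collation order so sort and comm agree on line ordering.
export LC_ALL=C

# Hypothetical sample files reproducing the question's example.
tmpdir=$(mktemp -d)
printf '%s\n' '#include "first.h"' '#include "second.h"' '#include "third.h"' \
    '// more code here' > "$tmpdir/file1"
printf '%s\n' '#include "fifth.h"' '#include "second.h"' '#include "eigth.h"' \
    '// more code here' > "$tmpdir/file2"

# comm requires sorted input, which is why the original order is lost.
grep '#include' "$tmpdir/file1" | sort > "$tmpdir/inc1"
grep '#include' "$tmpdir/file2" | sort > "$tmpdir/inc2"

# -1 hides lines unique to the first input, -3 hides lines common to both,
# leaving only the lines unique to the second input.
result=$(comm -13 "$tmpdir/inc1" "$tmpdir/inc2")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The output is in sorted order ("eigth.h" before "fifth.h"), not in the order the lines appear in file 2.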

If you need to preserve the order:

awk '
  !/#include/ {next} 
  FILENAME == ARGV[1] {include[$2]=1; next} 
  !($2 in include)
' file1 file2
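To check the order-preserving behaviour, here is a small sketch that runs the same awk logic on the question's sample data. The temp directory is an assumption for the demo, and the array is named seen here instead of include purely for readability.

```shell
#!/bin/sh
# Hypothetical sample files reproducing the question's example.
tmpdir=$(mktemp -d)
printf '%s\n' '#include "first.h"' '#include "second.h"' '#include "third.h"' \
    '// more code here' > "$tmpdir/file1"
printf '%s\n' '#include "fifth.h"' '#include "second.h"' '#include "eigth.h"' \
    '// more code here' > "$tmpdir/file2"

result=$(awk '
  # Skip every line that is not an include.
  !/#include/ {next}
  # While reading the first file, remember each header name ($2).
  FILENAME == ARGV[1] {seen[$2] = 1; next}
  # In the second file, print lines whose header was not seen.
  !($2 in seen)
' "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Because the comparison is on the second whitespace-separated field, extra spaces between #include and the header name do not affect the match.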

If it's ok to use a temp file, try this:

grep include file1.h > /tmp/x && grep -f /tmp/x -v file2.h | grep include

This:

1. extracts all includes from file1.h and writes them to the file /tmp/x
2. uses this file to get all lines from file2.h that are not contained in this list
3. extracts all includes from the remainder of file2.h

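It probably doesn't handle differences in whitespace correctly, though. Also, grep -f treats each line of /tmp/x as a regular expression, so a "." in a header name matches any character.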

EDIT: to prevent false positives, use a different pattern for the last grep (thanks to jw013 for mentioning this):

grep include file1.h > /tmp/x && grep -f /tmp/x -v file2.h | grep "^#include"
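A possible hardening of the temp-file approach, sketched under a few assumptions: mktemp is available to create a unique temporary location instead of the fixed /tmp/x, and -F is added so the saved lines are matched as literal strings and not as regular expressions. The comment line mentioning legacy.h is a hypothetical addition to the sample data, included to show what the "^#include" anchor filters out.

```shell
#!/bin/sh
# Hypothetical sample files; the legacy.h comment is not from the question.
tmpdir=$(mktemp -d)
printf '%s\n' '#include "first.h"' '#include "second.h"' '#include "third.h"' \
    '// more code here' > "$tmpdir/file1.h"
printf '%s\n' '#include "fifth.h"' '#include "second.h"' '#include "eigth.h"' \
    '// do not #include "legacy.h" here' > "$tmpdir/file2.h"

# Save the include lines of file1.h as a list of literal patterns.
grep '^#include' "$tmpdir/file1.h" > "$tmpdir/patterns"

# -F: fixed strings, -v: invert the match, -f: read patterns from a file.
# The final grep keeps only real include directives.
result=$(grep -F -v -f "$tmpdir/patterns" "$tmpdir/file2.h" | grep '^#include')
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With a plain "grep include" as the last step, the legacy.h comment would have been reported as well.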

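The variant below requires an fgrep that accepts -f - (patterns read from standard input). GNU grep (i.e. any Linux system, and then some) should work fine. Note that recent GNU grep releases deprecate fgrep in favor of grep -F.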

# Find occurrences of '#include' in file1.h
fgrep '#include' file1.h |
# Remove any identical lines from file2.h
fgrep -vxf - file2.h |
# Result is all lines not present in file1.h.  Out of those, extract #includes
fgrep '#include'

This does not require any sorting, nor any explicit temporary files. In theory, fgrep -f could use a temporary file behind the scenes, but I believe GNU fgrep doesn't.
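For systems where fgrep is deprecated, the same pipeline can be written with grep -F. This is a sketch run against the question's sample data; it assumes a grep that accepts "-f -" to read patterns from standard input, as GNU grep does, and the temp directory exists only for the demo.

```shell
#!/bin/sh
# Hypothetical sample files reproducing the question's example.
tmpdir=$(mktemp -d)
printf '%s\n' '#include "first.h"' '#include "second.h"' '#include "third.h"' \
    '// more code here' > "$tmpdir/file1.h"
printf '%s\n' '#include "fifth.h"' '#include "second.h"' '#include "eigth.h"' \
    '// more code here' > "$tmpdir/file2.h"

# -F: literal strings, -v: invert, -x: whole-line match only,
# -f -: take the pattern list from the previous command's output.
result=$(grep -F '#include' "$tmpdir/file1.h" |
    grep -F -v -x -f - "$tmpdir/file2.h" |
    grep -F '#include')
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Because of -x, a line is dropped only when it is identical to a line from file1.h, so trailing comments or different indentation make an include count as new.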

If the goal need not be accomplished with Bash alone (i.e., use of external programs is acceptable), then use combine from moreutils (swap the two file arguments to get the lines that are in file2 but not in file1):

combine file1 not file2 > lines_in_file1_not_in_file2

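Another option prints the include lines that occur exactly once across both files, so it also reports includes found only in file 1:

cat "$file1" "$file2" | grep '#include' | sort | uniq -u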
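Running the sort | uniq -u pipeline on the question's sample data shows what it actually reports. This is a sketch; the temp directory and the file1/file2 variable values are assumptions for the demo.

```shell
#!/bin/sh
# Fix the collation order so the sorted output is predictable.
export LC_ALL=C

# Hypothetical sample files reproducing the question's example.
tmpdir=$(mktemp -d)
file1="$tmpdir/file1"
file2="$tmpdir/file2"
printf '%s\n' '#include "first.h"' '#include "second.h"' '#include "third.h"' \
    '// more code here' > "$file1"
printf '%s\n' '#include "fifth.h"' '#include "second.h"' '#include "eigth.h"' \
    '// more code here' > "$file2"

# uniq -u keeps only lines that occur exactly once in the sorted stream.
result=$(cat "$file1" "$file2" | grep '#include' | sort | uniq -u)
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The result contains first.h and third.h in addition to fifth.h and eigth.h, so this pipeline fits only when either direction of difference is wanted. A header included twice within a single file would also be dropped.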
