Do not merge the context of contiguous matches with grep
If I run grep -C 1 match over the following file:
a
b
match1
c
d
e
match2
f
match3
g
I get the following output:
b
match1
c
--
e
match2
f
match3
g
As you can see, since the contexts around the adjacent matches "match2" and "match3" overlap, they are merged. However, I would prefer to get one context group for each match, even if that means duplicating input lines in the context output. In this case, what I would like is:
b
match1
c
--
e
match2
f
--
f
match3
g
What would be the best way to achieve this? I would prefer solutions that are general enough to be trivially adaptable to other grep options (different values for -A, -B, -C, or entirely different flags). Ideally, I was hoping that there was a clever way to do that just with grep...
I don't think it is possible to do that using plain grep.
The sed construct below works to some extent; I still need to figure out how to add the "--" separator:
$ sed -n -e '/match/{x;1!p;g;$!N;p;D;}' -e h log
b
match1
c
e
match2
f
f
match3
g
I don't think this is possible using plain grep.
Have you ever used Python? In my opinion it's a perfect language for such tasks (this code snippet will work for both Python 2.7 and 3.x):
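# Read the whole file into memory, stripping trailing whitespace/newlines.
with open("your_file_name") as f:
    lines = [line.rstrip() for line in f.readlines()]

for num, line in enumerate(lines):
    if "match" in line:
        # One line of leading context, if there is one.
        if num > 0:
            print(lines[num - 1])
        print(line)
        # One line of trailing context, if there is one.
        if num < len(lines) - 1:
            print(lines[num + 1])
        # Separator, unless the match sits at the very end of the file.
        if num < len(lines) - 2:
            print("--")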
This gives me:
b
match1
c
--
e
match2
f
--
f
match3
g
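The same idea can be generalized to arbitrary amounts of leading and trailing context, which is what the question asks for with -A and -B. Here is a minimal sketch, assuming plain substring matching rather than regular expressions; the function name grep_context and its parameters are made up for illustration and do not correspond to any grep option:

```python
def grep_context(lines, pattern, before=1, after=1, separator="--"):
    """Return output lines with one context group per matching line.

    Lines shared by two neighbouring groups are emitted twice on purpose.
    """
    out = []
    first = True
    for i, line in enumerate(lines):
        if pattern not in line:
            continue
        # Put the separator between groups, never before the first one.
        if not first:
            out.append(separator)
        first = False
        start = max(0, i - before)  # clamp at the start of the file
        out.extend(lines[start:i + after + 1])  # slicing clamps at the end
    return out


sample = ["a", "b", "match1", "c", "d", "e", "match2", "f", "match3", "g"]
for row in grep_context(sample, "match"):
    print(row)
```

Because the separator is emitted between groups rather than after each one, this version does not depend on where the last match sits in the file.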
I'd suggest patching grep instead of working around it. In GNU grep 2.9, in src/main.c:
/* We print the SEP_STR_GROUP separator only if our output is
   discontiguous from the last output in the file. */
if ((out_before || out_after) && used && p != lastout && group_separator)
  {
    PR_SGR_START_IF(sep_color);
    fputs (group_separator, stdout);
    PR_SGR_END_IF(sep_color);
    fputc('\n', stdout);
  }
A simple additional flag would suffice here.
Edit: Well, d'oh, it is of course not THAT simple since grep would not reproduce the context, just add a few more separators. Due to the linearity of grep, the whole patch is probably not that easy. Nevertheless, if you have a good case for the patch, it could be worth it.
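To illustrate why an extra separator alone would not be enough, here is a rough Python model of a single linear pass in which every input line is printed at most once. It is a simplified sketch, not grep's actual implementation; the function name merged_context and the substring matching are assumptions made for the example:

```python
def merged_context(lines, pattern, before=1, after=1, separator="--"):
    """Simplified single-pass model: each input line is printed at most once."""
    out = []
    last_printed = -1  # index of the last input line already emitted
    for i, line in enumerate(lines):
        if pattern not in line:
            continue
        # Never go back before what has already been printed.
        start = max(i - before, last_printed + 1)
        stop = min(i + after, len(lines) - 1)
        # Separator only when this chunk is discontiguous from the last output.
        if out and start > last_printed + 1:
            out.append(separator)
        out.extend(lines[start:stop + 1])
        last_printed = max(last_printed, stop)
    return out


sample = ["a", "b", "match1", "c", "d", "e", "match2", "f", "match3", "g"]
for row in merged_context(sample, "match"):
    print(row)
```

In this model, by the time "match3" is reached the line "f" has already been emitted as trailing context of "match2", so forcing a separator at that point would still leave the second group without its leading "f".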
This does not appear possible with grep or GNU grep. However it is possible with standard POSIX tools and a good shell like bash as leverage to obtain the desired output.
Note: neither python nor perl should be necessary for the solution. Worst case, use awk or sed.
One solution I quickly prototyped is shown below. It re-reads the file once per match, so whether it is suitable depends on whether that overhead is acceptable. It also relies on the original question's use of a fixed single line of context (-C 1), which allows simple use of head and tail:
$ OIFS="$IFS"; lines=`grep -n match greptext.txt | /bin/cut -f1 -d:`;
for l in $lines;
do IFS=""; match=`/bin/tail -n +$(($l-1)) greptext.txt | /bin/head -3`;
echo $match; echo "---";
done; IFS="$OIFS"
This might have some corner case associated with it, and this resets IFS when perhaps not necessary, though it is a hint for trying to use the power of POSIX shell & tools rather than a high level interpreter to get the desired output.
Opinion: All good operating systems have: grep, awk, sed, tr, cut, head, tail, more, less, vi as built-ins. On the best operating systems, these are in /bin.