开发者

How to extract text between two words in unix?

I

am

using

basic

sed

expression :-

sed -n "am/,/sed/p" 

to get the text between "am" and "sed" which will output "am \n using \n basic \n sed". But my real problem is if the string would be :-

I

am

using

basic

grep

expression.

I applied the above sed in this sentence then it gave "am \n using \n basic \n grep \n expression" which it should not give it. How to discard the o开发者_如何学Cutput if there would be no matching?

Any suggestions?


The command in the question (sed -n "/am/,/sed/p", note the added slash) means:

  • Find a line containing the string am
  • and print (p) until a line containing sed occurs

Therefore it prints:

I am using basic grep expression

because it contains am. If you would add some more lines they will be printed, too, until a line containing sed occurs.

E.g.:

echo -e 'I am using basic grep expression.\nOne more line\nOne with sed\nOne without' | sed -n "/am/,/sed/p"

results in:

I am using basic grep expression.
One more line
One with sed

I think - what you want to do is something like that:

sed -n "s/.*\(am.*sed\).*/\1/p"

Example:

echo 'I am using basic grep expression.' | sed -n "s/.*\(am.*sed\).*/\1/p"

echo 'I am using basic sed expression.' | sed -n "s/.*\(am.*sed\).*/\1/p"
sed -n "s/.*\(am.*sed\).*/\1/p"


You have to use slightly different sed command like:

sed -n '/am/{:a; /am/x; $!N; /sed/!{$!ba;}; /sed/{s/\n/ /gp;}}' file

To print ONLY lines that contain text am and sed spanned across multiple lines.


When Using SED this can work but it's quite an overwhelming syntax... if you need to crop part of a multi-line (\n) text, you might want to try a simpler way using grep:

cat multi_line.txt | grep -oP '(?s)(?<=START phrase).*(?=END phrase)'

For example, I find this as the easiest way to grab perforce changelist description (without rest of CL info):

p4 describe {CL NUMBER} | grep -oP '(?s).*(?=Affected files)'

Note, you can play with the <= and >= to include or not include, the starting/ending phrases in the output.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜