How to extract text between two words in unix?
I
am using a basic sed expression: sed -n "am/,/sed/p"
to get the text between "am" and "sed", which outputs "am \n using \n basic \n sed". But my real problem is what happens when the string is:
I
am using basic grep expression.
When I applied the above sed to this sentence, it gave "am \n using \n basic \n grep \n expression", which it should not. How can I discard the output when there is no match?
Any suggestions?
The command in the question (sed -n "/am/,/sed/p", note the added slash) means:
- Find a line containing the string am
- and print (p) until a line containing sed occurs
Therefore it prints:
I am using basic grep expression
because that line contains am. If you add more lines, they are printed too, until a line containing sed occurs. If no such line ever occurs, everything through the end of the input is printed.
E.g.:
echo -e 'I am using basic grep expression.\nOne more line\nOne with sed\nOne without' | sed -n "/am/,/sed/p"
results in:
I am using basic grep expression.
One more line
One with sed
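If the goal is to keep the line-range behaviour but print nothing when the closing line never appears, one option is to buffer the range and flush it only once the end pattern is seen. The following is a minimal sketch that swaps sed for awk; the function name print_complete_range and the variable names are made up for illustration, and unlike the sed range it also closes the range when the start line itself matches the end pattern.

```shell
# Hypothetical helper (not from the thread): print a start..end block of
# lines only if the end pattern is actually found.
print_complete_range() {
  # $1 = start regex, $2 = end regex (passed through awk -v, so backslash
  # escapes in them are interpreted by awk)
  awk -v first="$1" -v last="$2" '
    !inside && $0 ~ first { inside = 1; buf = "" }          # open a range
    inside                { buf = buf $0 "\n" }             # buffer instead of printing
    inside && $0 ~ last   { printf "%s", buf; inside = 0 }  # flush on a complete range
  '
}

# Range is closed by "One with sed": the first three lines are printed.
printf 'I am using basic grep expression.\nOne more line\nOne with sed\nOne without\n' |
  print_complete_range am sed

# No line contains "sed": nothing is printed.
printf 'I am using basic grep expression.\nOne more line\n' |
  print_complete_range am sed
```

An unfinished range is simply dropped at end of input, which is the "discard the output" behaviour the question asks for.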
I think what you want is something like this:
sed -n "s/.*\(am.*sed\).*/\1/p"
Example:
echo 'I am using basic grep expression.' | sed -n "s/.*\(am.*sed\).*/\1/p"
echo 'I am using basic sed expression.' | sed -n "s/.*\(am.*sed\).*/\1/p"
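The substitution only works within a single line, while the text in the question is split over several lines. A rough sketch of one way to combine the two, assuming it is acceptable to join the lines with spaces first: the helper name extract_between is hypothetical, and the two words are assumed to contain no slashes or regex metacharacters.

```shell
# Hypothetical helper: join all input lines, then print the text from the
# start word through the end word, or nothing if the pair is not found.
extract_between() {
  # $1 = start word, $2 = end word (assumed free of "/" and regex specials)
  tr '\n' ' ' | sed -n "s/.*\($1.*$2\).*/\1/p"
}

# Both words present: prints "am using basic sed"
printf 'I\nam using basic sed expression\n' | extract_between am sed

# "sed" is missing: the substitution fails, so -n suppresses all output
printf 'I\nam using basic grep expression\n' | extract_between am sed
```

Because the leading .* is greedy, the match starts at the last occurrence of the start word that is still followed by the end word.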
You have to use a slightly different sed command, such as:
sed -n '/am/{:a; /am/x; $!N; /sed/!{$!ba;}; /sed/{s/\n/ /gp;}}' file
This prints ONLY lines that contain the text am and sed, spanning multiple lines.
sed can do this, but the syntax is quite overwhelming. If you need to crop part of a multi-line (\n) text, you might want to try a simpler way using grep. With current GNU grep, add -z so the whole input is read as one record; otherwise the pattern is applied line by line and cannot cross a newline:
grep -ozP '(?s)(?<=START phrase).*(?=END phrase)' multi_line.txt
For example, I find this the easiest way to grab a Perforce changelist description (without the rest of the CL info):
p4 describe {CL NUMBER} | grep -ozP '(?s).*(?=Affected files)'
Note that you can add or drop the lookbehind (?<=...) and lookahead (?=...) to exclude or include the starting/ending phrases in the output.
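To make the effect of the lookarounds concrete, here is a small sketch, assuming GNU grep built with PCRE support (-P) and using -z so that the pattern can span newlines. The sample text and variable names are invented for illustration. With -z, grep ends each match with a NUL byte, so tr turns it back into a newline.

```shell
# Invented sample input with a marked region in the middle.
text='header
START phrase
line one
line two
END phrase
footer'

# With lookbehind/lookahead: the markers themselves are excluded.
inner=$(printf '%s\n' "$text" |
  grep -ozP '(?s)(?<=START phrase\n).*(?=\nEND phrase)' | tr '\0' '\n')

# Without lookarounds: the markers are part of the match.
outer=$(printf '%s\n' "$text" |
  grep -ozP '(?s)START phrase.*END phrase' | tr '\0' '\n')

printf 'inner:\n%s\n\nouter:\n%s\n' "$inner" "$outer"
```

Since .* is greedy, a text with several END phrases is matched up to the last one; .*? would stop at the first.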