Extracting text in between strings

How do I extract the text between two strings, following a very specific pattern, from a file full of lines like this one? Example:

input:a_log.gz:make=BMW&year=2000&owner=Peter

I want to essentially capture the part make=BMW&year=2000. I know for a fact that the line can start out as "input:(any number of characters).gz:" and end with "owner=Peter"


Use the regex: input:.*?\.gz:(.*?)&?owner=Peter. The capture will contain the things between the second colon and "owner=Peter", trimming the ampersand.


Give this a try:

sed -n 's/.*:\([^&]*&[^&]*\)&.*/\1/p' file

This will extract everything between the last colon on the line (the second one in your example, since .* is greedy) and the second ampersand after it, regardless of what comes before and after. If a line has extra colons or ampersands it may not work properly.
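If the number of key=value pairs can vary, one possible variant is to anchor on the known prefix and suffix rather than counting ampersands. This is only a sketch: it assumes every line of interest starts with "input:" and ends exactly with "&owner=Peter", and the function name extract_sed is hypothetical.

```shell
# Hypothetical helper: keep whatever sits between ".gz:" and "&owner=Peter".
# -n plus the p flag prints only the lines where the substitution succeeded.
extract_sed() {
  sed -n 's/^input:.*\.gz:\(.*\)&owner=Peter$/\1/p'
}

# A line with three pairs before the owner field.
printf '%s\n' 'input:a_log.gz:make=BMW&year=2000&color=red&owner=Peter' | extract_sed
```

On this three-pair line the original command would keep only the first two pairs, while the anchored variant keeps all three.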


You can use shell (bash/ksh) parameter expansion:

$ s="input:a_log.gz:make=BMW&year=2000&owner=Peter"
$ t=${s##*gz:}
$ echo "${t%%&owner=Peter*}"
make=BMW&year=2000

If you want sed:

$ echo "$s" | sed 's/input.*gz://;s/&owner=Peter//'
make=BMW&year=2000
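Since the question is about a file full of such lines, the parameter-expansion approach can be wrapped in a read loop. The following is a rough sketch rather than a tuned solution: the function name extract_params is made up, and it assumes each wanted line starts with "input:" and ends with "owner=Peter".

```shell
# Hypothetical helper: apply the parameter-expansion trick to every line on stdin.
extract_params() {
  while IFS= read -r line; do
    case $line in
      input:*.gz:*owner=Peter)
        rest=${line#input:*.gz:}    # drop the shortest prefix ending in ".gz:"
        rest=${rest%owner=Peter}    # drop the known suffix
        printf '%s\n' "${rest%&}"   # drop the trailing ampersand, if any
        ;;
    esac                            # lines that do not match are skipped
  done
}

printf '%s\n' 'input:a_log.gz:make=BMW&year=2000&owner=Peter' 'some other line' | extract_params
```

For large files a shell loop is usually slower than sed or awk, so this mainly suits small inputs or scripts that already hold the line in a variable.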


$ echo "input:a_log.gz:make=BMW&year=2000&owner=Peter" | sed -e 's/input:.*\.gz://g' -e 's/&owner.*//g'
make=BMW&year=2000


I didn't see an answer using awk:

awk '{ match($0, /input:.*\.gz:/);
       m = RSTART+RLENGTH;
       n = index($0, "&owner=Peter") - m;
       print substr($0,m,n)
     }'

The method is a mix of the sh version (substrings via parameter expansion) and the sed version (regular expressions). This is because standard awk regular expressions lack backreferences, so the matched position has to be turned into a substring by hand.
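The awk script above assumes every line contains both the prefix and the suffix. A slightly more defensive sketch, under the same assumptions about the line format, skips lines where either part is missing. The function name extract_awk is hypothetical.

```shell
# Hypothetical helper: same match/substr idea, with guards for non-matching lines.
extract_awk() {
  awk '{
    if (!match($0, /input:.*\.gz:/)) next    # skip lines without the prefix
    start = RSTART + RLENGTH                 # first character after ".gz:"
    tail = substr($0, start)
    end = index(tail, "&owner=Peter")        # 0 if the suffix is absent
    if (end > 0) print substr(tail, 1, end - 1)
  }'
}

printf '%s\n' 'input:a_log.gz:make=BMW&year=2000&owner=Peter' | extract_awk
```

It reads standard input, so a file can be passed with extract_awk < file.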
