Extracting multiple parts of a string using bash

I have a caret delimited (key=value) input and would like to extract multiple tokens of interest from it.

For example: Given the following input

$ echo -e "1=A00^35=D^150=1^33=1\n1=B000^35=D^150=2^33=2"
1=A00^35=D^150=1^33=1
1=B000^35=D^150=2^33=2

I would like the following output

35=D^150=1^
35=D^150=2^

I have tried the following

$ echo -e "1=A00^35=D^150=1^33=1\n1=B000^35=D^150=2^33=2"|egrep -o "35=[^/^]*\^|150=[^/^]*\^"
35=D^
150=1^
35=D^
150=2^

My problem is that egrep returns each match on a separate line. Is it possible to get one line of output for one line of input? Please note that due to the constraints of the larger script, I cannot simply do a blind replace of all the \n characters in the output.

Thank you for any suggestions. This script is for bash 3.2.25. Any egrep alternatives are welcome. Please note that the tokens of interest (35 and 150) may change, and I am already generating the egrep pattern in the script, so a one-liner (if possible) would be great.


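You have two options. Option 1 is to change the field separator (IFS) and use set --. Word splitting only applies to the result of an expansion, so the string has to come from a variable:

line='1=A00^35=D^150=1^33=1'
OFS="$IFS"
IFS="^"
set -- $line  # No quotes here!!
IFS="$OFS"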

Now you have your values in $1, $2, etc.
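As a minimal sketch of how the positional parameters could then be filtered down to the tokens of interest: the function name extract_tokens and its argument layout (line first, then the keys) are assumptions made for illustration, not part of the original script. It uses only features available in bash 3.2.

```shell
#!/usr/bin/env bash
# extract_tokens LINE KEY...
# Prints KEY=VALUE^ for every requested key found in LINE, in input order.
extract_tokens() {
    local line=$1; shift
    local wanted=" $* "        # space-padded key list, e.g. " 35 150 "
    local old_ifs=$IFS out="" field

    set -f                     # no globbing while the unquoted expansion is split
    IFS='^'
    set -- $line               # unquoted on purpose: split on ^
    IFS=$old_ifs
    set +f

    for field in "$@"; do
        case $wanted in
            *" ${field%%=*} "*) out+="$field^" ;;   # key is the text before the first =
        esac
    done
    printf '%s\n' "$out"
}

# One line of output per line of input:
printf '1=A00^35=D^150=1^33=1\n1=B000^35=D^150=2^33=2\n' |
while IFS= read -r line; do
    extract_tokens "$line" 35 150
done
# 35=D^150=1^
# 35=D^150=2^
```

A line that contains none of the keys produces an empty output line here; whether that is wanted depends on the larger script.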

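Or you can use an array indexed by the numeric key (note that eval executes whatever the input expands to, so only use this on trusted data):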

tmp=$(echo "1=A00^35=D^150=1^33=1" | sed -e 's:\([0-9]\+\)=: [\1]=:g' -e 's:\^ : :g')
eval value=($tmp)
echo "35=${value[35]}^150=${value[150]}"
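The same array idea can be sketched without sed or eval. This assumes every key is a plain decimal integer without leading zeros, so that it can serve as an index into a sparse bash array; parse_tags is a hypothetical helper name, and value is the same global array name used above.

```shell
#!/usr/bin/env bash
# parse_tags LINE
# Fills the global sparse array `value`, indexed by the numeric key.
parse_tags() {
    local line=$1 field
    local -a fields
    value=()
    IFS='^' read -r -a fields <<< "$line"      # split on ^ without touching the global IFS
    for field in "${fields[@]}"; do
        value[${field%%=*}]=${field#*=}        # key before the first =, value after it
    done
}

parse_tags '1=A00^35=D^150=1^33=1'
echo "35=${value[35]}^150=${value[150]}^"
# 35=D^150=1^
```

Because the IFS assignment is a prefix to read, it only applies to that one command and does not need to be restored afterwards.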


To get rid of the newlines, you can just echo the result again; the unquoted command substitution joins the matches with single spaces:

$ echo $(echo "1=A00^35=D^150=1^33=1"|egrep -o "35=[^/^]*\^|150=[^/^]*\^")
35=D^ 150=1^
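A rough sketch of a per-line variant that keeps grep but produces exactly one output line per input line, with no space between the tokens. The helper name join_matches is an assumption for illustration, and the pattern has the same shape as the one in the question.

```shell
#!/usr/bin/env bash
# join_matches PATTERN
# Reads stdin; for each input line, prints all matches of PATTERN concatenated.
join_matches() {
    local pattern=$1 line
    while IFS= read -r line; do
        # grep -o prints one match per line; tr glues them back together
        printf '%s\n' "$line" | grep -Eo "$pattern" | tr -d '\n'
        echo                                   # terminate the output line
    done
}

printf '1=A00^35=D^150=1^33=1\n1=B000^35=D^150=2^33=2\n' |
    join_matches '(35|150)=[^^]*\^'
# 35=D^150=1^
# 35=D^150=2^
```

This starts one grep per input line, so it is slow on large inputs, and a line with no match yields an empty output line. Like the original pattern, it would also match inside a longer key such as 135.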

If joining with echo is not satisfactory (it collapses the whole input into a single line and leaves a space between the tokens), you can use awk:

pax> echo '
1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLIST=35,150 -F^ ' {
        sep = "";
        split (LIST, srch, ",");
        for (i = 1; i <= NF; i++) {
            for (idx in srch) {
                split ($i, arr, "=");
                if (arr[1] == srch[idx]) {
                    printf "%s%s=%s", sep, arr[1], arr[2];
                    sep = "^";
                }
            }
        }
        if (sep != "") {
            print sep;
        }
    }'
35=D^150=1^
35=d^

 

pax> echo '
1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11
' | awk -vLIST=1,33 -F^ ' {
        sep = "";
        split (LIST, srch, ",");
        for (i = 1; i <= NF; i++) {
            for (idx in srch) {
                split ($i, arr, "=");
                if (arr[1] == srch[idx]) {
                    printf "%s%s=%s", sep, arr[1], arr[2];
                    sep = "^";
                }
            }
        }
        if (sep != "") {
            print sep;
        }
    }'
1=A00^33=1^
1=a00^33=11^

This one allows you to use a single awk script and all you need to do is to provide a comma-separated list of keys to print out.
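Since the key list may change from run to run, the awk program could also be wrapped in a shell function. The following is only a sketch under that assumption: extract_awk is a hypothetical name, and the key list is turned into a lookup table once in BEGIN instead of being re-split for every line.

```shell
#!/usr/bin/env bash
# extract_awk KEYS
# KEYS is a comma-separated list such as 35,150; input is read from stdin.
extract_awk() {
    awk -v list="$1" -F'^' '
        BEGIN {
            n = split(list, k, ",")
            for (i = 1; i <= n; i++) want[k[i]] = 1   # set of wanted keys
        }
        {
            out = ""
            for (i = 1; i <= NF; i++) {
                key = $i
                sub(/=.*/, "", key)                   # keep the text before the first =
                if (key in want) out = out $i "^"
            }
            if (out != "") print out                  # skip lines without any match
        }'
}

printf '1=A00^35=D^150=1^33=1\n1=a00^35=d^157=11^33=11\n' | extract_awk 35,150
# 35=D^150=1^
# 35=d^
```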


And here's the one-liner version :-)

echo '1=A00^35=D^150=1^33=1
1=a00^35=d^157=11^33=11' | awk -vLST=1,33 -F^ '{s="";split(LST,k,",");for(i=1;i<=NF;i++){for(j in k){split($i,arr,"=");if(arr[1]==k[j]){printf "%s%s=%s",s,arr[1],arr[2];s="^";}}}if(s!=""){print s;}}'


Given a file 'in' containing your strings (this relies on the tokens of interest always being fields 2 and 3):

$ for i in $(cut -d^ -f2,3 < in);do echo $i^;done
35=D^150=1^
35=D^150=2^