"tailing" a binary file based on string location using bash?

I've got a bunch of binary files, each containing an embedded string near the end of the file but at different places (it occurs only once in each file). I need to extract the part of the file starting at the location of the string until the end of the file and dump it into a new file.

E.g., if the file's contents are "AWREDEDEDEXXXERESSDSDS" and the string of interest is "XXX", then the part of the file I need is "XXXERESSDSDS".

What's the easiest way to do this in bash?


In Perl, there is a built-in variable, $' (called $POSTMATCH under "use English"), that holds the part of the string after the matched regular expression. That is the method I would use. It goes beyond Bash and the standard utilities, but Perl is so commonly installed that you should be OK.


The following is a small shell hack. It is not very fast, since dd copies one byte at a time, but it works.

Write the script file tail.sh as follows:

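#!/bin/sh
# Usage: tail.sh INPUTNAME OUTPUTNAME PATTERN
# grep -b -o prints "offset:match"; cut keeps the byte offset, which dd then skips.
dd bs=1 if="$1" of="$2" skip="$(grep --binary-files=text -m1 -b -o -- "$3" "$1" | cut -d ':' -f 1 | head -n 1)"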

Call tail.sh INPUTNAME OUTPUTNAME PATTERN

P.S.: Sorry, I forgot one grep option in my first post.
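If the byte-at-a-time dd copy is too slow, the same offset can be handed to tail -c instead. This is only a sketch, assuming GNU grep (for -a, -b, -o and -m); the function name tail_from_offset and its variable names are invented here. The pattern is matched as a fixed string (-F), which is a choice of this sketch rather than of the script above.

```shell
# Hypothetical helper: tail_from_offset PATTERN INPUT OUTPUT
tail_from_offset() {
    tfo_pat=$1
    tfo_src=$2
    tfo_dst=$3
    # grep -b -o prints "BYTEOFFSET:MATCH"; keep only the first offset.
    # LC_ALL=C makes grep treat the data as plain bytes.
    tfo_off=$(LC_ALL=C grep -a -b -o -F -m1 -- "$tfo_pat" "$tfo_src" \
        | head -n 1 | cut -d: -f1)
    # No match: report failure and write nothing.
    [ -n "$tfo_off" ] || return 1
    # grep offsets are 0-based, tail -c +N is 1-based.
    tail -c "+$((tfo_off + 1))" "$tfo_src" > "$tfo_dst"
}
```

Called as tail_from_offset XXX in.bin out.bin, it returns a non-zero status when the pattern does not occur in the input.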


Would strings and grep do what you want?

e.g.

strings -n 3 myfilename | grep XXX


 strings -n3 file_binary | awk '/XXX/{gsub(/.*XXX/,"");print}'


I came up with this solution:

ls -1 *.bin | xargs strings -n4 --radix=d -f | grep "string" | awk '{sub(/:/, ""); print $2 " " $1 " " $1".";}' | xargs -l1 split -b && rm *.aa

ls -1 *.bin
Print only the filenames with the extension "bin", one per line.

xargs strings -n4 --radix=d -f
List all the strings in each file together with their decimal positions, and include the filename in the output.

grep "string"
Print the lines containing "string" (it only occurs once in each file).

awk '{sub(/:/, ""); print $2 " " $1 " " $1".";}'
Remove the colon that strings adds after the filename, then print the position of the string, the filename, and the filename with a trailing period (this line is used as the arguments for the split command).

xargs -l1 split -b
Execute the split command once per line, using the output of awk as the rest of the arguments.

rm *.aa
Delete the first part of each split file. "aa" is the default suffix for the first part.

There are probably better/faster/safer ways of doing this but it's fine for my purposes.
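One possibly safer variant is a plain loop that avoids parsing ls output and leaves no split pieces to clean up. This is a sketch under assumptions: GNU grep is available, the pattern is matched as a fixed string, and the function name extract_all_tails and the ".tail" output suffix are invented for illustration. Unlike the strings-based pipeline, it cuts at the position of the pattern itself, not at the start of the printable string that contains it.

```shell
# Hypothetical helper: extract_all_tails PATTERN DIRECTORY
# Writes FILE.bin.tail next to each FILE.bin that contains PATTERN.
extract_all_tails() {
    eat_pat=$1
    eat_dir=$2
    for eat_f in "$eat_dir"/*.bin; do
        # Skip the unexpanded glob when no .bin files exist.
        [ -f "$eat_f" ] || continue
        # 0-based byte offset of the first match, empty if none.
        eat_off=$(LC_ALL=C grep -a -b -o -F -m1 -- "$eat_pat" "$eat_f" \
            | head -n 1 | cut -d: -f1)
        [ -n "$eat_off" ] || continue
        # tail -c +N counts from 1, so add one to grep's offset.
        tail -c "+$((eat_off + 1))" "$eat_f" > "$eat_f.tail"
    done
}
```

For example, extract_all_tails string . processes every .bin file in the current directory and skips files where the string is absent.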


Try this:

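grep -ao 'string.*' filename

Since you have binary data, you might want to redirect the output to a file. Keep in mind that grep matches line by line, so the output stops at the first newline byte after the string.

grep -ao 'string.*' filename > binary.out

Or pipe it through hexdump or similar for testing:

grep -ao 'string.*' filename | hd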