"tailing" a binary file based on string location using bash?
I've got a bunch of binary files, each containing an embedded string near the end of the file but at different places (only occurs once in each file). I need to extract the part of the file starting at the location of the string till the end of the file and dump it into a new file.
E.g., if the file's contents are "AWREDEDEDEXXXERESSDSDS" and the string of interest is "XXX", then the part of the file I need is "XXXERESSDSDS".
What's the easiest way to do this in bash?
In Perl, there is a built-in variable, $' (the postmatch variable), that refers specifically to the part of the string after the matched regular expression. That is the method I would use. It goes beyond Bash and the standard utilities, but Perl is so commonly installed that you should be OK.
The following is a small shell hack. It is not very performant, since dd with bs=1 copies one byte at a time, but it works.
Write the script file tail.sh as follows:
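#!/bin/sh
# Usage: tail.sh INPUTNAME OUTPUTNAME PATTERN
# grep -b -o prints "BYTEOFFSET:MATCH"; cut keeps the offset, which dd then skips.
dd bs=1 if="$1" of="$2" skip="$(grep --binary-files=text -m1 -b -o -e "$3" "$1" | cut -d ':' -f 1 | head -n 1)"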
Call tail.sh INPUTNAME OUTPUTNAME PATTERN
p.s.: sorry forgot one option to grep in first post
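A possible faster variant of the script above, sketched under a few assumptions: grep supports -a, -b and -o (GNU and BSD grep do; they are not POSIX), and the pattern is meant as a literal string (hence -F). The function name tail_from_string is hypothetical. Instead of dd with bs=1 it uses tail -c +N, which counts bytes from 1, so the 0-based offset reported by grep is incremented by one.

```shell
# Hypothetical helper: write the part of a file starting at the first
# occurrence of a literal string to a new file.
#   $1 = input file, $2 = output file, $3 = literal string to look for
tail_from_string() {
    src=$1
    dst=$2
    needle=$3
    # grep -b -o prints "OFFSET:MATCH"; keep only the first offset.
    offset=$(LC_ALL=C grep -a -b -o -F -m1 -e "$needle" "$src" | head -n 1 | cut -d: -f1)
    # No match: report failure instead of copying the whole file.
    [ -n "$offset" ] || return 1
    # grep offsets are 0-based, tail -c +N is 1-based.
    tail -c "+$((offset + 1))" "$src" > "$dst"
}

# Example call (file names are placeholders):
#   tail_from_string input.bin output.bin XXX
```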
Would strings and grep do what you want?
e.g.
strings -n 3 myfilename | grep XXX
strings -n3 file_binary | awk '/XXX/{gsub(/.*XXX/,"");print}'
I came up with this solution:
ls -1 *.bin | xargs strings -n4 --radix=d -f | grep "string" | awk '{sub(/:/, ""); print $2 " " $1 " " $1".";}' | xargs -l1 split -b && rm *.aa
ls -1 *.bin
    Print only the filenames with the extension "bin", one per line.
xargs strings -n4 --radix=d -f
    List all the strings in each file with their decimal positions, and include the filename in the output.
grep "string"
    Print the lines containing "string" (it occurs only once in each file).
awk '{sub(/:/, ""); print $2 " " $1 " " $1".";}'
    Remove the colon that strings adds after the filename, then print the position of the string, the filename, and the filename followed by a period (this line is used as the arguments for the split command).
xargs -l1 split -b
    Execute the split command once per line, using the output of awk as the rest of the arguments.
rm *.aa
    Delete the first part of each split file. "aa" is the default suffix for the first part produced by split.
There are probably better/faster/safer ways of doing this but it's fine for my purposes.
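For comparison, here is a sketch of a batch loop that avoids split and the rm *.aa cleanup. This is a different technique from the pipeline above, not a rewrite of it: it takes the byte offset from grep -b -o (a GNU/BSD grep extension) and copies the rest of the file with tail -c. The function name tail_all_bins and the ".tail" output suffix are hypothetical choices.

```shell
# Hypothetical helper: for every *.bin file in a directory, write the part
# starting at a literal string to "<file>.tail".
#   $1 = directory, $2 = literal string to look for
tail_all_bins() {
    dir=$1
    needle=$2
    for f in "$dir"/*.bin; do
        # Skip the unexpanded glob when there are no .bin files.
        [ -f "$f" ] || continue
        offset=$(LC_ALL=C grep -a -b -o -F -m1 -e "$needle" "$f" | head -n 1 | cut -d: -f1)
        if [ -z "$offset" ]; then
            echo "no match in $f" >&2
            continue
        fi
        # grep offsets are 0-based, tail -c +N is 1-based.
        tail -c "+$((offset + 1))" "$f" > "$f.tail"
    done
}

# Example call:
#   tail_all_bins . "string"
```

Quoting "$f" keeps filenames with spaces intact, which the ls | xargs form does not.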
Try this:
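grep -ao 'string.*' filename
Since you have binary data, you might want to redirect the output to a file:
grep -ao 'string.*' filename > binary.out
Or pipe it through hexdump or similar for testing:
grep -ao 'string.*' filename | hd
Note that grep works line by line, so the match ends at the first newline byte after the string. The pattern is quoted so the shell does not expand the * as a glob.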