Can this be done faster (read file, substitute [sed], write new file)
I use this piece of code in my bash script to read a file containing several hex strings, do some substitution, and then write the result to a new file. It takes about 30 minutes for a roughly 300 MB file.

I'm wondering if this can be done faster?

sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
    printf "%b" ${line} >> ${out_file}
    printf '\000\000' >> ${out_file}
done
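For reference, this is what the pipeline does to a tiny sample input (the '4142' content is a hypothetical example): sed wraps every hex pair as \\xHH, read without -r then strips one level of backslashes, and printf "%b" expands the resulting \xHH escapes into raw bytes.

$ printf '4142\n' > in_file
$ sed 's,[0-9A-Z]\{2\},\\\\x&,g' in_file
\\x41\\x42

So each line ends up as the bytes 0x41 0x42, followed by the two NUL bytes from the second printf.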
Update:
I did some testing and got the following results:
The original loop:

sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
    printf "%b" ${line} >> ${out_file}
    printf '\000\000' >> ${out_file}
done

real 44m27.021s
user 29m17.640s
sys 15m1.070s

The single-printf variant:

sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
    printf '%b\000\000' ${line}
done >> ${out_file}

real 18m50.288s
user 8m46.400s
sys 10m10.170s

The winner is:

export LANG=C
sed 's/$/0000/' ${in_file} | xxd -r -ps >> ${out_file}

real 0m31.528s
user 0m1.850s
sys 0m29.450s

You need the xxd command that comes with Vim.

export LANG=C
sed 's/$/0000/' ${in_file} | xxd -r -ps > ${out_file}
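To see why this works: xxd -r -ps reverses a plain ("postscript-style") hex dump back into raw bytes, and the sed 's/$/0000/' appends the four hex digits that become the trailing pair of NUL bytes; export LANG=C helps because sed can then process plain bytes instead of doing multibyte locale handling. A quick sanity check (hypothetical one-line input):

$ echo '4142' | sed 's/$/0000/' | xxd -r -ps | xxd
00000000: 4142 0000                                AB..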
This is slow because of the loop in bash. If you can get sed/awk/perl/etc. to do the loop, it will be much faster. I can't see how you could do it in sed or awk, though. It's probably pretty easy in Perl, but I don't know enough Perl to answer that for you (a rough sketch is attempted after the snippet below).
At the very least, you should be able to save a little time by refactoring what you have to:
sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
    printf '%b\000\000' ${line}
done >> ${out_file}
At least this way, you're running printf once per iteration and opening/closing ${out_file} once only.
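For what it's worth, the Perl route mentioned above might look roughly like this (an untested sketch, assuming one hex string per line; pack("H*", ...) converts a hex string to raw bytes):

perl -ne 'chomp; print pack("H*", $_), "\0\0"' ${in_file} > ${out_file}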
Switch to a full programming language? Here's a Ruby one-liner:
ruby -ne 'print "#{$_.chomp.gsub(/[0-9A-F]{2}/) { |s| s.to_i(16).chr }}\x00\x00"'
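Presumably it would be run as a filter over the same files, something like:

ruby -ne 'print "#{$_.chomp.gsub(/[0-9A-F]{2}/) { |s| s.to_i(16).chr }}\x00\x00"' ${in_file} > ${out_file}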
If you have Python, and assuming the data is simple (one hex pair per line):
$ cat file
99
AB
script:
o=open("outfile","w")
for line in open("file"):
s=chr(int(line.rstrip(),16))+chr(000)+chr(000)
o.write(s)
o.close()
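A quick way to check the result (assuming the script above is saved as script.py; the bytes correspond to the sample file shown earlier):

$ python script.py
$ xxd outfile
00000000: 9900 00ab 0000                           ......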