How can I use Python's re module to replace binary data inside a text file?
I know mixing text and binary is awful, but I have to do this.
I want to replace the binary content, which is surrounded by "Content-Type: image" and "-----", with the string "XXXXXXXX".
My test code is:
# coding=utf-8
import re
raw_data = open('r_img.txt').read()
#data = re.sub(r"Content-Type: image.*?-----","Content-Type: imageXXXXXXX-----", raw_data, re.S)
data = re.sub(r"Content-Type: image[^-]*-----","Content-Type: imageXXXXXXX-----", raw_data, re.S)
print data
And the file r_img.txt looks like this:
Content-Disposition: form-data; name="commodity_pic1"; filename="C:\Documents and Settings\tim\My Documents\My Pictures\Pic\222A8888.jpg"
Content-Type: image/pjpeg
EEE? JFIF H H EEE C
EEE C
EEEWhfEEE[e?EEEEEEqEEEEEEEEEEEEEEEZIOEEE(r5?-iEEEEEEEEEEEEEEE?EEE?EEEEEE
-----------------------------7db27132d0198
I have tried string.replace() and re.sub, but I still can't find the answer.
This works for me:
data = re.sub(r"Content-Type: image.*-----","Content-Type: imageXXXXXXX-----",
raw_data, 0, re.DOTALL)
Essentially, it greedily matches all characters between Content-Type: image and -----. The 0 is the count argument, and it means "replace all occurrences of this pattern". That is the default anyway, but you can't skip it when passing the flag positionally: the fourth positional argument of re.sub is count, not flags, so in your code re.S was being taken as the count. The flag re.DOTALL changes the meaning of "." ("any character") so that it also matches newlines.
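If the file can contain more than one image part, the greedy .* above would swallow everything from the first Content-Type: image to the last -----. Here is a minimal sketch of a variant, assuming that is not what you want: it uses a non-greedy .*? and passes the flag by keyword (flags=, available since Python 2.7), so the count placeholder is not needed. The sample text and the mask_images helper are made up for illustration; they stand in for the contents of r_img.txt.

```python
import re

# Hypothetical sample standing in for r_img.txt: two image parts,
# the first one containing a stray "-" inside the "binary" payload
# (which is why the [^-]* pattern from the question would not match it).
raw_data = (
    'Content-Disposition: form-data; name="pic1"; filename="a.jpg"\n'
    "Content-Type: image/pjpeg\n"
    "\n"
    "JFIF binary bytes - with a dash\n"
    "-----------------------------7db27132d0198\n"
    'Content-Disposition: form-data; name="pic2"; filename="b.jpg"\n'
    "Content-Type: image/pjpeg\n"
    "\n"
    "more binary bytes\n"
    "-----------------------------7db27132d0198--\n"
)


def mask_images(text):
    """Replace each image payload with a fixed placeholder (illustrative helper)."""
    return re.sub(
        r"Content-Type: image.*?-----",      # .*? stops at the first run of dashes
        "Content-Type: imageXXXXXXX-----",
        text,
        flags=re.DOTALL,                     # keyword avoids the count/flags mix-up
    )


data = mask_images(raw_data)
print(data)
```

Each image part is masked separately, and the Content-Disposition line of the second part is left alone. With the greedy pattern, that line would be replaced together with both payloads.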
HTH!