Getting rid of \x## in strings (Python)
I need to extract a description from a file, which looks like this: "TES4!\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00HEDR\x0c\x00\xd7\xa3p?h\x03\x00\x00\x00\x08\x00\xffCNAM\t\x00Martigen\x00SNAM\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00MAST\r\x00Fallout3.esm\x00DATA\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00MAST\x16\x00Mart's Mutant Mod.esm\x00DATA\x08"
I've already figured out how to get the part I need, but there's still some unwanted data in there that I don't know how to get rid of: \xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00
should become: Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\
Basically, I need a way to get rid of the \x## stuff (which, if left in there, will end up as weird characters when displayed in the GUI), but I haven't managed to remove it successfully.
[In case you were wondering, it's .esp files for FO3 I'm messing around with.]
You could try:
import string
cleaneddata = ''.join(c for c in data if c in string.printable)
This assumes that you already have data in a string.
Here's how it works for me:
>>> s = """TES4!\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00HEDR\x0c\x00\xd7\xa3p?h\x03\x00\x00\x00\x08\x00\xffCNAM\t\x00Martigen\x00SNAM\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00MAST\r\x00Fallout3.esm\x00DATA\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00MAST\x16\x00Mart's Mutant Mod.esm\x00DATA\x08"""
>>> print(''.join(c for c in s if c in string.printable))
TES4!HEDR
p?hCNAM MartigenSNAMMart's Mutant Mod - RC4
Diverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.
Fallout3.esmDATAMASTMart's Mutant Mod.esmDATA
>>>
Not ideal, as you can see, but that might at least be a good first step.
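If the data was read from the file in binary mode under Python 3, it is a bytes object, and iterating over it yields integers rather than one-character strings, so the filter above needs a small adjustment. A minimal sketch of the same idea, assuming bytes input (the name strip_unprintable is made up for illustration):

```python
import string

# Byte values of every character in string.printable (all are ASCII).
PRINTABLE = frozenset(string.printable.encode("ascii"))

def strip_unprintable(raw):
    """Drop every byte that is not a printable ASCII character."""
    return bytes(b for b in raw if b in PRINTABLE).decode("ascii")

print(strip_unprintable(b"SNAM\xaf\x00Mart's Mutant Mod - RC4\n\n\x00"))
```

Like the string version, this keeps the surrounding record names such as SNAM, so it is still only a first step.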
The first thing to do is pull up some docs on the file format. Near the bottom they show how the SNAM subrecord should be handled: use struct to read the length, then grab that many bytes from the string, which is null-terminated. (I'm guessing that you forgot to open the file in binary mode, since the count is off in your example.) After that there's nothing left to do, since we have what we came for.
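The steps above can be sketched roughly as follows. This assumes the layout the dump in the question suggests: a 4-byte subrecord name, an unsigned 16-bit little-endian length, then that many bytes of null-terminated text. The helper name read_subrecord is hypothetical, and finding the name with find() is a shortcut; a real parser would walk the records in order. The file should be opened with open(path, "rb") so the bytes arrive unmodified.

```python
import struct

def read_subrecord(data, tag):
    """Return the text of the first subrecord named `tag`, or None.

    Hypothetical helper. Assumed layout: 4-byte name, unsigned 16-bit
    little-endian length, then `length` bytes of null-terminated data.
    """
    pos = data.find(tag)  # naive search, see the note above
    if pos == -1:
        return None
    (length,) = struct.unpack_from("<H", data, pos + 4)
    payload = data[pos + 6:pos + 6 + length]
    # Keep only what precedes the terminating NUL byte.
    return payload.split(b"\x00", 1)[0]

# Fragments copied from the dump in the question.
sample = b"CNAM\t\x00Martigen\x00MAST\r\x00Fallout3.esm\x00"
print(read_subrecord(sample, b"CNAM"))
print(read_subrecord(sample, b"MAST"))
```

The result is still bytes; decoding it for display in the GUI depends on the encoding the file uses, which is not covered here.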
If you are up to the point of
\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00
you can get rid of the remaining unwanted \x## pieces as follows (this treats the escapes as literal backslash text, with the string held in text):
import re
exp = re.compile(r"\\x\w")
newStr = [s for s in text.split("\\x00") if not exp.search(s)]
newStr = "".join(newStr)