
Getting rid of \x## in strings (Python)

I need to extract a description from a file, which looks like this: "TES4!\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00HEDR\x0c\x00\xd7\xa3p?h\x03\x00\x00\x00\x08\x00\xffCNAM\t\x00Martigen\x00SNAM\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00MAST\r\x00Fallout3.esm\x00DATA\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00MAST\x16\x00Mart's Mutant Mod.esm\x00DATA\x08"

I've already figured out how to get the part I need, but there's still some unwanted data in there that I don't know how to get rid of: \xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00

should become: Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n

Basically, I need a way to get rid of the \x## stuff (which, if left in, ends up as weird characters when displayed in the GUI), but I haven't managed to remove it successfully.

[In case you were wondering, it's .esp files for FO3 I'm messing around with.]


You could try:

import string

cleaneddata = ''.join(c for c in data if c in string.printable)

This assumes that you already have data in a string.

Here's how it works for me:

>>> s = """TES4!\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00HEDR\x0c\x00\xd7\xa3p?h\x03\x00\x00\x00\x08\x00\xffCNAM\t\x00Martigen\x00SNAM\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00MAST\r\x00Fallout3.esm\x00DATA\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00MAST\x16\x00Mart's Mutant Mod.esm\x00DATA\x08"""
>>> print ''.join(c for c in s if c in string.printable)
TES4!HEDR
         p?hCNAM    MartigenSNAMMart's Mutant Mod - RC4

Diverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.

Fallout3.esmDATAMASTMart's Mutant Mod.esmDATA
>>> 

Not ideal as you can see but that might at least be a good first step.
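In Python 3, a file opened in binary mode yields bytes, and iterating over bytes gives integers rather than one-character strings, so the same filter needs a small adjustment. A minimal sketch, assuming Python 3; `keep_printable` and the `raw` sample are made-up names for illustration:

```python
import string

# Byte values of every character in string.printable (all of them are ASCII).
PRINTABLE = set(string.printable.encode("ascii"))

def keep_printable(raw):
    """Drop every byte that is not a printable ASCII character."""
    return bytes(b for b in raw if b in PRINTABLE).decode("ascii")

# Shortened, hypothetical fragment in the style of the question's data.
raw = b"SNAM\xaf\x00Mart's Mutant Mod - RC4\n\n\x00"
cleaned = keep_printable(raw)
print(cleaned)
```

Like the original one-liner, this keeps the record tags (SNAM and so on) and any length byte that happens to be printable, so it is still only a first step.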


The first thing to do is pull up some docs on the file format. The bottom of the page shows how the SNAM subrecord should be handled: use struct to read the length, then grab that many bytes from the string, which is null-terminated. (I'm guessing that you forgot to open the file in binary mode, since the count is off in your example.) After that there's nothing left to do, since we have what we came for.
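As a rough sketch of that approach in Python 3: it assumes each subrecord is a 4-byte tag followed by a 2-byte little-endian length and then the data. That matches the SNAM\xaf\x00 bytes in the question, but it should be checked against the format docs. `read_subrecord`, the sample bytes, and the cp1252 decoding are all assumptions made for illustration.

```python
import struct

def read_subrecord(data, tag):
    """Return the payload of the first subrecord named `tag`, or None.

    Assumed layout: tag (4 bytes), length (uint16, little-endian), payload.
    Searching for the tag with find() is a shortcut; a real parser would
    walk the records in order instead.
    """
    start = data.find(tag)
    if start == -1:
        return None
    size_at = start + len(tag)
    (size,) = struct.unpack_from("<H", data, size_at)
    payload = data[size_at + 2 : size_at + 2 + size]
    return payload.rstrip(b"\x00")  # drop the null terminator

# Hypothetical sample built so that the stored length matches the payload.
description = b"Mart's Mutant Mod - RC4\r\n\r\nDiverse creatures & NPCs.\r\n\r\n\x00"
sample = (
    b"CNAM" + struct.pack("<H", 9) + b"Martigen\x00"
    + b"SNAM" + struct.pack("<H", len(description)) + description
    + b"MAST"
)

# The text encoding is an assumption; errors="replace" avoids a crash on odd bytes.
text = read_subrecord(sample, b"SNAM").decode("cp1252", errors="replace")
print(text)
```

For a real file, read it with open(path, "rb") so that the stored length agrees with the bytes actually read.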


If you are up to the point of

\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00

you can get rid of the remaining unwanted \x## sequences like this:

import re

# Assumes `text` holds the escapes as literal characters (backslash, "x", two hex digits).
exp = re.compile(r"\\x\w")
parts = [s for s in text.split("\\x00") if not exp.search(s)]
newStr = "".join(parts)