Why is BeautifulSoup modifying my self-closing elements?
This is the script I have:
import BeautifulSoup

if __name__ == "__main__":
    data = """
<root>
<obj id="3"/>
<obj id="5"/>
<obj id="3"/>
</root>
"""
    soup = BeautifulSoup.BeautifulStoneSoup(data)
    print soup
When run, this prints:
<root>
<obj id="3"></obj>
<obj id="5"></obj>
<obj id="3"></obj>
</root>
I'd like it to keep the same structure. How can I do that?
From the Beautiful Soup documentation:
The most common shortcoming of BeautifulStoneSoup is that it doesn't know about self-closing tags. HTML has a fixed set of self-closing tags, but with XML it depends on what the DTD says. You can tell BeautifulStoneSoup that certain tags are self-closing by passing in their names as the selfClosingTags argument to the constructor.