etree.findall: 'OR'-lookup?
I want to find all stylesheet definitions in an XHTML file with lxml.etree.findall. This could be as simple as
elems = tree.findall('link[@rel="stylesheet"]') + tree.findall('style')
But the problem with CSS style definitions is that the order matters, e.g.
<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />
If the contents of the style tag are applied after the rules in the two link tags, the result may be completely different from the one where the rules are applied in order of definition.
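A minimal sketch of the problem, assuming the three tags sit under a hypothetical <head> wrapper element (not part of the snippet above) so that they share a parent:

```python
from lxml import etree

# Hypothetical <head> wrapper so the three tags have a common parent.
head = etree.fromstring(
    '<head>'
    '<link rel="stylesheet" type="text/css" href="/media/css/first.css" />'
    '<style>body {font-size: 10px;}</style>'
    '<link rel="stylesheet" type="text/css" href="/media/css/second.css" />'
    '</head>'
)

# Two separate lookups, concatenated: each list is ordered on its own,
# but the combined list no longer reflects the order in the document.
concatenated = head.findall('link[@rel="stylesheet"]') + head.findall('style')
tags = [el.tag for el in concatenated]
print(tags)  # ['link', 'link', 'style']
```

The style element ends up last even though it is defined between the two link elements.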
So, how would I do a lookup that includes both link[@rel="stylesheet"] and style?
This is possible with XPath, using the union operator (|):
data = """<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />
"""
from lxml import etree
h = etree.HTML(data)
h.xpath('//link[@rel="stylesheet"]|//style')
[<Element link at 97a007c>,
<Element style at 97a002c>,
<Element link at 97a0054>]
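If you would rather stay with the ElementTree-style API, something along these lines might also work. It is only a sketch: it relies on lxml's iter() accepting several tag names, and the rel="icon" link is a made-up extra element added here to show that non-stylesheet links are skipped.

```python
from lxml import etree

data = """<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="icon" href="/favicon.ico" />
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />
"""

root = etree.HTML(data)

# Walk all link and style elements in document order,
# keeping every style element and only the stylesheet links.
sheets = [
    el for el in root.iter('link', 'style')
    if el.tag == 'style' or el.get('rel') == 'stylesheet'
]

# href for external stylesheets, text content for inline ones.
summary = [(el.tag, el.get('href') or el.text) for el in sheets]
for entry in summary:
    print(entry)
```

Note that this uses the HTML parser, which ignores namespaces. If the file is parsed as XML and declares the XHTML namespace, the tag names would have to be namespace-qualified.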