Processing just HTML fragment and returning it
When I do the following with Nokogiri:
require 'nokogiri'

some_html = '<img src="bleh.jpg"/>test<br/>'
f = Nokogiri::HTML(some_html)
# do some processing
puts f
It prints the whole HTML document structure, with the markup above wrapped inside it.
How can I print/return/get just the HTML that is in the some_html variable?
No, f will return:
"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body>\n<img src=\"bleh.jpg\">test<br>\n</body></html>\n"
I only want the inner/fragment part:
<img src=\"bleh.jpg\">test<br>
Instead of parsing with Nokogiri::HTML(...), use Nokogiri::HTML::fragment(...):
asdf = Nokogiri::HTML::fragment('<img src="bleh.jpg">test<br>')
print asdf.to_html
# >> <img src="bleh.jpg">test<br>
What do you mean by the 'html' part?
Just call f.text() to get the inner text.