Nokogiri unescaped html

2023-03-23 10:49 Q&A author:

I am parsing HTML text using Nokogiri and making some changes to that HTML.

doc = Nokogiri::HTML.parse(html_code)

But I am using Mustache with that HTML, so it contains Mustache variables enclosed in curly braces, e.g. {{mustache_variable}}.

After tinkering with the Nokogiri document, when I call

doc.to_html

the curly braces are escaped and I get something like %7B%7Bmustache_variable%7D%7D.

However, not all of the content is escaped. For example, if I have HTML such as

<label> {{mustache_variable}} </label>

it returns <label> {{mustache_variable}} </label>

But for HTML like <img src='{{mustache_variable}}'>

it returns <img src='%7B%7Bmustache_variable%7D%7D'>

So I am currently doing a gsub to replace %7B and %7D with { and } respectively, so that Mustache works.

Is there a way to get the exact HTML back from Nokogiri, or is there a better solution?
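For reference, the gsub workaround described in the question might look roughly like the sketch below. The helper name restore_braces is made up for illustration, and the input string is copied from the question rather than produced by Nokogiri, so the snippet runs on its own.

```ruby
# Hypothetical helper (the name is illustrative, not from the question):
# undo the percent-encoding of curly braces in serialized HTML.
def restore_braces(html)
  html.gsub('%7B', '{').gsub('%7D', '}')
end

# Sample string copied from the question, so no Nokogiri is needed here.
puts restore_braces("<img src='%7B%7Bmustache_variable%7D%7D'>")
# prints: <img src='{{mustache_variable}}'>
```

Note that this replaces every %7B and %7D in the document, including any that were percent-encoded on purpose.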

You probably need the cgi module:

require 'cgi'
doc = Nokogiri::HTML.parse(html_code)
CGI.unescapeHTML(doc.to_html)

Or you can use the htmlentities library.

Also, try using doc.content instead of doc.to_html.
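One thing worth checking before relying on this: %7B and %7D are URL percent-encoding, not HTML entities, so CGI.unescapeHTML may not touch them, while CGI.unescape does. The quick comparison below is only a sketch; the input string is copied from the question and stands in for what doc.to_html returned.

```ruby
require 'cgi'

# Sample string copied from the question (assumed output of doc.to_html).
escaped = "<img src='%7B%7Bmustache_variable%7D%7D'>"

# unescapeHTML only decodes HTML entities (&amp;, &lt;, ...), so the
# percent-encoded braces are left as they are.
puts CGI.unescapeHTML(escaped)

# unescape decodes percent-encoding, but it decodes every %XX sequence
# and also turns "+" into a space, so it can be too broad for a whole page.
puts CGI.unescape(escaped)
```

If other URLs in the document legitimately contain percent-encoded characters, decoding the whole string this way would change them too.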

I ran into this same problem and ended up using a regular expression to convert the escaped double braces:

html_doc.gsub(/%7B%7B(.+?)%7D%7D/, '{{\1}}')

To make this safer, I'd recommend prefixing each mustache variable with a namespace, just in case some of the HTML does have the escaped double brace pattern intentionally, e.g.

html_doc.gsub(/%7B%7Bnamespace(.+?)%7D%7D/, '{{namespace\1}}')
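The two gsub calls above could be folded into one small helper, sketched below. The method name restore_mustache and its namespace: keyword are invented for this example, and the input is assumed to be the string returned by doc.to_html.

```ruby
# Hypothetical helper: restore {{...}} tags that were percent-encoded.
# `namespace` is an assumed optional prefix shared by all template variables;
# with the default empty prefix, every %7B%7B...%7D%7D pair is restored.
def restore_mustache(html, namespace: '')
  prefix = Regexp.escape(namespace)
  # Non-greedy match so each tag is handled separately.
  html.gsub(/%7B%7B(#{prefix}.+?)%7D%7D/) { "{{#{Regexp.last_match(1)}}}" }
end

html = "<img src='%7B%7Bns_photo%7D%7D' alt='%7B%7Bliteral%7D%7D'>"

puts restore_mustache(html)
# prints: <img src='{{ns_photo}}' alt='{{literal}}'>

puts restore_mustache(html, namespace: 'ns_')
# prints: <img src='{{ns_photo}}' alt='%7B%7Bliteral%7D%7D'>
```

With a namespace, only tags whose name starts with that prefix are restored, so intentionally escaped braces elsewhere in the page stay untouched.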
