
Ruby sanitize code: why is & sanitized?

I currently use the following code to sanitize a string before storing it:

ERB::Util::h(string)

My problem occurs when the string has been sanitized already like this:

string = "Watching baseball &amp; football"

The sanitized string will then look like this, with the ampersand escaped a second time:

sanitized_string = "Watching baseball &amp;amp; football"

Can I sanitize by just turning < into &lt; and > into &gt; via substitution?


Unescape first, then escape again:

require 'cgi'
string = "Watching baseball &amp; football"

CGI.escapeHTML(CGI.unescapeHTML(string))

=> "Watching baseball &amp; football"
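The same idea can be wrapped in a small helper so that escaping becomes idempotent. This is only a sketch: the name `escape_html_once` is made up for illustration, and it assumes every entity already present in the input is one that should be preserved rather than displayed literally.

```ruby
require 'cgi'

# Hypothetical helper (the name is an assumption, not a library method):
# unescape first so existing entities are not escaped a second time.
def escape_html_once(value)
  CGI.escapeHTML(CGI.unescapeHTML(value.to_s))
end

raw     = "Watching baseball & football"
escaped = "Watching baseball &amp; football"

puts escape_html_once(raw)                     # Watching baseball &amp; football
puts escape_html_once(escaped)                 # Watching baseball &amp; football
puts escape_html_once(escape_html_once(raw))   # Watching baseball &amp; football
```

Raw and already-escaped input end up in the same form, so running the helper on stored data again should not change it.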


A fast approach, based on a snippet from Erubis:

ESCAPE_TABLE = { '<'=>'&lt;', '>'=>'&gt;' }
def custom_h(value)
   value.to_s.gsub(/[<>]/) { |s| ESCAPE_TABLE[s] }
end
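A quick check of how this behaves on the string from the question. The definition is repeated here only so the snippet runs on its own, and the sample inputs are illustrative.

```ruby
ESCAPE_TABLE = { '<' => '&lt;', '>' => '&gt;' }.freeze

def custom_h(value)
  # Only angle brackets are replaced; '&' and quotes pass through untouched.
  value.to_s.gsub(/[<>]/) { |s| ESCAPE_TABLE[s] }
end

puts custom_h("Watching baseball &amp; football")  # Watching baseball &amp; football
puts custom_h('<b title="x">bold</b>')            # &lt;b title="x"&gt;bold&lt;/b&gt;
```

Because quotes and ampersands are left alone, output from this helper is probably not safe to place inside an HTML attribute value.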


Yes, you can. Taking it further, you can delete entire tags with a basic regex like this:

mystring.gsub( /<(.|\n)*?>/, '' )
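A short sketch of what that regex does in practice. The `strip_tags` wrapper name and the sample strings are assumptions for illustration.

```ruby
# Hypothetical wrapper around the regex above.
def strip_tags(text)
  # Non-greedy match from '<' to the next '>', including across newlines.
  text.to_s.gsub(/<(.|\n)*?>/, '')
end

puts strip_tags("Watching <b>baseball</b> &amp; football")  # Watching baseball &amp; football
puts strip_tags("1 < 2 and 3 > 2").inspect                  # "1  2"
```

The second call shows one of the rough edges: anything between a literal `<` and a later `>` is removed, even when it is not a tag.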


You could write your own sanitizer, but there are lots of corner cases and tricky edges in sanitization.

A better approach might be to unescape your string before sanitizing it. Does h() have an inverse you could put your strings through first?
