Ruby Sanitize Code ... why is & sanitized
I currently use the following code to sanitize a string before storing it:
ERB::Util::h(string)
My problem occurs when the string has already been sanitized, like this:
string = "Watching baseball &amp; football"
The sanitized string will then look like:
sanitized_string = "Watching baseball &amp;amp; football"
Can I sanitize by just turning < into &lt; and > into &gt; via substitution?
Unescape first, then escape again:
require 'cgi'
string = "Watching baseball &amp; football"
CGI.escapeHTML(CGI.unescapeHTML(string))
=> "Watching baseball &amp; football"
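As a rough sketch of why this works, the unescape step brings both raw and already-escaped input back to the same plain form, so the escape step is applied exactly once. The helper name `escape_once` is made up for illustration and is not part of CGI or ERB:

```ruby
require 'cgi'

# Hypothetical helper (the name is an assumption, not a library method):
# unescape first so already-escaped input is normalized, then escape once.
def escape_once(value)
  CGI.escapeHTML(CGI.unescapeHTML(value.to_s))
end

raw     = "Watching baseball & football"
escaped = "Watching baseball &amp; football"

puts escape_once(raw)               # Watching baseball &amp; football
puts escape_once(escaped)           # Watching baseball &amp; football
puts escape_once(escape_once(raw))  # Watching baseball &amp; football
```

One trade-off to keep in mind: if a user genuinely types the literal text `&amp;` and expects to see it displayed verbatim, this approach treats it as an already-escaped ampersand.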
A fast approach, based on a snippet from Erubis:
# Only angle brackets are escaped; an existing & is left untouched.
ESCAPE_TABLE = { '<' => '&lt;', '>' => '&gt;' }
def custom_h(value)
  value.to_s.gsub(/[<>]/) { |s| ESCAPE_TABLE[s] }
end
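To illustrate how a bracket-only escaper behaves, here is a small self-contained sketch. The names `ANGLE_ESCAPES` and `escape_angle_brackets` are placeholders chosen for this example, and it uses the hash form of `String#gsub` rather than a block:

```ruby
# Placeholder names for illustration; only < and > are replaced.
ANGLE_ESCAPES = { '<' => '&lt;', '>' => '&gt;' }.freeze

def escape_angle_brackets(value)
  # gsub with a hash looks up each matched character in the table.
  value.to_s.gsub(/[<>]/, ANGLE_ESCAPES)
end

puts escape_angle_brackets("Watching baseball &amp; football")
# Watching baseball &amp; football
puts escape_angle_brackets("<script>alert(1)</script>")
# &lt;script&gt;alert(1)&lt;/script&gt;
```

Already-escaped ampersands pass through unchanged, which avoids the double-escaping problem. Quotes are not escaped, though, so the result is likely not safe to place inside an HTML attribute value.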
Yes, you can. Taking it further, you can delete entire tags with a basic regex like this:
mystring.gsub( /<(.|\n)*?>/, '' )
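A quick sketch of what that regex does in practice, including one input where it removes more than intended. The wrapper name `strip_tags_naive` is invented for this example:

```ruby
# Invented wrapper around the regex above, for demonstration only.
def strip_tags_naive(value)
  value.to_s.gsub(/<(.|\n)*?>/, '')
end

puts strip_tags_naive("<b>bold</b> text")
# bold text

# Plain text that happens to contain < and > is treated as a tag,
# so everything between them is deleted.
puts strip_tags_naive("1 < 2 and 3 > 2")
# 1  2
```

The second case shows why a regex like this is best treated as a convenience rather than a reliable sanitizer.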
You could write your own sanitizer, but there are lots of corner cases and tricky edges in sanitization.
A better approach might be to unencode your string before sanitizing it - does h() have an inverse you could put your strings through first?