RegEx to find % symbols in a string that don't form the start of a legal two-digit escape sequence?
I would like a regular expression to find the % symbols in the source string that don't form the start of a valid two-hex-digit escaped character (defined as a % followed by exactly two hexadecimal digits, upper or lower case), so that only these % symbols can be replaced with %25.
(The motivation is to make a best-guess attempt at producing legally escaped strings from strings of various origins, which may be correctly percent-escaped, unescaped, or a mixture of the two, without damaging the data if the original string was already correctly encoded, as blanket re-encoding would.)
Here's an example input string.
He%20has%20a%2050%%20chance%20of%20living%2C%20but%20there%27s%20only%20a%2025%%20chance%20of%20that.
This doesn't conform to any encoding standard because it is a mix of valid escaped characters, e.g. %20, and two loose percentage symbols. I'd like to convert those % symbols to %25.
My progress so far is to identify a regex, %[0-9a-z]{2}, that finds the % symbols that are legal, but I can't work out how to modify it to find the ones that aren't legal.
%(?![0-9a-fA-F]{2})
This should do the trick. It uses a negative look-ahead to find a % that is NOT followed by a valid two-digit hexadecimal value; replace each % it finds with %25.
(Hopefully this works with (presumably) NSRegularExpression, or whatever you're using)
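As a rough illustration of the replacement step, here is a minimal sketch in Python (chosen only for brevity; the asker's actual regex engine is not known). The helper name escape_loose_percents is hypothetical, and the input is the example string from the question.

```python
import re
from urllib.parse import unquote

# A '%' that is NOT followed by two hex digits (the pattern from the answer).
LOOSE_PERCENT = re.compile(r"%(?![0-9a-fA-F]{2})")

def escape_loose_percents(text: str) -> str:
    """Replace each stray '%' with '%25', leaving valid %XX escapes alone.

    Hypothetical helper name, used here for illustration only.
    """
    return LOOSE_PERCENT.sub("%25", text)

source = ("He%20has%20a%2050%%20chance%20of%20living%2C%20but%20"
          "there%27s%20only%20a%2025%%20chance%20of%20that.")

fixed = escape_loose_percents(source)
print(fixed)
# He%20has%20a%2050%25%20chance%20of%20living%2C%20but%20there%27s%20only%20a%2025%25%20chance%20of%20that.

# Decoding the repaired string recovers the intended text.
print(unquote(fixed))
# He has a 50% chance of living, but there's only a 25% chance of that.
```

Because the look-ahead consumes no characters, the second % in %%20 is still examined on its own and is left alone as the start of the valid escape %20.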
%(?![a-fA-F0-9]{2})
That's a percent followed by a negative lookahead for two hex digits.
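To see how the negative lookahead behaves on edge cases, the following Python sketch may help. The function name fix and the sample inputs are made up for illustration. It also checks that applying the replacement a second time changes nothing, which is what makes it safe on strings that were already correctly encoded.

```python
import re

# Same pattern as above: '%' not followed by two hex digits.
pattern = re.compile(r"%(?![a-fA-F0-9]{2})")

def fix(text: str) -> str:
    # Hypothetical helper: escape only the stray '%' characters.
    return pattern.sub("%25", text)

# Illustrative inputs (not from the original question) and expected results.
cases = {
    "100%": "100%25",     # '%' at the very end of the string
    "%2": "%252",         # only one hex digit follows
    "%zz": "%25zz",       # two characters follow, but they are not hex
    "%%41": "%25%41",     # first '%' is stray, second starts a valid escape
    "%41": "%41",         # valid escape, untouched
    "%aF": "%aF",         # mixed-case hex digits are accepted
}

for raw, expected in cases.items():
    once = fix(raw)
    assert once == expected, (raw, once)
    # Every inserted '%25' is itself a valid escape, so a second pass is a no-op.
    assert fix(once) == once
```

One caveat consistent with the "best guess" goal in the question: a literal % that happens to be followed by two hex digits (for example, the text 50%20 meant as "50%" then "20") cannot be distinguished from a real escape and is left as-is.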