Substitute one character in a line only if it occurs before another specific character
I have a case where certain URLs are getting malformed with stray equal signs, and I need to be able to remove them.
An example broken URL:
http://www.go=ogle.com/search?ie=UTF
to be corrected to this:
http://www.google.com/search?ie=UTF
I cannot simply replace the first occurrence of = because not all URLs are broken like this.
Is there a sed/awk or other regex way of deleting all instances of = only if they occur before a question mark?
Use:
sed -e 's~\(http://[^?]*\)=\([^?]*\)~\1\2~'
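This strips one = character from a URL that starts with http://, looking only at the part before the first ? (neither [^?]* group can match a question mark). If that part contains several = characters, only one of them is removed.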
Edit: looking at it again, this is a lot cleaner, as long as the stray = is in the host name (before the first / or ?):
sed -e 's~\(http://[^/?]*\)=~\1~'
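If a URL can contain more than one stray = before the ?, the commands above only remove one of them. Here is a rough sketch of a looping variant that keeps substituting until none are left. It assumes one http:// URL per line and only touches URLs that actually contain a ?. The function name strip_eq_before_query is made up for illustration.

```shell
# Hypothetical helper: delete every "=" that sits between "http://" and the
# first "?" on a line. Assumes at most one URL per line.
strip_eq_before_query() {
  # :a  defines a label.
  # s   removes one "=" that is preceded only by non-"?"/non-"=" characters
  #     and followed by non-"?" characters up to a literal "?"
  #     (a plain "?" is literal in basic regular expressions).
  # ta  jumps back to the label whenever a substitution was made.
  sed -e ':a' -e 's~\(http://[^?=]*\)=\([^?]*?\)~\1\2~' -e 'ta'
}
```

For example:

```shell
printf '%s\n' 'http://w=ww.go=ogle.com/search?ie=UTF' | strip_eq_before_query
# prints http://www.google.com/search?ie=UTF
```

Because the pattern requires a literal ? after the =, a URL with no query string is left unchanged, and the = in ie=UTF is never reached.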