url parameters regex

I've created my own newsletter module and have come across one (big) problem. The system appends additional parameters to every URL so that clicks can be tracked in Google Analytics.

For example, a URL like this

http://www.domain.com

becomes like this

http://www.domain.com/&utm_source=newsletter&utm_medium=e-mail&utm_campaign=test

and a URL like this

http://www.domain.com/?page=1

becomes like this

http://www.domain.com/?page=1&utm_source=newsletter&utm_medium=e-mail&utm_campaign=test

The result in the first example is broken. I know the first ampersand has to be replaced by a question mark, and that's where the problem occurs. I'm using this pattern to extract URLs:

$pattern = array('#[a-zA-Z]+://([-]*[.]?[a-zA-Z0-9_/-?&%\{\}])*#');
$replace = array('\\0&utm_source=newsletter&utm_medium=e-mail&utm_campaign=test');
$body = preg_replace($pattern,$replace,$body);

Can anybody help me with a correct and working regex, so that the first URL parameter is always preceded by a question mark instead of an ampersand?


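Just check whether the URL already contains a question mark:

if (strpos($string, '?') !== false) {
    // a query string is already present: add the parameters with an ampersand
} else {
    // no query string yet: add the parameters with a question mark
}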
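
A fuller version of that check might look like the sketch below. The function name add_tracking_params and its $params argument are made up for illustration; the parameter values are the ones from the question. It assumes the URL has no #fragment part, which would need to be split off first.

```php
<?php
// Hypothetical helper: append tracking parameters to a single URL.
// Assumption: the URL contains no #fragment.
function add_tracking_params(string $url, array $params): string
{
    // Use '&' if a query string is already present, '?' otherwise.
    $separator = (strpos($url, '?') !== false) ? '&' : '?';

    // Build "key=value&key=value" with an explicit '&' separator.
    return $url . $separator . http_build_query($params, '', '&');
}

$params = [
    'utm_source'   => 'newsletter',
    'utm_medium'   => 'e-mail',
    'utm_campaign' => 'test',
];

echo add_tracking_params('http://www.domain.com/', $params), "\n";
echo add_tracking_params('http://www.domain.com/?page=1', $params), "\n";
```

The first call should yield a URL whose query starts with ?utm_source=..., the second one keeps ?page=1 and continues with &utm_source=....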


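Not a pattern-matching solution, but it would work. It checks for a ? and, if none is found, changes the first & to a question mark. Note that str_replace() has no limit argument (its fourth parameter receives the replacement count by reference), so preg_replace() with a limit of 1 is used for the single swap:

$url = (substr_count($url, '?') > 0) ? $url : preg_replace('/&/', '?', $url, 1);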


A very simple approach would be to look for a string like http://...& where the ... contains no ? question mark or other delimiters:

$src = preg_replace('#(http://[^\s"\'<>?&]+)&#', '$1?', $src);

But it's probably best if you use a restricted instead of a negated character class:

$src = preg_replace('#(http://[\w/.]+)&#', '$1?', $src);


This solution fixes all urls which have a query beginning with a & (and are missing the ?):

$re = '%([a-zA-Z]+://[^?&\s]+)&(utm_source=newsletter)%';
$body = preg_replace($re, '$1?$2', $body);
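
Instead of repairing the URLs afterwards, the separator could also be chosen while the parameters are appended. The following is only a sketch under assumptions: the URL pattern is deliberately simple (not a full URL grammar), the sample $body is invented, and URLs with a #fragment or with trailing punctuation directly after them are not handled.

```php
<?php
// Tracking parameters from the question.
$tracking = 'utm_source=newsletter&utm_medium=e-mail&utm_campaign=test';

// Simplified URL pattern (an assumption): a scheme, "://", then everything
// up to the next whitespace, quote or angle bracket.
$pattern = '#[a-zA-Z]+://[^\s"\'<>]+#';

// Invented sample text for illustration.
$body = 'Visit http://www.domain.com/ or http://www.domain.com/?page=1 today.';

$body = preg_replace_callback($pattern, function (array $m) use ($tracking) {
    $url = $m[0];
    // '&' if the URL already has a query string, '?' otherwise.
    $separator = (strpos($url, '?') !== false) ? '&' : '?';
    return $url . $separator . $tracking;
}, $body);

echo $body, "\n";
```

With preg_replace_callback() each matched URL is inspected on its own, so a single pass covers both URLs with and without an existing query string.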