How can I rewrite URLs except those of a particular domain?

Can you please help me write a Perl regexp that replaces (http://.+) with http://www.my1.com/redir?$1 but does nothing for URLs like http://www.my1.com/ or http://my1.com/?

For instance, I need to replace http://whole.url.site.com/foo.htm with http://www.my1.com/redir?http://whole.url.site.com/foo.htm and http://www.google.com with http://www.my1.com/redir?http://www.google.com, but leave http://www.my1.com/index.php untouched.

Thanks a lot!


If you are doing this inside a Perl script, don't use regular expressions. It's a mess to read them in this case, and so far every regex answer is broken since it doesn't URI escape the stuff that you want to put into the query string.

Instead of trying to parse a URI yourself, let the time-tested URI module handle all the edge cases for you. The URI::Escape module helps you make the query string so you don't get zapped by odd characters in URLs:

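#!perl
use strict;
use warnings;

use URI;
use URI::Escape;

while( <DATA> )
    {
    chomp;

    my $url = URI->new( $_ );

    # Leave my1.com and its subdomains alone; the (^|\.) anchor keeps
    # look-alikes such as moremy1.com from matching.
    if( $url->host =~ /(^|\.)my1\.com$/ ) {
        print "$url\n";
        }
    else {
        # Escape the whole original URL so it is safe inside a query string.
        my $query_string = uri_escape($url->as_string);
        print "http://www.my1.com/redir?$query_string\n";
        }
    }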

__DATA__
http://whole.url.site.com/foo.htm
http://www.google.com
http://www.google.com/search?q=perl+uri
http://www.my1.com/index.php
http://my1.com/index.php
http://moremy1.com/index.php
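
To see why the escaping step matters, here is a small round-trip sketch for the one sample URL that carries its own query string. How the redirector at www.my1.com actually reads its argument is not stated in the question, so the "take everything after the first ? and unescape it" step is an assumption made only for illustration.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_unescape);

my $target = 'http://www.google.com/search?q=perl+uri';

# Escaped form: characters such as : / ? = + become %XX sequences,
# so they cannot be confused with the redirector's own query syntax.
my $escaped  = uri_escape($target);
my $redirect = "http://www.my1.com/redir?$escaped";
print "$redirect\n";

# Assumed redirector behaviour: take everything after the first "?"
# and unescape it to recover the original URL.
my ($payload) = $redirect =~ /\?(.*)\z/s;
my $recovered = uri_unescape($payload);
print $recovered eq $target ? "round trip ok\n" : "round trip failed\n";
```

Without the escaping, the embedded ? and + would be left for the redirector to guess at.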


s{http://www\.nop1\.com/}{http://www.my1.com/redir?http://www.nop1.com}g

Meets your requirements as stated.

If your requirements are a little bit different, you'll need to explain exactly what you want.

Also, I'm not sure what this has to do with negative lookahead.

EDIT: With the reformulated question, here we go:

s{^(http://(?!(?:www\.)?my1\.com).+)}{http://www.my1.com/redir?$1}g

(tweaked it a little)


You may be wanting to capture the sitename of the URL, if so try this:

 s{http://www\.(.*?)\.com/}{http://www.my1.com/redir?http://www.$1.com}g


It's probably not a good idea but it can be done:

$foo='http://www.foo.com/';
$foo =~ s#^(http://(?!(?:www\.)?my1\.com/).+)$#http://www.my1.com/redir?$1#;
print $foo;

Result:

http://www.my1.com/redir?http://www.foo.com/

As Brian points out in a comment, this doesn't handle URLs that lack a trailing '/': the lookahead requires a '/' right after my1.com, so a bare http://www.my1.com still gets rewritten. I'm not sure whether you want to rewrite that URL or not. As I said in my comment on your question, you really need to be more precise about what you are trying to do and why you need to use regular expressions for this task.
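
If you do stay with a regex, one possible way around the trailing-slash problem is to have the lookahead check for the end of the host name instead of a literal '/'. This is only a sketch: the subroutine name redirect_unless_my1 is made up for the example, it handles plain http:// URLs only, and it produces the unescaped output format shown in the question, so the escaping concern raised in the URI answer still applies.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from any of the answers above.
sub redirect_unless_my1 {
    my ($url) = @_;

    # The negative lookahead rejects my1.com and www.my1.com only when the
    # host really ends there: the next character must be /, :, ?, # or the
    # end of the string. Hosts like my1.com.evil.example are still rewritten.
    $url =~ s{\A(http://(?!(?:www\.)?my1\.com(?:[/:?#]|\z)).+)\z}
             {http://www.my1.com/redir?$1}i;

    return $url;
}

print redirect_unless_my1($_), "\n" for (
    'http://www.google.com',
    'http://my1.com',
    'http://www.my1.com/index.php',
    'http://moremy1.com/index.php',
);
```

Here the two my1.com URLs come back unchanged, while the google.com and moremy1.com URLs are wrapped in the redirector.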


s|(http://www\.(?!my1\.)(.*)\.com)|http://www.my1.com/redir?$1|i;

This matches any www.*.com website that isn't www.my1.com and puts it in the redirect.
