Parsing a URL inside a string
How would I go about matching the following string format (everything after the equals sign, up to the end of the .html)
http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html
inside the string below:
http://www.tticker.com/me0439-119?url=http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html%3Fc%2acn%2CSA400
(?<==).*?\.html
Test with grep:
kent$ echo "http://www.tticker.com/me0439-119?url=http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html%3Fc%2acn%2CSA400"|grep -Po "(?<==).*?\.html"
http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html
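One thing to watch: the lookbehind `(?<==)` starts matching after the first `=` anywhere in the string, so it only works while `url` is the first query parameter. A minimal sketch of a tighter variant follows, anchoring the lookbehind to `url=` instead. The `id=7&` parameter in the sample input is hypothetical, added only to show the difference, and `-P` assumes GNU grep built with PCRE support.

```shell
# Hypothetical input: same URL as above, with an extra id=7 parameter before url=
input='http://www.tticker.com/me0439-119?id=7&url=http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html%3Fc%2acn%2CSA400'

# Anchor the lookbehind to "url=" so earlier parameters are skipped;
# the lazy .*? stops at the first ".html"
match=$(printf '%s\n' "$input" | grep -Po '(?<=url=).*?\.html')

printf '%s\n' "$match"
```

With the original `(?<==)` pattern the same input would yield a match that starts at `7&url=...`, since that is the text after the first equals sign.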
The simplest I could come up with was:
/url=(http.*\.html)/
Use the capture group for your URL.
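As a rough illustration of using that capture group from the command line, here is a sketch with `sed -E`, which has no lookbehind but does support backreferences. The variable names `url` and `encoded` are just placeholders for this example.

```shell
url='http://www.tticker.com/me0439-119?url=http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html%3Fc%2acn%2CSA400'

# Replace the whole line with capture group 1; -n plus the p flag
# prints the result only when the pattern actually matched
encoded=$(printf '%s\n' "$url" | sed -nE 's/.*url=(http.*\.html).*/\1/p')

printf '%s\n' "$encoded"
```

Note that `.*` here is greedy, so if the string contained more than one `.html`, the capture would run to the last one. For the sample string there is only one, so the result is the same as with the lazy pattern.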
In Perl:
#!/usr/bin/perl
use strict;
use warnings;
use URI;
my $uri = URI->new("http://www.tticker.com/me0439-119?url=http%3A%2F%2Fwww.mydomains.com.com%2FSA100.html%3Fc%2acn%2CSA400"); # create URI object
my %params = $uri->query_form(); # get all params (values are percent-decoded)
my $param_url = $params{url}; # the decoded value of the "url" parameter
my $uri2 = URI->new($param_url); # create new URI object from param URL
$uri2->query(undef); # strip parameters (everything after the "?")
print $uri2->as_string(), "\n";
gives:
http://www.mydomains.com.com/SA100.html