
Why doesn't this regular expression match?

I have a Perl script for the Squid web proxy:

#!/usr/bin/perl
$|=1;
while (<>) {
    @X = split;
    $x = $X[0];
    $_ = $X[1];
    if (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?\&(itag=22).*?\&(id=[a-zA-Z0-9]*)/) {
        print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "&" . $3 . "\n";
    # youtube Normal screen always HD itag 35, Normal screen never HD itag 34, itag=18 <--normal?
    } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?\&(itag=[0-9]*).*?\&(id=[a-zA-Z0-9]*)/) {
        print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "&" . $3 . "\n";

    } else {
        print $x . $_ . "\n";
    }
}

that I got from http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube. I've tested input such as

http://v24.lscache6.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0hPRVFUTl9FSkNOOV9JTlJF&fexp=905230%2C901013&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&signature=2A5088FD4F64CF9D58A5B798E14452D71B51BAE8.2EABF06D09C8C81650266C5464CF1D0B4D6C25CC&expire=1300190400&key=yt1&ip=0.0.0.0&factor=1.25&id=e838f2cd3549e3cb

in RegexBuddy with Perl syntax, and it matches the second regular expression in the script above. But it doesn't match when I run the script. I'm not a Perl programmer, so what am I doing wrong?


I would recommend splitting the regex into separate variables and then modifying one of them at a time. That way you can find the problem yourself.

I am not sure anyone will bother to debug your program. Example:

 my $part1 = qr/[0-9.]{4}/;
 my $part2 = qr/.*\.youtube\.com/;
 # etc. ... then
 if (m/^http:\/\/($part1|$part2)/) { ... }  # add the remaining parts one at a time
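
A fuller sketch of the same idea, in case it helps: the pattern from the question is cut into three `qr//` pieces and each prefix is tested in turn, so the first piece that stops matching stands out. The helper name `check_parts` and the shortened test URL are assumptions made for illustration; the pieces themselves are copied from the second regular expression in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pieces of the original pattern, compiled separately with qr//.
my $host = qr/(?:[0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)/;
my $itag = qr/&(itag=[0-9]*)/;
my $id   = qr/&(id=[a-zA-Z0-9]*)/;

# Hypothetical helper: report which prefix of the full pattern still matches.
sub check_parts {
    my ($url) = @_;
    return (
        host => ($url =~ m{^http://$host}               ? 1 : 0),
        itag => ($url =~ m{^http://$host.*?$itag}        ? 1 : 0),
        id   => ($url =~ m{^http://$host.*?$itag.*?$id}  ? 1 : 0),
    );
}

# Shortened form of the URL from the question (assumption: same host, itag and id).
my $url = 'http://v24.lscache6.c.youtube.com/videoplayback'
        . '?algorithm=throttle-factor&itag=34&ipbits=0&factor=1.25&id=e838f2cd3549e3cb';

my %result = check_parts($url);
for my $part (qw(host itag id)) {
    print "$part: ", ($result{$part} ? "matches" : "fails"), "\n";
}
```

One thing worth checking, going only by the script as posted: it matches against `$X[1]`, the second whitespace-separated field of each input line. A test line that contains nothing but the URL leaves `$_` undefined, so neither pattern can match and the script falls through to the `else` branch.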


Why not use the URI parser module? Here is a simple example using one. That way you can grab the host out by a simple $uri->host() and check it against your list of hosts. You should also be able to get the itag and id fields too regardless of what order they're in, or if there are other attributes as well, which could break a regex.
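
A minimal sketch of that approach, assuming the CPAN `URI` module is installed (it is not part of core Perl). The helper name `video_key` is hypothetical, and the host suffix list is taken from the regular expression in the question; the numeric-address alternative `[0-9.]{4}` is left out here for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;    # CPAN module, assumed to be installed

# Hypothetical helper: return (itag, id) for a URL on one of the video
# hosts, or an empty list if the URL does not qualify.
sub video_key {
    my ($url) = @_;
    my $uri = URI->new($url);
    return unless $uri->can('host');    # e.g. a relative or mailto: URL
    my $host = $uri->host;
    return unless defined $host
        && $host =~ /(?:\.youtube\.com|\.googlevideo\.com|\.video\.google\.com)$/;

    # query_form returns decoded key/value pairs, whatever their order.
    my %query = $uri->query_form;
    return unless defined $query{itag} && defined $query{id};
    return ($query{itag}, $query{id});
}

# Shortened form of the URL from the question (assumption: same host, itag and id).
my $url = 'http://v24.lscache6.c.youtube.com/videoplayback'
        . '?algorithm=throttle-factor&itag=34&ipbits=0&factor=1.25&id=e838f2cd3549e3cb';

if (my ($itag, $id) = video_key($url)) {
    print "http://video-srv.youtube.com.SQUIDINTERNAL/itag=$itag&id=$id\n";
}
```

Because the query string is parsed into a hash, the result does not depend on `itag` coming before `id`, or on either one being preceded by `&`, both of which the original regular expression requires.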
