PHP Regex on URL - split into variables

I am trying to implement a PHP script which will run on every call to my site, look for a certain URL pattern, then explode the URL and perform a redirect.

Basically I want to run this on a new CMS to catch all incoming links from the old CMS and redirect based on a mapping, say an article ID stripped from the URL to the same article ID imported into the new CMS's DB.

I can do the implementation, the redirect etc, but I am lost on the regex.

I need to catch any occurrences of:

domain.com/content/view/*/34/ or domain.com/content/view/*/30/ (where * is a wildcard) and capture * and the 30 or 34 in a variable which I will then use in a DB query.

If the following is encountered:

domain.com/content/view/*/34/1/*/

I need to capture the first * and the second *.

I'd be very grateful to anyone who can give me a hand with this.


I'm not sure regular expressions are the way to go. I think it would probably be easier to use explode ('/' , $url) and check by looping over that array.

Here are the steps I would follow:

$url = parse_url($url, PHP_URL_PATH);
$url = trim($url, '/');
$parts = explode('/', $url);

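Then you can check whether the path has the expected shape:

(count($parts) >= 4 && $parts[0] == 'content' && $parts[1] == 'view' && in_array($parts[3], array('30', '34')))

The wildcard value is then available in $parts[2]. For the longer URL form, the second wildcard is in $parts[5].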
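The steps above can be pulled together into one function. This is only a minimal sketch: the name parseOldCmsUrl and the array keys are hypothetical, and it assumes the input is either a full URL with a scheme or a bare path such as $_SERVER['REQUEST_URI'].

```php
<?php
// Hypothetical helper (not from the original post): returns the captured
// pieces, or null when the URL does not look like an old-CMS link.
function parseOldCmsUrl(string $url): ?array
{
    // Keep only the path; parse_url() may return null or false here.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    $parts = explode('/', trim($path, '/'));

    // Expect at least content/view/<wildcard>/<30|34>
    if (count($parts) < 4 || $parts[0] !== 'content' || $parts[1] !== 'view') {
        return null;
    }
    if (!in_array($parts[3], ['30', '34'], true)) {
        return null;
    }

    return [
        'first'   => $parts[2],
        'section' => (int) $parts[3],
        // Present only for the longer form content/view/*/34/1/*/
        'second'  => $parts[5] ?? null,
    ];
}
```

A null result means the request is not an old-style link and can be passed through without a redirect.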


It's actually very simple. A more flexible and straightforward approach is to explode() the URL into an array called something like $segments, and then test against that. If you have a very small number of expected URLs, this kind of approach is probably easier to maintain and to read.

I wouldn't recommend doing this in the .htaccess file because of the performance overhead.


First, I would use the PHP function parse_url() to get the path, devoid of any protocol or hostname.

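Once you have that, the following code should get you the info you need.

<?php

// Pick one of the two sample URLs (* stands for the wildcard segment)
$url = 'http://domain.com/content/view/*/34/';        // first example
// $url = 'http://domain.com/content/view/*/34/1/*/'; // second example

// Keep only the path, without the protocol or hostname
$path = parse_url($url, PHP_URL_PATH);

// Match the path against regular expressions.
// Try the longer form first, because the shorter pattern also matches it.
if (preg_match('#content/view/([^/]+)/([0-9]+)/([0-9]+)/([^/]+)#i', $path, $matches)) {
    print_r($matches); // $matches[1] is the first *, $matches[4] is the second *
} elseif (preg_match('#content/view/([^/]+)/([0-9]+)/#i', $path, $matches)) {
    print_r($matches); // $matches[1] is the *, $matches[2] is the number after it
}

?>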

([^/]+) matches any sequence of characters except a forward slash

([0-9]+) matches any sequence of one or more digits

Though you can probably write a single regular expression to match most URL variants, consider using multiple regular expressions to check for different types of URLs. Depending on how much traffic you get, the speed hit won't be all that terrible.
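For comparison, here is a rough sketch of the single-expression variant, using an optional group for the longer URL form. The function name matchOldCmsPath and the array keys are hypothetical, and the pattern assumes that only 30 and 34 are valid in the fourth segment and that the input is already a path starting with a slash.

```php
<?php
// Hypothetical helper (not from the original post): one pattern for both
// content/view/*/34/ and content/view/*/34/1/*/.
function matchOldCmsPath(string $path): ?array
{
    // (?: ... )? makes the trailing "/<digits>/<wildcard>" part optional.
    $pattern = '#^/content/view/([^/]+)/(30|34)(?:/[0-9]+/([^/]+))?/?$#i';

    if (!preg_match($pattern, $path, $m)) {
        return null;
    }

    return [
        'first'   => $m[1],
        'section' => (int) $m[2],
        // Group 3 is absent from $m when the optional part did not match.
        'second'  => $m[3] ?? null,
    ];
}
```

Anchoring with ^ and $ keeps the pattern from matching longer, unrelated paths, which the unanchored patterns would accept.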

Also, I recommend reading Mastering Regular Expressions by Jeffrey Friedl (O'Reilly). A good knowledge of regular expressions will come in handy quite often.

http://www.regular-expressions.info/php.html
