
Clean up URLs with PHP

I am coding a site and have keywords in the URLs like this:


The part that has the title "2010 Federal Spending" is not used for navigation; my site ignores it completely and only pays attention to the 'id', not the 's'. The title is just there for SEO reasons.

Is there a PHP function to clean up this portion of the URL? For example, replace the '%20' with '-' or something similar?

You'll want to look into mod_rewrite in your .htaccess.

Adding a rewrite rule in your .htaccess is simple. First, activate mod_rewrite by adding these lines to your .htaccess:

RewriteEngine on
RewriteBase /

Then add your rule to redirect your pages:

RewriteRule ^([0-9]+)/([^/]+)$ /yourpage.php?id=$1&s=$2 [L]

This will allow you to structure your URLs like this:


Then, on yourpage.php:

echo $_GET['id']; // will equal 115 from the above example
echo $_GET['s']; // will equal 2010-federal-spending from the above example
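To generate links in that shape from PHP, something like the following could work. It is only a sketch: build_page_path() is a made-up helper name, and it assumes the slug has already been cleaned up before it is passed in.

```php
<?php
// Hypothetical helper: builds a path in the "id/title" form that the
// RewriteRule above is written to match.
// Assumes $slug is already cleaned (e.g. "2010-federal-spending").
function build_page_path(int $id, string $slug): string
{
    // rawurlencode() percent-encodes characters that are not safe
    // in a path segment, such as spaces.
    return '/' . $id . '/' . rawurlencode($slug);
}

echo build_page_path(115, '2010-federal-spending'), "\n"; // /115/2010-federal-spending
echo build_page_path(115, '2010 Federal Spending'), "\n"; // /115/2010%20Federal%20Spending
```

The second call shows why it is worth cleaning the title first: an uncleaned title still yields a valid path, but the '%20' sequences come back.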

Use urldecode($your_string) if you need to decode a URL-encoded string (values read from $_GET are already decoded by PHP). Since a space is not a valid URL character, you should replace the spaces in the title before you use it in an address.

$mytitle = "2010 Federal Spending";
$fixedtitle = str_replace(" ", "_", $mytitle);
echo $fixedtitle;

You could also replace other characters that might cause problems, such as "&":

$mytitle = "2010 Federal Spending";
$invchars = array(" ","@",":","/","&");
$fixedtitle = str_replace($invchars, "_", $mytitle);
echo $fixedtitle;
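A list of bad characters has two drawbacks: anything not on the list slips through, and a title like "Taxes & Spending" ends up with several underscores in a row. One possible alternative, sketched below, is to whitelist letters and digits instead. clean_title() is a hypothetical name, and the choice of which characters to keep is an assumption.

```php
<?php
// Hypothetical helper: keeps only ASCII letters and digits and joins
// the remaining words with a single separator.
function clean_title(string $title, string $sep = '_'): string
{
    // Replace each run of characters outside A-Z, a-z, 0-9 with one separator.
    $clean = preg_replace('/[^A-Za-z0-9]+/', $sep, $title);

    // Remove separators left at the start or end.
    return trim($clean, $sep);
}

echo clean_title('2010 Federal Spending'), "\n";  // 2010_Federal_Spending
echo clean_title('Taxes & Spending: 2010'), "\n"; // Taxes_Spending_2010
```

Pass '-' as the second argument if you prefer dashes. Accented letters are treated as separators here, so titles with accents would need transliterating first.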


This is an encoded URL: the space ' ' has been encoded as '%20'. Rather than replacing '%20' directly, decode the title first:

$title = urldecode('2010%20Federal%20Spending');

Now replace the spaces with anything you like, and encode only the title when you rebuild the URL (calling urlencode() on the whole query string would also encode the '?', '=' and '&'):

$newTitle = str_replace(' ', '-', $title);
echo '?s=' . urlencode($newTitle) . '&id=115';
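If you have to process the whole query string rather than a single value, parse_str() and http_build_query() can do the decoding and re-encoding for each parameter. This is a minimal sketch that assumes the query string from the question; the variable names are only illustrative.

```php
<?php
// Sketch: rewrite only the "s" parameter of a query string.
$query = 's=2010%20Federal%20Spending&id=115';

// parse_str() splits the query and URL-decodes each value:
// ['s' => '2010 Federal Spending', 'id' => '115']
parse_str($query, $params);

// Replace the spaces in the title only.
$params['s'] = str_replace(' ', '-', $params['s']);

// http_build_query() re-encodes each value and joins the pairs with "&".
$result = '?' . http_build_query($params);
echo $result; // ?s=2010-Federal-Spending&id=115
```

Because each value is encoded separately, the '?', '=' and '&' that give the query string its structure are left alone.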

You can also use the function described here (in French):

    /**
     * Convert into filename by removing all accents and special characters. Useful for URL Rewriting.
     * @param string $text
     * @return string
     */
    public function ConvertIntoFilename($text)
    {
        // Remove all accents.
        $convertedCharacters = array(
            'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'A', 'Å' => 'A',
            'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'a', 'å' => 'a',
            'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O', 'Ö' => 'O', 'Ø' => 'O',
            'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'o', 'ø' => 'o',
            'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E',
            'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
            'Ç' => 'C', 'ç' => 'c',
            'Ì' => 'I', 'Í' => 'I', 'Î' => 'I', 'Ï' => 'I',
            'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
            'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'U',
            'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
            'ÿ' => 'y',
            'Ñ' => 'N', 'ñ' => 'n'
        );

        $text = strtr($text, $convertedCharacters);

        // Put the text in lowercase.
        $text = mb_strtolower($text, 'utf-8');

        // Replace every character that is not a lowercase letter, digit, or dash with a dash.
        $text = preg_replace('#[^a-z0-9-]#', '-', $text);

        // Collapse consecutive dashes (that's not very pretty).
        $text = preg_replace('/-+/', '-', $text);

        // Remove words of 2 characters or fewer (not significant for the meaning).
        $return = array();
        $words = explode('-', $text);

        foreach ($words as $word) {
            if (mb_strlen($word, 'utf-8') <= 2) {
                continue;
            }
            $return[] = $word;
        }

        return implode('-', $return);
    }

Still, you will need to modify your .htaccess, as mentioned by AlienWebGuy. :)






