PHP base_convert for shortening URLs

I want to make my URLs shorter, similar to TinyURL or any other URL shortening service. I have the following type of links:

localhost/test/link.php?id=1000001
localhost/test/link.php?id=1000002

etc.

The IDs in the above links are the auto-incrementing IDs of rows in the database. The above links are mapped like:

localhost/test/1000001
localhost/test/1000002

Now, instead of using the long IDs above, I would like to shorten them. I have found that I can use the base_convert() function. For example:

print base_convert(100000000, 10, 36);

//output will be "1njchs"

It looks pretty good, but I want to ask whether there is any disadvantage (e.g. slow performance) to using this function, or whether there is a better approach (e.g. writing my own function to generate random ID strings).

Thanks.


The function base_convert is fast enough, but if you want to do better, just store the encoded string inside the database.


With base_convert() you can convert the numeric ID to a shorter code, and with intval() you can turn the code back into the ID used to look up the item in the database.

My code snippet:

$code = base_convert("1000001", 10, 36); // "lflt"
$id = intval($code, 36);                 // 1000001
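To make the round trip a little safer in a real link.php, the two calls can be wrapped with some input checking. This is only a sketch: the function names idToCode() and codeToId() are made up for illustration and do not come from the answer above.

```php
<?php
// Hypothetical helper names; not part of the original answer.
function idToCode(int $id): string
{
    if ($id < 0) {
        throw new InvalidArgumentException("ID must be non-negative");
    }
    // base_convert() expects the number as a string.
    return base_convert((string) $id, 10, 36);
}

function codeToId(string $code): int
{
    // Accept only 0-9 and a-z so junk from the URL never reaches the database.
    if (!preg_match('/\A[0-9a-z]+\z/', $code)) {
        throw new InvalidArgumentException("Invalid short code");
    }
    // intval() with base 36 reverses the encoding.
    return intval($code, 36);
}

echo idToCode(1000001), "\n"; // lflt
echo codeToId("lflt"), "\n";  // 1000001
```

Note that intval() silently clamps values that overflow the integer range, so very long codes should be rejected before they get this far.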


Unfortunately, I was unsatisfied with the answers here and elsewhere, as base_convert() and other floating-point-based conversion strategies lose an unacceptable amount of precision for cryptographic purposes. Furthermore, most of these implementations cannot handle numbers large enough for cryptographic applications. The following provides two methods of base conversion that should be safe for large numbers and large bases, for example converting a base256 value (a binary string) to a base85 representation and back again.
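The precision problem is easy to reproduce. The snippet below is a small illustration, using an arbitrary 30-digit number chosen only for this example: it round-trips the number through base 36 with base_convert() and compares the result with the input.

```php
<?php
// An arbitrary 30-digit decimal number, far beyond PHP_INT_MAX on 64-bit builds.
$big = "123456789012345678901234567890";

// Round-trip: base 10 -> base 36 -> base 10.
$roundTrip = base_convert(base_convert($big, 10, 36), 36, 10);

// The low-order digits are lost because the value passes through a float.
var_dump($roundTrip === $big); // bool(false)

// Values inside the integer range, such as the IDs in the question, survive.
$id = "1000001";
var_dump(base_convert(base_convert($id, 10, 36), 36, 10) === $id); // bool(true)
```

So for auto-increment IDs base_convert() is fine; the concern only applies to numbers too large for a native integer.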

Using GMP

You can use GMP to accomplish this at the cost of converting bin<->hex two unneeded times as well as being limited to base62.

<?php
// Not bits, bytes.
$data = openssl_random_pseudo_bytes(256);

$base62 = gmp_strval( gmp_init( bin2hex($data), 16), 62 );
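// gmp_strval() drops leading zeros, so pad the hex back to the original
// byte length; otherwise hex2bin() can fail on an odd-length string.
$hex = gmp_strval( gmp_init($base62, 62), 16 );
$decoded = hex2bin( str_pad($hex, 2 * strlen($data), "0", STR_PAD_LEFT) );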

var_dump( strcmp($decoded, $data) === 0 ); // true

Pure PHP

If you would like to move beyond base62 to base85, or want a slight performance improvement, you will need something like the following.

<?php

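/**
* Divide a large number, stored as a string of base-$base digits (one digit
* per byte), by $divisor in place and return the remainder.
*
* @param string &$binary Digits, most significant first; overwritten with the quotient.
* @param int $base Base of the digits stored in $binary.
* @param int $divisor Number to divide by.
* @param int $start Offset of the first digit to process.
* @return int The remainder of the division.
*/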
function divmod(&$binary, $base, $divisor, $start = 0)
{
    /** @var int $size */
    $size = strlen($binary);

    // Do long division from most to least significant byte, keep remainder.
    $remainder = 0;
    for ($i = $start; $i < $size; $i++) {
        // Get the byte value, 0-255 inclusive.
        $digit = ord($binary[$i]);

        // Multiply the running remainder by the source base and add the current digit.
        $temp = ($remainder * $base) + $digit;

        // Store the quotient digit for this position (integer division, no float).
        $binary[$i] = chr(intdiv($temp, $divisor));

        // Carry the remainder to the next byte.
        $remainder = $temp % $divisor;
    }

    return $remainder;
}

/**
* Produce a base62 encoded string from a large binary number.
* 
* @param string $binary
* @return string
*/
function encodeBase62($binary)
{
    $charMap = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    $base = strlen($charMap);

    $size = strlen($binary);
    $start = $size - strlen(ltrim($binary, "\0"));

    $encoded = "";
    for ($i = $start; $i < $size; ) {
        // Do long division from most to least significant byte, keep remainder.
        $idx = divmod($binary, 256, $base, $i);

        $encoded = $charMap[$idx] . $encoded;

        if (ord($binary[$i]) == 0) {
            $i++; // Skip leading zeros produced by the long division.
        }
    }

    // Emit one "0" per leading zero byte so the original length can be restored.
    $encoded = str_repeat("0", $start) . $encoded;

    return $encoded;
}

/**
* Produce a large binary number from a base62 encoded string.
* 
* @param string $ascii
* @return string
*/
function decodeBase62($ascii)
{
    $charMap = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    $base = strlen($charMap);

    $size = strlen($ascii);
    $start = $size - strlen(ltrim($ascii, "0"));

    // Convert the ascii representation to binary string.
    $binary = "";
    for ($i = $start; $i < $size; $i++) {
        $byte = strpos($charMap, $ascii[$i]);
        if ($byte === false) {
            throw new OutOfBoundsException("Invalid character '{$ascii[$i]}' at offset {$i}");
        }

        $binary .= chr($byte);
    }

    $decode = "";
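    // $binary is shorter than $ascii when leading "0" characters were skipped.
    for ($i = 0, $size = strlen($binary); $i < $size; ) {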
        // Do long division from most to least significant byte, keep remainder.
        $idx = divmod($binary, $base, 256, $i);

        $decode = chr($idx) . $decode;

        if (ord($binary[$i]) == 0) {
            $i++; // Skip leading zeros produced by the long division.
        }
    }

    $decode = ltrim($decode, "\0");
    // Restore one zero byte per leading "0" character in the input.
    $decode = str_repeat("\0", $start) . $decode;

    return $decode;
}

// Not bits, bytes.
$data = openssl_random_pseudo_bytes(256);

$base62 = encodeBase62($data);
$decoded = decodeBase62($base62);

var_dump( strcmp($decoded, $data) === 0 ); // true
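For the auto-increment IDs in the original question, the same idea reduces to ordinary integer arithmetic, and base62 gives slightly shorter codes than base36. The sketch below reuses the alphabet from encodeBase62(); the names BASE62_ALPHABET, intToBase62() and base62ToInt() are hypothetical, and there is no overflow handling for codes that decode to more than PHP_INT_MAX.

```php
<?php
// Same alphabet as encodeBase62(); the constant name is made up for this sketch.
const BASE62_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

function intToBase62(int $id): string
{
    if ($id < 0) {
        throw new InvalidArgumentException("ID must be non-negative");
    }
    $code = "";
    do {
        // Peel off the least significant base62 digit and prepend it.
        $code = BASE62_ALPHABET[$id % 62] . $code;
        $id = intdiv($id, 62);
    } while ($id > 0);
    return $code;
}

function base62ToInt(string $code): int
{
    if ($code === "") {
        throw new InvalidArgumentException("Empty code");
    }
    $id = 0;
    for ($i = 0, $n = strlen($code); $i < $n; $i++) {
        $digit = strpos(BASE62_ALPHABET, $code[$i]);
        if ($digit === false) {
            throw new InvalidArgumentException("Invalid character at offset {$i}");
        }
        // Horner's rule: shift the accumulated value by one base62 digit.
        $id = $id * 62 + $digit;
    }
    return $id;
}

echo intToBase62(1000001), "\n"; // 4c93
echo base62ToInt("4c93"), "\n";  // 1000001
```

Keep in mind that base62 codes are case-sensitive, so the URL routing and any database collation used for lookups must preserve case.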