Faster and, if possible, less memory-hungry method to generate an alphanumeric code?

I'm trying to make a really simple short URL redirect without a database.

Here is what I have so far:

<?php
$name = $_GET['file'];
$name = preg_replace("/[^A-Za-z0-9]/", '', $name);
$file = 'data/' . $name;

// File found
if (is_file($file))
{
    // Read the first line, we don't use file_get_contents as the data folder is protected and must be read internally
    $f = fopen($file, 'r');
    $data = rtrim((string) fgets($f)); // strip the trailing newline, which header() would reject
    fclose($f);

    // Redirect to the real URL
    header("Location: $data");
}
else
{
    // What a shame the URL does not exist
    header("Location: http://www.mydomain.com/");
}

exit();
?>
I would like to know the fastest and, if possible, least memory-intensive way to generate an alphanumeric code of 6 to 8 characters that does not collide with the existing ones in the data folder.


Do you also need to be able to look up the short code for any given URL? Simply counting up would be enough to generate unique file names, but it is not repeatable: if the same URL were submitted several times, it would get a different key each time.

If this is acceptable then I'd just suggest a counter, possibly in base 36 (case-insensitive alphanumeric) or similar, to make the most of the key space. You could have one file that contains the current count (it could also be kept in memory, but would then need to be reloaded on restart). You also have to be careful about concurrent access, so that two requests don't both read the same next value at the same time.
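A minimal sketch of the counter-file idea, assuming a hypothetical helper name (next_code) and a counter file path of your choosing. It uses flock() to serialize access so two requests cannot read the same value, and base_convert() to produce the base-36 form. Treat it as an illustration rather than a hardened implementation.

```php
<?php
// Hypothetical helper: returns the next counter value as a base-36 string.
// $counterFile is an assumed path; the file holds the last issued integer.
function next_code(string $counterFile): string
{
    // 'c+' opens for read/write, creates the file if missing, and does not truncate it.
    $fh = fopen($counterFile, 'c+');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $counterFile");
    }
    try {
        // Exclusive lock: a second request blocks here until the first one is done.
        if (!flock($fh, LOCK_EX)) {
            throw new RuntimeException("Cannot lock $counterFile");
        }
        $current = (int) stream_get_contents($fh); // an empty file counts as 0
        $next = $current + 1;

        // Overwrite the stored count with the new value.
        rewind($fh);
        ftruncate($fh, 0);
        fwrite($fh, (string) $next);
        fflush($fh);
        flock($fh, LOCK_UN);
    } finally {
        fclose($fh);
    }
    // base_convert() supports bases 2 to 36, giving the digits 0-9 and a-z.
    return base_convert((string) $next, 10, 36);
}
```

Each call returns a new code ("1", "2", ... "z", "10", ...), which can then be used as the file name in the data folder.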

If you need a given URL to consistently get the same ID, you could have a second directory storing files named after the URL (escaped as appropriate), each containing the key that was generated for it the first time. Before generating a new key, check this directory to see whether the URL already has one, and return that key if it does.
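The second-directory lookup could be sketched roughly like this. The function name, the $indexDir parameter and the $generate callback are all hypothetical, and rawurlencode() is just one possible way of escaping the URL into a file name.

```php
<?php
// Hypothetical helper: return the stored code for $url, or generate and store a new one.
// $indexDir is an assumed directory holding one file per URL; $generate returns a fresh code.
function code_for_url(string $indexDir, string $url, callable $generate): string
{
    // rawurlencode() escapes "/" and other characters that are unsafe in file names.
    // Caveat: very long URLs can exceed the file-name length limit of the file system.
    $path = $indexDir . '/' . rawurlencode($url);

    if (is_file($path)) {
        return file_get_contents($path); // URL seen before: reuse its code
    }

    // Not atomic: two simultaneous requests for the same new URL could both get here.
    $code = $generate();
    file_put_contents($path, $code, LOCK_EX);
    return $code;
}
```

The callback is only invoked for URLs that have no entry yet, so no counter value is spent on a URL that is already known.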

As you can see, this crudely replicates the way a database would work, with the two directories acting as indexes on a table of URL and key.

The only other way I can think of would be a one-to-one function that is guaranteed, for the input you are dealing with, to generate a string under a certain length. I can't think where you would find such a function. Compression algorithms are the closest thing, but their output is unlikely to suit your need, since the compressed binary will probably be as big as the original string once it has been base64-encoded or similar.

A hashing function as suggested by fardjad will probably be all right, but there is no way to go from a hashed value back to the URL, and there is no guarantee that two different inputs will produce different hashes (though the chances of a collision are extremely small).

I suspect fardjad's solution will be as good as you need in practice but it depends how robust this needs to be.

Finally, I should note that I have never written or looked much into URL shortening services, so none of this is expert advice, just thoughts on how I would do it without having done any research. :)


As I see it, you want to generate an alphanumeric code for each new file added to the data folder, and the content of each file is the URL you want to redirect to.

The method you're using looks good to me. Just a few suggestions:

You could use the MD5 hash of $name to name the files in the data folder, so you won't need to strip non-alphanumeric characters with this line:

preg_replace("/[^A-Za-z0-9]/", '', $name);

Just calculate the hash to get the file name instead:

$file_name = md5($name);

The file names will also be unique this way, barring the extremely unlikely case of a hash collision.
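Note that md5() returns 32 hexadecimal characters, which is longer than the 6 to 8 characters asked for in the question. One possible adaptation, which is my assumption and not part of the suggestion above, is to truncate the hash and check the data folder for collisions. The function name, the $dataDir parameter and the salting scheme are hypothetical.

```php
<?php
// Hypothetical helper: derive a short code from a URL by truncating its MD5 hash.
// $dataDir and the default length of 8 are assumptions based on the question.
function hashed_code(string $dataDir, string $url, int $length = 8): string
{
    $salt = 0;
    while (true) {
        // md5() returns 32 hex characters; keep only the first $length of them.
        $input = $salt === 0 ? $url : $url . '#' . $salt;
        $code = substr(md5($input), 0, $length);
        $path = $dataDir . '/' . $code;

        if (!is_file($path)) {
            file_put_contents($path, $url, LOCK_EX); // free slot: claim it
            return $code;
        }
        if (file_get_contents($path) === $url) {
            return $code; // same URL stored earlier: reuse its code
        }
        $salt++; // a different URL already owns this code: retry with a salt
    }
}
```

Truncating makes collisions far more likely than with the full hash, which is why the sketch checks the existing file before reusing a code.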

Another suggestion is to use an XML file to store the redirects if you really don't want to use a database. This can be done easily with SimpleXML (take a look at the examples in its documentation).
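As a rough illustration of the XML idea, the lookup might look like this. The element and attribute names (redirects, redirect, code) are made up for the example, and in practice the document would be loaded from a file with simplexml_load_file() instead of a string.

```php
<?php
// Hypothetical helper: find the target URL for $code in an XML document of redirects.
// Assumed layout: <redirects><redirect code="abc123">http://...</redirect></redirects>
function lookup_redirect(string $xml, string $code): ?string
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return null; // malformed XML
    }
    foreach ($doc->redirect as $entry) {
        if ((string) $entry['code'] === $code) {
            return (string) $entry; // element text is the target URL
        }
    }
    return null; // unknown code
}
```

Keep in mind that the whole file is parsed on every request, so this is likely to cost more memory than reading one small file per code.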


If I'm correct, your pasted code is the URL redirect logic, not the file name generation, right? I suggest you use a single-threaded process (for example, a node.js server) to generate and maintain a max_number value.

Each time you need a new file name, just send a request to that server. The server increments the max_number and returns its current value. Then in your PHP code, convert this integer to a string consisting of alphanumeric characters. The PHP gmp_strval function can do this job by converting a number to base-62 form.

This approach is safe because it guarantees uniqueness in a simple way. I would guess it is a common approach among public URL shortener services, since I have noticed that their codes increment sequentially.

Of course, gmp_strval can easily be reimplemented in your own code if the GMP extension is not available on your machine. Some examples are in the question "How to convert an integer in any base to a string?".
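For instance, a pure-PHP fallback might look like the sketch below. The function name to_base62 is hypothetical, and the alphabet order (digits, then upper case, then lower case) is chosen with the intention of matching what GMP uses for base 62.

```php
<?php
// Hypothetical pure-PHP stand-in for gmp_strval($n, 62).
// Alphabet order: digits, upper-case letters, lower-case letters.
function to_base62(int $n): string
{
    $alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    if ($n < 0) {
        throw new InvalidArgumentException('Expected a non-negative integer');
    }
    if ($n === 0) {
        return '0';
    }
    $out = '';
    while ($n > 0) {
        $out = $alphabet[$n % 62] . $out; // prepend the least significant digit
        $n = intdiv($n, 62);
    }
    return $out;
}
```

With this, to_base62(62) gives "10" and to_base62(916132832) gives "100000".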

Shorter is better for this kind of service. But if you do want 6-8 characters, just start the counter at the base-62 string "100000" (916132832 in decimal form).
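The arithmetic behind that starting value, as a quick check: a base-62 string of k characters covers the values from 62^(k-1) to 62^k - 1. The variable names below are only for illustration, and the upper bound assumes 64-bit PHP integers.

```php
<?php
// Counter range whose base-62 representation is 6 to 8 characters long.
$first = 62 ** 5;     // 916132832, the smallest 6-character value ("100000")
$last  = 62 ** 8 - 1; // 218340105584895, the largest 8-character value
```

That leaves roughly 2.18 * 10^14 codes before a ninth character is needed.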
