Logical thing to do about checking for unique id?
I have a randomly generated link id which is 8 digits long with 50 values per digit, which means 4x10^13 possible combinations (i think). I have about ten-thousands of queries per day.
My question is, should I check 4 tables each query for duplicates, or skip it? or make it 10 digits so that it definitely won't be a match?
edit:
my (probably copied) generator
// START Generates Random String
function genRandString($len=8){
$base='ABCDEFGHKLMNPQRSTWXYZabcdefghjkmnpqrstwxyz23456789';
$max=strlen($base)-1;
$randstring1 ='';
mt_srand((double)microtime()*1000000);
while (strlen($randstring1)<$len+1)
$randstring1.=$base{mt_rand(0,$max)};
return $randstring开发者_C百科1;
}
// END Generates Random String
It depends on the quality of the pseudorandom number generator. You might have insufficient entropy so you're more likely to get a collision that you realize.
Is there a reason you aren't using UUID()
? It seems like the best solution that is designed for this purpose.
Anyway, I don't recommend checking for duplicates before you insert. This is subject to race conditions, that is, someone could insert the duplicate value in the moment after you check, but before you insert. So you'd have to handle duplicate key violation exceptions anyway. Better to just try the insert (without checking first) and handle exceptions as needed.
Re your comments and your algorithm: I would not use that hashing scheme. You have fewer than 24 bits of information in four digits of 50 distinct values. So your chance of collision is pretty significant once you have a few thousand rows in your database.
How about this solution: Use a monotonically increasing primary key value, e.g AUTO_INCREMENT. To change this number into an alphanumeric string, use base_convert()
:
$id = 12345678;
$str = base_convert($id, 10, 36);
echo "$str\n";
The result is 7clzi
.
If you are worried about confusion in letters like 1, l, i, 0, o, you can do some custom substitutions:
$from = array('1', 'l', 'i', '0', 'o');
$to = array('A', 'B', 'C', 'D', 'E');
$str = str_replace($from, $to, $str);
Now the value 12345678
is converted to 7cBzC
. When someone requests a page by this code, do the conversion in reverse:
$code = str_replace($to, $from, $code);
$id = base_convert($code, 36, 10);
prefix it with a timestamp and skip checking it
精彩评论