开发者

TSQL Generate 5 character length string, all digits [0-9] that doesn't already exist in database

Wh开发者_JAVA百科at's the best way to do this?

I need to generate a 5 digit length string where all the characters are numeric. However, I need to be able to do this 'x' amount of times (user variable) and store this random strings in a database. Furthermore, I can't generate the same string twice. Old strings will be removed after 6 months.

Pseudo-code

DECLARE @intIterator INT,
 @intMax

SET @intIterator = 1
SET @intMax = 5 (number of strings to generate)

WHILE @intIterator <= @intMax
 BEGIN

  -- GENERATE RANDOM STRING OF 5 NUMERIC DIGITS 
   ???

  -- INSERT INTO DB IF DOESN'T ALREADY EXIST 
  INSERT INTO TSTRINGS
  SELECT @RANDOMSTRING

  IF @@ERROR = 0
    SET @intIterator = @intIterator + 1

 END

I know this probably isn't the best way to do it, so advice is appreciated. But really looking for ideas on how to generate the numeric 5 length strings.


The "obvious" way can be described as "key = random; while (key already selected) { key = random }". It works, but the birthday paradox implies our odds of key collision increase at an alarming exponential rate in proportion to the number of keys already used. So, selecting a random key takes on average exponentially longer with each new key, and is very likely to get trapped in an infinite or arbitrarily long loop eventually.

You're much better off generating your list of keys up front as follows:

  • Hold a table UniqueKeys containing all the precomputed strings '00000' .. '99999' in addition to a keyOrder field which is always initialized to newId() on insert. keyOrder should be indexed.

  • When you need to "generate" a string, you can SELECT TOP 1 key FROM UniqueKeys ORDER BY keyOrder, which will pull the next available key in nearly constant time. Now that you have a key, you can delete it from UniqueKeys to prevent it from being reused.

  • Every six months, truncate and regenerate your UniqueKeys table.

Advantage of this style is the relatively straightforward implementation, nearly constant time to generate the next key, and avoiding the nasty "check if exists in a loop" scenario described above.


All in one. This should find the @intMax remaining values if you have (100000 - @intMax) rows already with just @intMax permutations left

INSERT TOP (@intMax) MyTable (RndColumn)
SELECT
    RndValue
FROM
    (
    SELECT DISTINCT TOP 100000 -- covers potential range from 00000 to 99999
        RIGHT('00000' + CAST(ABS(CHECKSUM(NEWID())) AS varchar(10)), 5) AS RndValue
    FROM
        sys.columns c1, sys.columns c2
    ) foo
WHERE
    NOT EXISTS (SELECT *
        FROM
            MyTable T
        WHERE
            T.RndColumn = foo.RndValue


Do you need the logic to check if the number exists?

You could use the following to generate your random number:

CREATE FUNCTION RandNumber()
RETURNS float
AS
  BEGIN
  RETURN (SELECT RandNumber FROM vRandNumber)
END

CREATE FUNCTION RandNumber2(@Min int, @Max int)
RETURNS float
AS
 BEGIN
 RETURN @Min + (select RandNumber from RetRandNumber) * (@Max-@Min)
END

Then just call RandNumber in your select.

Here's the site that I found with that script: Here


Something like this?

CREATE FUNCTION RandNumber2(@Min int, @Max int)
RETURNS float
AS
BEGIN
  DECLARE @TheNumber INT

  SET @TheNumber = (SELECT CONVERT(INT, Rand()*(@Max-@Min)+@Min))

  WHILE (SELECT COUNT(IndexColumn) WHERE CONVERT(INT, IndexColumn) = @TheNumber) > 0
  BEGIN
    -- Do it again - we have a collision
    SET @TheNumber = (SELECT CONVERT(INT, Rand()*(@Max-@Min)+@Min))
  END

  DECLARE @Result VARCHAR(5)
  SET @Result = RIGHT('00000' + CONVERT(VARCHAR(5), @TheNumber), 5)

  RETURN @Result

END


One way to generate such a string is:

DECLARE @Foo char(5)
SET @Foo = right(str((checksum(newid()) + 100000), 11, 0), 5)
PRINT @Foo

As for uniqueness, you'd have to build a loop around the table holding the (indexed!) pre-existing values, only exiting the loop when a "new" id is generated. You could hit concurrency problems if two separate processes somehow generate the same value, where the first doesn't enter it in the table before the second checks for existance... but a lot depends on when and how this value is actually used.


Here's a set-based approach, using SQL 2005 syntax (would be a little easier with SQL 2008, but you didn't specify). Also, if you had a numbers table, a large chunk of it can be cut out.

No looping, no duplicates, and should be nearly instantaneous (assuming the target column is indexed).

DECLARE @intMax integer
SET @intMax = 5

INSERT INTO TSTRINGS
SELECT q.nString
    FROM ( SELECT ROW_NUMBER() OVER (ORDER BY (newID())) AS N2,
             RIGHT(REPLICATE('0', 5) + CONVERT(varchar(5), N), 5) as nString
           FROM   --the subquery below could be replaced by a numbers table
                  (SELECT TOP 100000
                  ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID))-1 AS N
            FROM Master.sys.columns ac1
                 CROSS JOIN Master.sys.columns ac2
                 CROSS JOIN Master.sys.columns ac3) numbers 

          WHERE RIGHT(REPLICATE('0', 5) + CONVERT(varchar(5), N), 5)
            NOT IN (SELECT nString FROM TSTRINGS) --check to see if reused
    ) q
    WHERE q.N2 <= @intMax
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜