How many times can you randomly generate a GUID before you risk duplicates? (.NET)
Mathematically I suppose it's possible that even two random GUIDs generated using the built in method in the .NET framework are identical, but roughly how likely are they to clash if you generate hundreds or thousands?
If you generated one for every copy of Windows in the world, would they clash?
The reason I ask is that I have a program that creates a lot of objects, and destroys some too, and I am wondering about the likelihood of any of those objects (including the destroyed ones) having identical GUIDs.
There are ~3E38 possible GUID values, but the Birthday Paradox means the odds of a duplicate reach 50/50 after roughly 1E19 GUIDs have been generated. That is still an enormous number: your machine is far more likely to be destroyed by a meteor impact first. Time-based (version 1) GUIDs additionally use the system clock so that values generated on the same machine do not repeat.
Many large and mission-critical database apps use a GUID as the primary key in a table. Don't hesitate to follow their lead.
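The birthday estimate above can be checked with a short calculation. This is a minimal sketch in Python, assuming every GUID is drawn uniformly at random from a space of 2^bits values; the function names are made up for illustration, and a real generator may use fewer random bits (random, version 4 GUIDs have 122), which lowers the threshold somewhat.

```python
import math


def collision_probability(n: int, bits: int = 128) -> float:
    """Approximate chance that n uniformly random `bits`-bit values contain a duplicate."""
    space = 2.0 ** bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n * (n - 1)) / (2.0 * space))


def count_for_probability(p: float, bits: int = 128) -> float:
    """Approximate number of values needed before the collision chance reaches p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 1_000_000_000):
        print(f"{n:>13,} GUIDs -> collision probability ~ {collision_probability(n):.3e}")
    print(f"50% threshold, 128 bits: ~{count_for_probability(0.5):.3e} GUIDs")
    print(f"50% threshold, 122 bits: ~{count_for_probability(0.5, bits=122):.3e} GUIDs")
```

Under this uniform-randomness assumption, generating hundreds or thousands of GUIDs leaves the collision probability vanishingly small.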
A GUID has components based on
Time (System clock)
Space (System MAC address)
Random numbers
So if one is generated for each machine in the world at the same time, they will differ by their MAC addresses and random numbers.
Here's a helpful link. http://blogs.msdn.com/oldnewthing/archive/2008/06/27/8659071.aspx
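To make the time and space components concrete, here is a small sketch using Python's standard `uuid` module, which can build time-based (version 1) identifiers. It only illustrates the layout described above; it is not the .NET generator, whose output may be built differently. The node value is a made-up stand-in for a MAC address.

```python
import uuid

# Hypothetical 48-bit node value standing in for a network card's MAC address.
HYPOTHETICAL_NODE = 0x0242AC110002


def describe_v1(u: uuid.UUID) -> dict:
    """Split a version 1 UUID into its time, clock-sequence and node fields."""
    return {
        "version": u.version,
        "timestamp": u.time,        # 60-bit count of 100 ns intervals since 1582-10-15
        "clock_seq": u.clock_seq,   # 14-bit sequence guarding against clock rollback
        "node": f"{u.node:012x}",   # 48-bit node, normally the MAC address
    }


# Two identifiers from the same "machine": same node, different timestamps.
a = uuid.uuid1(node=HYPOTHETICAL_NODE)
b = uuid.uuid1(node=HYPOTHETICAL_NODE)

if __name__ == "__main__":
    print(a, describe_v1(a))
    print(b, describe_v1(b))
```

Two machines with different node values would produce different identifiers even with identical timestamps, which is the point made above.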
It is hard to calculate the chances without knowing the inner details of the GUID generator's implementation.
You can use combinatorics to get the numbers, but that only helps if all combinations are equally likely. Without statistical knowledge of the implementation, it is hard to tell the real chances.
As opposed to what Midhat implies (if I understood him correctly), GUID collisions are possible. Built-in random number generators are usually implemented using a timestamp-based seed. MAC addresses are not unique by nature, as they can be overwritten in many situations (and they are, at least in some cases I know of). It is possible that two GUID generators will receive the same input and thus yield the same output.
GUIDs are 128 bits long, so "there is enough for everyone to use", but that does not guarantee that collisions won't occur.
Having spent the past 25 years working with RPC and COM (where GUIDs and UUIDs are critical) and with distributed databases where GUIDs are used as unique row identifiers, I have never encountered a collision problem, whether they were generated on a single machine or on different machines. Another interesting take on this from MSDN, where as row IDs they are much longer-lived than as objects: http://weblogs.asp.net/wwright/archive/2007/11/04/the-gospel-of-the-guid-and-why-it-matters.aspx
This isn't something you should be at all concerned with. It's just the availability heuristic at work. It's a "risk" that you know about and recognize, so you want to care about it. But there are many other risks millions of times more likely, that we still don't worry about. The wonderful Pro Git book says it best, I think:
A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night.
You would have to be generating millions or billions for it to even be a remote possibility.
It would take a really long time!
Just to add to Midhat's correct answer, here is a quotation from Eric Lippert's blog about the situation where no network card is installed in the system (and therefore there is no MAC address):
(Machines that do not have network cards generate special GUIDs which are in a "known to be potentially not unique" range.)