UUID primary keys and Memcached
With Data being cached so often now and the database is only being accessed when there is new data (and then that data is cached lol) is there even a real performance difference for using Int primary keys vs UUID primary keys.
For example, lets assume im building NetFlix. A new movie gets added to the database, and the movie listing along with associated data is put into a cache.
开发者_Python百科The user search's for a movie (a search server handles this), then finds a listing, clicks on it and the data is retrieved from the cache.
In this entire process the database is never read.
What are your thoughts?
I am the architect a major site similar to Netflix and you are for the most part correct, almost all non-transactional data is cached so optimizing databases ad nauseum does not always pay off. All of our movie titles are pre-loaded into memcached by a recurring task, so for the library portion of the system, the database is never hit by an actual customer.
Still, we don't slouch when designing the database structure and queries because we want the preloader to run as fast and efficiently as possible.
I favor using UUID (GuidCombs actually) for Primary Keys. True it does bloat the indicies some but with 64bit RDBMS everywhere and memory being pretty cheap, I consider the advantages to far outweigh the drawbacks. Not having to wait until you insert to know what your PK will be is my favorite.
I support Chris's answer, but I also want to point out that if attempt to load a LOT of keys into memory at once, then you will use a lot of ram.
Compare:
6ba7b810-9dad-11d1-80b4-00c04fd430c8 - 37 bytes, or 38 if \0 terminated
Where as a 64 bit integer is only 8 bytes. And possibly can be stored in a single register.
To take this to the next level.
Lets say you want to load 100,000 ids into ram.
That is going to be 800,000 bytes (64 bit ints), or 3,800,000 bytes!
Update: Oct 8th, 2010.
Also, validating a UUID string is a little more difficult, you must use a regex.
However, validating an integer is simple. intval() php, or .to_i ruby, and int() for perl.
This improves security with respect to other people sending you suspect data (web bots)
精彩评论