Is encoding an entity type in the top few bits of a 64-bit identifier better than a GUID?
WHY
- I would like the ability to have unique identifiers across all entities in my system.
- Being able to identify the owner "type" from the id would be great, even better if I don't have to hit the db.
GUID
DOWNS:
- Can't tell the type without querying the db.
HIDING TYPE IN THE TOP X BITS
Let's face it, 64 bits will never run out. The idea is to use the top X bits to encode the "type" of an entity. This does mean that the ids will be exceptionally large numbers.
NOTES
- The top bits are masked and cleaned to get the true id for db stuff.
UPS
- Can easily determine the type of an id by examining the right bits.
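
A minimal sketch of the top-bits scheme, assuming X = 5 and unsigned 64-bit ids. The names `TYPE_BITS`, `encode_top` and `decode_top` are made up for illustration and are not from any library:

```python
ID_BITS = 64
TYPE_BITS = 5                      # assumed value for X
ID_SHIFT = ID_BITS - TYPE_BITS     # the type lives above this bit position
ID_MASK = (1 << ID_SHIFT) - 1      # keeps only the true-id bits


def encode_top(type_code: int, true_id: int) -> int:
    """Pack the type into the top TYPE_BITS bits of a 64-bit id."""
    if not 0 <= type_code < (1 << TYPE_BITS):
        raise ValueError("type_code does not fit in TYPE_BITS")
    if not 0 <= true_id <= ID_MASK:
        raise ValueError("true_id does not fit in the remaining bits")
    return (type_code << ID_SHIFT) | true_id


def decode_top(encoded: int) -> tuple[int, int]:
    """Return (type_code, true_id); no db lookup needed."""
    return encoded >> ID_SHIFT, encoded & ID_MASK
```

With these assumptions, `encode_top(2, 3)` is 1152921504606846979, which shows how large the ids get, and `decode_top` of that value gives back `(2, 3)`.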
ENCODING TYPE IN BOTTOM X BITS
Rather than encode in the top few bits, shift the true id up and encode the type in the bottom bits. If I reserve X bits for "types", then a true id of 3 with a type of 5 would turn out to be (3 << X) | 5. With X = 5 that is 101.
NOTES
- Just like the other encoding method described above, the type bits are stripped (here by shifting them out) when doing db stuff.
DOWNS
- If I need more than X bits for types at some later stage, then I will need to encode the remaining bits in the top end of the 64-bit value.
UPS
- No need to query db to determine "type" of entity from id.
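
A minimal sketch of the bottom-bits scheme, again assuming X = 5. The names `encode_bottom` and `decode_bottom` are hypothetical:

```python
ID_BITS = 64
TYPE_BITS = 5                       # assumed value for X
TYPE_MASK = (1 << TYPE_BITS) - 1    # selects the low type bits
MAX_TRUE_ID = (1 << (ID_BITS - TYPE_BITS)) - 1


def encode_bottom(type_code: int, true_id: int) -> int:
    """Shift the true id up and store the type in the low TYPE_BITS bits."""
    if not 0 <= type_code <= TYPE_MASK:
        raise ValueError("type_code does not fit in TYPE_BITS")
    if not 0 <= true_id <= MAX_TRUE_ID:
        raise ValueError("true_id does not fit in the remaining bits")
    return (true_id << TYPE_BITS) | type_code


def decode_bottom(encoded: int) -> tuple[int, int]:
    """Return (type_code, true_id); the shift recovers the id used in the db."""
    return encoded & TYPE_MASK, encoded >> TYPE_BITS
```

Here `encode_bottom(5, 3)` is 101, so small true ids stay small, at the cost of the layout being harder to widen later.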
OPINIONS
Which option is best? I personally like the last option: picking a good size for X leaves enough room for expansion without overly large ids.
I've used this technique, and I found it useful to use the top bits rather than the bottom bits. This is because if you later decide to expand the number of type bits to allow for more types, the chances are good that the newly claimed bit is not yet used by any existing id.
Also, for a small number of types, it's possible to get accustomed to the ranges used for each type, and after a while you just know the type by looking at the id, which can be useful when debugging!
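
One way to read that claim, as a rough sketch: widen the type field from 5 to 6 bits and look at what happens to an id that was issued under the old layout. The helpers `split_top` and `split_bottom` are hypothetical, and the 5-to-6 widening is just an assumed example:

```python
ID_BITS = 64


def split_top(encoded: int, type_bits: int) -> tuple[int, int]:
    """Read (type_code, true_id) with the type in the top type_bits bits."""
    shift = ID_BITS - type_bits
    return encoded >> shift, encoded & ((1 << shift) - 1)


def split_bottom(encoded: int, type_bits: int) -> tuple[int, int]:
    """Read (type_code, true_id) with the type in the bottom type_bits bits."""
    return encoded & ((1 << type_bits) - 1), encoded >> type_bits


# The same entity (type 2, true id 3) issued under the old 5-bit layouts.
old_top = (2 << 59) | 3
old_bottom = (3 << 5) | 2

# Re-read both ids after widening the type field to 6 bits.
widened_top = split_top(old_top, 6)           # (4, 3): true id intact
widened_bottom = split_bottom(old_bottom, 6)  # (34, 1): both fields garbled
```

In the top-bits case the true id survives because bit 58 was never set by a small id, and the old type code simply reads as twice its old value. In the bottom-bits case the new type bit overlaps the lowest bit of every existing true id, so old ids would have to be rewritten.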
I've used something similar to this before, but with a slight variant: for production code, I just generate a random GUID as normal. Aside from anything else, this makes it really simple to write that production code and be confident that the randomness hasn't been compromised.
However, for test data, I specified GUIDs explicitly and baked that knowledge into the test code infrastructure too. Each entity GUID was basically a combination of the "type number" and a counter for how many entities had been created. It made the test data easier to understand: if something went wrong it was much easier to find "entity X number 3" than try to look for an arbitrary GUID.
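
A rough sketch of that kind of test-data GUID, assuming a layout where the type number sits in the top 32 bits and a per-type counter in the low 96 bits. The layout and the name `make_test_guid` are illustrative guesses, not the exact scheme used in that project:

```python
import itertools
import uuid

# One counter per type number, so each type counts 1, 2, 3, ...
_counters: dict = {}


def make_test_guid(type_number: int) -> uuid.UUID:
    """Build a readable, deterministic GUID for test data only."""
    if not 0 <= type_number < (1 << 32):
        raise ValueError("type_number must fit in 32 bits")
    counter = _counters.setdefault(type_number, itertools.count(1))
    n = next(counter)
    # Type number in the first 8 hex digits, counter at the far right.
    return uuid.UUID(int=(type_number << 96) | n)
```

The first call to `make_test_guid(7)` returns `00000007-0000-0000-0000-000000000001`, so "entity type 7, number 1" can be read straight off the value. These are not valid random (version 4) GUIDs, which is fine for test fixtures but is the reason production code keeps generating them normally.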