Is there a perfect hash function for the combined input sets of IMEI numbers and MAC addresses? (C implementation)
I'm looking for a hash function that I can use to give uniform unique IDs to devices that connect to our network either using a GSM modem or an ethernet connection.
So for any given device I have 开发者_运维知识库either an IMEI number or a MAC address hard-coded that I can use to generate the hash.
I've been researching hash functions for the last few hours, reading up on the different non-cryptographic and cryptographic hashes that I might want to use. My focus is low-collisions over performance, as the hash will not be calculated very often.
My front-runners are MD5, FNV-1a, MurmurHash2, Hsieh, and DJB.
Whatever hash I use will have to be implemented in C and will be used on a microcontroller with a tiny processor.
I know that the trick to choosing a good hash function for your needs is knowing what sort of input you're going to be feeding it.
The reason I'm asking this question is that the idea popped into my head that both IMEI and MAC have finite lengths and ranges, so perhaps there exists a fairly simple hash function that can cover the full sets of both and not have collisions. (Thus, a perfect hash function)
An IMEI number is 15 decimal digits long (12-13 bytes in hex?), and a MAC address is 6 bytes. Mulling it over I don't think you would have collisions between the two sets of input numbers, but feel free to correct me if that is wrong. If you did could you do something to prevent it? Add some seed to one of the sets?
Am I on the right track? Is finding perfect hash function for these combined sets possible?
Thanks!
Update
Thanks for the answers and comments. I ended up using the identity function ;) as my hash function, and then also using a mask since there is potential overlap across the sets of numbers.
IMEI, IMEISV, and MAC will all fit in 6.5 bytes or less, so I am storing my values in 7 bytes and then doing a bitwise OR on the first byte with a mask based on which set the number comes from, to ensure they are unique across all sets.
There's no way to make a perfect hash over an unknown, growing input set. You could simply make the field one bit larger than whichever of IMEI or MAC is larger, and use that bit to flag which type of identifier it is, along with the entire IMEI/MAC. Anything smaller will have collisions, but they're probably quite rare.
精彩评论