Is there a function that takes two values, lets f(x,y) == f(y,x), and the output is otherwise unique?
I am wondering if there is a way to generate a key based on the relationship between two entities in a way that the key for relationship a->b is the same as the key for relationship b->a.
Ideally this would be a hash function which takes both relationship members but generates the same output regardless of the order in which the members are presented.
Obviously you could do this with numbers (e.g. add(2,3) is equivalent to add(3,2)). The problem for me is that I do not want add(1,4) to equal add(2,3). Obviously any hash function has some collisions, so I only mean uniqueness in that weak sense.
My naive (and probably slow) thought is:
function orderIndifferentHash(string val1, string val2)
{
    /* stringMerge 'adds' the two strings character by character (with wrapping).
       The pre-hash lengthens each string to at least 32 characters. */
    return stringMerge(hash(val1), hash(val2));
}
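For reference, the naive idea above can be sketched in Python roughly as follows. This is only an illustration under assumptions: SHA-256 stands in for the unspecified hash, and the name naive_order_indifferent_hash is hypothetical. Byte-wise addition is commutative, which is what makes the result independent of argument order.

```python
import hashlib


def naive_order_indifferent_hash(val1: str, val2: str) -> str:
    """Hash each value, then 'add' the digests byte by byte, wrapping at 256."""
    # Assumption: SHA-256 plays the role of the question's unspecified hash().
    d1 = hashlib.sha256(val1.encode("utf-8")).digest()
    d2 = hashlib.sha256(val2.encode("utf-8")).digest()
    # Addition modulo 256 is commutative, so swapping val1 and val2
    # produces exactly the same merged bytes.
    merged = bytes((a + b) % 256 for a, b in zip(d1, d2))
    return merged.hex()
```

Calling naive_order_indifferent_hash("a", "b") and naive_order_indifferent_hash("b", "a") gives the same 64-character hex string.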
In your function orderIndifferentHash you could first order val1 and val2 by some criterion and then apply any hash function you like to get the result.
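function orderIndifferentHash( val1, val2 ) {
    // Put the arguments into a canonical order so that (a, b) and (b, a) hash alike.
    if( val1 < val2 ) {
        first = val1
        second = val2
    }
    else {
        first = val2
        second = val1
    }
    // Note: plain concatenation of variable-length strings is ambiguous
    // ("ab" + "c" equals "a" + "bc"); add a separator or length prefix if that matters.
    hashInput = concat( first, second )
    return someHash( hashInput )
    // or as an alternative:
    // return concat( someHash( first ), someHash( second ) )
}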
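A minimal Python sketch of the sort-then-hash idea, under some assumptions: the inputs are strings, SHA-256 stands in for someHash, and the name sorted_pair_hash is hypothetical. Each part is length-prefixed before hashing so that pairs such as ("ab", "c") and ("a", "bc") do not feed identical bytes to the hash.

```python
import hashlib


def sorted_pair_hash(val1: str, val2: str) -> str:
    """Order-independent hash of a pair of strings."""
    # Canonical order: (a, b) and (b, a) both become (min, max).
    first, second = sorted((val1, val2))
    h = hashlib.sha256()  # assumption: any fixed hash function would do here
    for part in (first, second):
        data = part.encode("utf-8")
        # An 8-byte length prefix keeps the boundary between parts unambiguous.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()
```

With this sketch, sorted_pair_hash("alice", "bob") equals sorted_pair_hash("bob", "alice"), while different pairs differ except for ordinary hash collisions.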
With numbers, one way to achieve that is, for two numbers x and y, to take the x-th prime and the y-th prime and calculate the product of these primes. That guarantees that the product is unique for each distinct pair of x and y (by unique factorization) and independent of the argument order. Of course, to do that with any practically meaningful efficiency you'll need to keep a prime table for all possible values of x and y. If x and y are chosen from a relatively small range, this will work. But if the range is large, the table itself becomes impractical, and you'll have no other choice but to accept some probability of collision (for example, keep a reasonably sized table of N primes and select the (x % N)-th prime for the given value of x).
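As a rough illustration of the prime-product idea, here is a small Python sketch. The table size of 1000, the 0-based indexing, and the names first_primes, PRIMES and prime_pair_key are assumptions made for the example, not part of the answer.

```python
def first_primes(n: int) -> list:
    """Return the first n primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # Only primes up to sqrt(candidate) need to be tested as divisors.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


# Assumption: x and y are 0-based indices smaller than the table size.
PRIMES = first_primes(1000)


def prime_pair_key(x: int, y: int) -> int:
    """Product of the x-th and y-th primes; symmetric in x and y."""
    if not (0 <= x < len(PRIMES) and 0 <= y < len(PRIMES)):
        raise ValueError("x and y must be valid indices into the prime table")
    return PRIMES[x] * PRIMES[y]
```

Here prime_pair_key(1, 4) is 3 * 11 = 33 and prime_pair_key(2, 3) is 5 * 7 = 35, so the two pairs from the question no longer collide. Replacing the range check with x % len(PRIMES) would give the lossy variant, where x and x + N map to the same prime.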
An alternative solution, already mentioned in the other answers, is to build a perfect hash function that works on your x and y values and then simply concatenate the hashes for x and y. The order independence is achieved by pre-sorting x and y. Of course, building a perfect hash is only possible for a set of arguments from a reasonably small range.
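One simple case where this works directly is when x and y are non-negative integers that fit in a fixed number of bits: the identity then acts as a trivially perfect, constant-length hash, and "concatenation" is just bit packing. The sketch below assumes 32-bit values by default; the name packed_pair_key and the bits parameter are hypothetical.

```python
def packed_pair_key(x: int, y: int, bits: int = 32) -> int:
    """Sort the pair, then concatenate the two fixed-width values into one key."""
    limit = 1 << bits
    if not (0 <= x < limit and 0 <= y < limit):
        raise ValueError("x and y must fit in the given number of bits")
    # Pre-sorting makes the key independent of argument order.
    lo, hi = (x, y) if x <= y else (y, x)
    # Fixed width means the two halves can always be recovered, so
    # distinct unordered pairs always produce distinct keys.
    return (lo << bits) | hi
```

Because both halves have a fixed width, the key is collision-free within the stated range, at the cost of being twice as wide as a single value.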
I initially suspected that the primes-based approach would give you the shortest possible hash that satisfies the required conditions, but that is not true.
You are after some function f(x, y) such that:
f(x, y) == f(y, x)
f(x, y) == f(a, b) => (x == a and y == b) or (x == b and y == a)
There are going to be absolutely loads of these; offhand, the one I can think of is "sorted concatenation":
- Sort (x, y) by any ordering
- Apply a hash function u(a) to x and y individually (where u(a) == u(b) implies a == b, and the length of u(a) is constant)
- Concatenate u(x) and u(y).
In this case:
If x == y then the two hashes are trivially the same, so without loss of generality x < y, hence:
f(y, x) = u(x) + u(y) = f(x, y)
Also, if f(x, y) == f(a, b), this means that either:
u(x) == u(a) and u(y) == u(b) => x == a and y == b, or
u(y) == u(a) and u(x) == u(b) => y == a and x == b
Short version:
Sort x and y, and then apply any hash function where the resulting hash length is constant.
Suppose you have any hash h(x,y). Then define f(x,y) = h(x,y) + h(y,x). Now you have a symmetric hash.
(If h is a trivial linear "hash", such as a weighted sum of x and y, then f depends only on x+y, so the pairs (1,3) and (2,2) hash to the same value. Even something like h(x,y) = x*y*y will avoid that particular collision; just make sure there's some nonlinearity in at least one argument of the hash function.)
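The symmetrization trick can be tried out with a few lines of Python. The helper name symmetric and the two toy hashes h_linear and h_nonlinear are made up for this sketch; 31 is an arbitrary multiplier.

```python
def symmetric(h):
    """Turn any two-argument hash h into an order-independent one."""
    return lambda x, y: h(x, y) + h(y, x)


def h_linear(x, y):
    # Toy linear hash (assumption for illustration): symmetrizing it
    # yields 32 * (x + y), which depends only on the sum of the arguments.
    return 31 * x + y


def h_nonlinear(x, y):
    # The x*y*y example: nonlinear in y.
    return x * y * y


f_linear = symmetric(h_linear)
f_nonlinear = symmetric(h_nonlinear)
```

With these definitions f_linear(1, 3) and f_linear(2, 2) are both 128, while f_nonlinear(1, 3) is 12 and f_nonlinear(2, 2) is 16. The toy x*y*y hash is still far from collision-free (f_nonlinear(1, 5) and f_nonlinear(2, 3) are both 30), so in practice h should be a reasonably strong hash of the ordered pair.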