Which (Java) code, library, or algorithm is best at providing similarities between bit patterns?
I am after code, a library routine, or an algorithm that scores how close two different bit or boolean patterns are. Naturally if they are equal then the score should be 1, while if one is all true and the other all false then the score should 0.
Bit pattern example
The bit patterns that i will be testing are many times never actually equal or the same but sometimes they are very similar.
0001 1111 0000
0000 1111 1100
0000 1110 0000
1110 0000 1111
In the above examples 1 & 2 or 1 & 3 are pretty close if i was to score them perhaps the difference would be something like 96 and 95%. On the other hand 1&4 would definitel开发者_Go百科y be a much lower score maybe 25%.
Note that bit patterns may be of different lengths but scoring should still be possible.
001100
000011110000
The above two patterns would be considered identical.
001100
00110000
The above two patterns would be considered close but not identical, because once "scaled" #2 is different from #1.
If the bit patterns are all the same length, just use the exclusive-or (^
) operator and count how many zeroes remain.
(xor
produces a zero if the two corresponding bits are the same, and a one otherwise).
If they're of different lengths, treat the bit pattern as if it were a string and use something like the Levenshtein distance algorithm.
I've been playing around with fast ways to count the number of matching bits of the bit-wise XOR
comparison. Here's what I think is the fastest way:
int num1, num2; // some bit patterns
int diff = num1 ^ num2;
int score;
for (score = 0; diff > 0; diff >>>= 1)
score += diff & 1;
A score of zero means exact match (assuming results of the same length).
public static int bitwiseEditDistance(int a, int b) {
return Integer.bitCount(a ^ b);
}
Integer.bitCount
is an obscure little bit of the core libraries.
Returns the number of one-bits in the two's complement binary representation of the specified
int
value. This function is sometimes referred to as the population count.
精彩评论