MySQL String Comparison with Percent Output
I am trying to compare two entries of 6 numbers, each number which can either can be zero or 1 (i.e 100001 or 011101). If 3 out of 6 match, I want the output to be .5. If 2 out of 6 match, i want the output to be .33 etc.
Here are the SQL commands to create the table
CREATE TABLE sim
(sim_key int,
string int);
INSERT INTO sim (sim_key, string)
VALUES (1, 111000);
INSERT INTO sim (sim_key, string)
VALUES (2, 111111);
My desired output to compare the two strings, which share 50% of the characters, and output 50%.
Is it possible to do this sort of 开发者_JAVA百科comparison in SQL? Thanks in advance
This returns the percentage of equal 1 bits in both strings:
select bit_count(conv(a.string, 2, 10) & conv(b.string, 2, 10))/6*100 as percent_match
from sim a, sim b where
a.sim_key=1 and b.sim_key=2;
As you store your bitfields as base 2 representation converted to numbers, we first need to do conversions: conv(a.string, 2, 10)
, conv(b.string, 2, 10)
.
Then we keep only bits that are 1 in each field: conv(a.string, 2, 10) & conv(b.string, 2, 10)
And we count them: bit_count(conv(a.string, 2, 10) & conv(b.string, 2, 10))
And finally we just compute the percentage: bit_count(conv(a.string, 2, 10) & conv(b.string, 2, 10)) / 6 * 100
.
The query returns 50
for 111000
and 111111
.
Here is an other version that also counts matching zeros:
select bit_count((conv(a.string, 2, 10) & conv(b.string, 2, 10)) | ((0xFFFFFFFF>>(32-6))&~(conv(a.string, 2, 10)|conv(b.string, 2, 10))))/6*100 as percent_match
from sim a, sim b where
a.sim_key=1 and b.sim_key=2;
Note that, while this solution works, you should really store this field like this instead:
INSERT INTO sim (sim_key, string)
VALUES (1, conv("111000", 2, 10));
INSERT INTO sim (sim_key, string)
VALUES (2, conv("111111", 2, 10));
Or to update existing data:
UPDATE sim SET string=conv(string, 10, 2);
Then this query gives the same results (if you updated your data as described above):
select bit_count(a.string & b.string)/6*100 as percent_match
from sim a, sim b where
a.sim_key=1 and b.sim_key=2;
And to count zeros too:
select bit_count((a.string & b.string) | ((0xFFFFFFFF>>(32-6))&~(a.string|b.string)))/6*100 as percent_match
from sim a, sim b where
a.sim_key=1 and b.sim_key=2;
(replace 6
s by the size of your bitfields)
Since you are storing them as numbers, you can do this
SELECT BIT_COUNT(s1.string & s2.string) / BIT_COUNT(s1.string | s1.string)
FROM sim s1, sim s2
WHERE s1.sim_key = 1 AND s2.sim_key = 2
精彩评论