
Jaccard Distance

I have this problem in calculating Jaccard Distance for Sets (Bit-Vectors):

p1 = 10111;

p2 = 10011.

Size of intersection = 3; (How could we find it out?)

Size of union = 4, (How could we find it out?)

Jaccard similarity = (intersection/union) = 3/4.

Jaccard Distance = 1 - (Jaccard similarity) = (1 - 3/4) = 1/4.

But I don't understand how we could find the "intersection" and "union" of the two vectors.

Please help me.

Thanks a lot.


Size of intersection = 3; (How could we find it out?)

It is the number of set bits in p1 & p2 (bitwise AND) = 10011, which has 3 set bits.

Size of union = 4, (How could we find it out?)

It is the number of set bits in p1 | p2 (bitwise OR) = 10111, which has 4 set bits.

A vector here means a binary array in which the i-th bit indicates whether the i-th element is present in the set.
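The AND/OR counting described above can be sketched in Python as follows. This is only an illustration: the function name jaccard_distance_bits is made up for this example, the bit-vectors are assumed to be given as strings of '0' and '1', and returning 0.0 for two empty sets is a convention chosen here, not something stated in the question.

```python
def jaccard_distance_bits(p1: str, p2: str) -> float:
    """Jaccard distance of two bit-vectors given as '0'/'1' strings."""
    a = int(p1, 2)  # parse the bit string as a base-2 integer
    b = int(p2, 2)

    # Intersection: positions set in both vectors (bitwise AND).
    intersection = bin(a & b).count("1")
    # Union: positions set in at least one vector (bitwise OR).
    union = bin(a | b).count("1")

    if union == 0:
        # Assumed convention: two empty sets are treated as identical.
        return 0.0
    return 1 - intersection / union


print(jaccard_distance_bits("10111", "10011"))  # 0.25
```

For the vectors in the question the intersection has 3 set bits and the union has 4, so the distance is 1 - 3/4 = 1/4.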


If p1 = 10111 and p2 = 10011,

Count the number of attributes (bit positions) for each combination of values in p1 and p2:

  • M11 = total number of attributes where both p1 and p2 have a value of 1 (here 3),
  • M01 = total number of attributes where p1 has a value of 0 and p2 has a value of 1 (here 0),
  • M10 = total number of attributes where p1 has a value of 1 and p2 has a value of 0 (here 1),
  • M00 = total number of attributes where both p1 and p2 have a value of 0 (here 1).

Jaccard similarity coefficient = J = intersection/union = M11/(M01 + M10 + M11) = 3 / (0 + 1 + 3) = 3/4,

Jaccard distance = J' = 1 - J = 1 - 3/4 = 1/4,

or equivalently J' = 1 - (M11/(M01 + M10 + M11)) = (M01 + M10)/(M01 + M10 + M11) = (0 + 1)/(0 + 1 + 3) = 1/4.
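The M11/M01/M10/M00 counts can also be tallied directly, position by position. The sketch below is one possible way to do it; the helper names match_counts and jaccard_distance_counts are hypothetical, and the vectors are again assumed to be equal-length strings of '0' and '1'.

```python
from collections import Counter


def match_counts(p1: str, p2: str) -> dict:
    """Count how often each (p1 bit, p2 bit) combination occurs."""
    if len(p1) != len(p2):
        raise ValueError("bit-vectors must have the same length")
    pairs = Counter(zip(p1, p2))  # missing combinations count as 0
    return {
        "M11": pairs[("1", "1")],
        "M01": pairs[("0", "1")],
        "M10": pairs[("1", "0")],
        "M00": pairs[("0", "0")],
    }


def jaccard_distance_counts(p1: str, p2: str) -> float:
    """J' = (M01 + M10) / (M01 + M10 + M11); M00 is not used."""
    m = match_counts(p1, p2)
    union = m["M01"] + m["M10"] + m["M11"]
    if union == 0:
        # Assumed convention: two all-zero vectors have distance 0.
        return 0.0
    return (m["M01"] + m["M10"]) / union


print(match_counts("10111", "10011"))
print(jaccard_distance_counts("10111", "10011"))  # 0.25
```

Note that M00 never enters the formula: positions where both vectors are 0 belong to neither set, so they affect neither the intersection nor the union.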
