Search selection
For a C# program that I am writing, I need to compare two entities for similarity (they can be documents, animals, or almost anything). Based on certain properties, I calculate the similarity between each pair of documents (or entities) and put the scores in a table, as below:
 |X   |Y   |Z
A|0.6 |0.5 |0.4
B|0.6 |0.4 |0.2
C|0.6 |0.3 |0.6
I want to find the best matching pairs (e.g. AX, BY, CZ) based on the highest similarity score. A higher score indicates greater similarity.
My problem arises when there is a tie between similarity values. For example, AX and CZ both have 0.6. How do I decide which of the two pairs to select? Are there any procedures or theories for this kind of problem?
Thanks.
In general, tie-breaking methods are going to depend on the context of the problem. In some cases, you want to report all the tying results. In other situations, you can use an arbitrary means of selection such as which one is alphabetically first. Finally, you may choose to have a secondary characteristic which is only evaluated in the case of a tie in the primary characteristic.
Additionally, you can always report one or more of the tied results and alert the user that there was a tie, so that they can decide for themselves.
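The options above could be sketched in C# roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the `Pair` record, the `TieBreaking` class, its method names, and the `Epsilon` tolerance are all hypothetical names introduced here, and the secondary score is whatever extra measure makes sense for your entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper type: one cell of the similarity table.
public record Pair(string Row, string Column, double Score);

public static class TieBreaking
{
    // Assumed tolerance: scores closer than this are treated as tied.
    private const double Epsilon = 1e-9;

    // Option 1: report every pair that ties for the highest score.
    public static List<Pair> AllBest(IEnumerable<Pair> pairs)
    {
        var list = pairs.ToList();
        if (list.Count == 0) return list;
        double max = list.Max(p => p.Score);
        return list.Where(p => Math.Abs(p.Score - max) < Epsilon).ToList();
    }

    // Option 2: break the tie arbitrarily but deterministically,
    // here by taking the alphabetically first row, then column.
    public static Pair FirstAlphabetically(IEnumerable<Pair> pairs) =>
        AllBest(pairs)
            .OrderBy(p => p.Row, StringComparer.Ordinal)
            .ThenBy(p => p.Column, StringComparer.Ordinal)
            .First();

    // Option 3: evaluate a secondary characteristic only among the tied pairs.
    public static Pair BySecondary(IEnumerable<Pair> pairs, Func<Pair, double> secondary) =>
        AllBest(pairs).OrderByDescending(secondary).First();
}
```

With the table from the question, `AllBest` returns four pairs, because AX, BX, CX and CZ all score 0.6, and `FirstAlphabetically` picks AX. Both single-result methods throw an `InvalidOperationException` on an empty input, since `First()` has nothing to return. To alert the user to a tie, check whether `AllBest` returned more than one pair.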
In this case, the similarities you should be looking for are:
- Value
- Row
- Column
Objects which have any of the above in common are "similar". You could assign a weighting to each property, so that objects which have the same value are more similar than objects which are in the same column. Also, objects which have the same value and are in the same column are more similar than objects with just the same value.
Depending on whether there are any natural ranges occurring in your data, you could also consider comparing ranges. For example, two numbers that both fall in the range 0-0.5 might be considered somewhat similar.
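One way to picture the weighting idea is the C# sketch below. It is a minimal illustration, not a recommended scoring scheme: the `Cell` record, the `CellSimilarity` class, the `Epsilon` tolerance, and the weight values 4, 2 and 1 are all assumptions made here. The weights are chosen only so that a shared value counts for more than a shared column, and a shared value plus a shared column counts for more than a shared value alone.

```csharp
using System;

// Hypothetical helper type: one cell of the similarity table.
public record Cell(string Row, string Column, double Value);

public static class CellSimilarity
{
    // Assumed weights; pick values that suit your own data.
    public const double ValueWeight = 4.0;
    public const double ColumnWeight = 2.0;
    public const double RowWeight = 1.0;

    // Assumed tolerance: values closer than this count as "the same value".
    private const double Epsilon = 1e-9;

    // Sums the weights of every property the two cells have in common.
    public static double Score(Cell a, Cell b)
    {
        double score = 0.0;
        if (Math.Abs(a.Value - b.Value) < Epsilon) score += ValueWeight;
        if (a.Column == b.Column) score += ColumnWeight;
        if (a.Row == b.Row) score += RowWeight;
        return score;
    }
}
```

With the table from the question, AX and BX share a value and a column, so they score 6; AX and CZ share only a value and score 4; AX and AY share only a row and score 1. A range comparison could be added as a further weighted property in the same way.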