Mahout rescorer implementation
I'd like to weight all of my PearsonItemSimilarity values between two items by the number of coratings they share divided by 50.
Or in other words, update the generic Pearson similarity between two items (a and b, for instance) as follows: similarity_new_ab = similarity_ab * numCoRatings_ab / 50
How does one get the number of coratings between two games using the existing Mahout framework?
Can someone please link me to (or illustrate) an example implementation of a rescorer?
My reasoning for doing this is as follows,
I postulate that most of the Pearson-similarities calculated are based on a small number (1 or 2 in most cases) of coratings. This would lead to the games sharing a Pearson correlation of 1 with each other, which in fact would probably not be the case should more coratings exist.
To account for this, I'd like to change these "naive" Pearson similarities to a similarity that is also based on the number of co-ratings.
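To illustrate the concern, here is a minimal plain-Java sketch (not Mahout code; the class and method names are made up for this example). It computes a textbook Pearson correlation and applies it to two items that share only two co-ratings. With two points the correlation can only come out as +1 or -1, or be undefined when one side has no variance, regardless of how similar the items really are.

```java
public class TwoPointPearson {

    // Textbook Pearson correlation over paired ratings.
    // Returns NaN when either side has zero variance (correlation undefined).
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        double denom = Math.sqrt(sxx * syy);
        return denom == 0.0 ? Double.NaN : sxy / denom;
    }

    public static void main(String[] args) {
        // Two users rated both items: ratings for item a, then for item b.
        System.out.println(pearson(new double[] {2, 5}, new double[] {3, 4}));
        // Same users, but item b's ratings move in the opposite direction.
        System.out.println(pearson(new double[] {2, 5}, new double[] {4, 3}));
    }
}
```

The ratings in the first call differ quite a bit in magnitude (2 and 5 versus 3 and 4), yet the two points still lie on a straight line, so the correlation is maximal.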
I thought this is what the rescorer was built for, but I guess I was wrong.
You want the method getNumUsersWithPreferenceFor() on DataModel; pass it the two item IDs.
I don't think this is the best thing to do for this similarity metric. If you are using co-occurrence, look at LogLikelihoodSimilarity instead.
This has nothing to do with Rescorer though; what is your question there?
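For what it's worth, the weighting from the question could be sketched roughly like this. This is plain Java with stand-ins, not Mahout code: the BaseSimilarity interface, the usersByItem map and the class name are all hypothetical. In Mahout, the co-rating count would come from the DataModel method mentioned above, and the base value from whatever item similarity you already use. The capped variant is my own assumption, added because the raw formula exceeds 1 in magnitude once two items share more than 50 co-ratings.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

public class CoRatingWeightedSimilarity {

    // Hypothetical stand-in for an existing item-item similarity (e.g. Pearson).
    interface BaseSimilarity {
        double similarity(long itemA, long itemB);
    }

    private final Map<Long, Set<Long>> usersByItem; // item ID -> IDs of users who rated it
    private final BaseSimilarity base;
    private final double threshold;                 // 50 in the question

    CoRatingWeightedSimilarity(Map<Long, Set<Long>> usersByItem,
                               BaseSimilarity base,
                               double threshold) {
        this.usersByItem = usersByItem;
        this.base = base;
        this.threshold = threshold;
    }

    // Number of users who rated both items (size of the intersection).
    int numCoRatings(long itemA, long itemB) {
        Set<Long> usersA = usersByItem.getOrDefault(itemA, Collections.<Long>emptySet());
        Set<Long> usersB = usersByItem.getOrDefault(itemB, Collections.<Long>emptySet());
        int count = 0;
        for (Long user : usersA) {
            if (usersB.contains(user)) {
                count++;
            }
        }
        return count;
    }

    // The formula from the question: similarity * numCoRatings / threshold.
    double weighted(long itemA, long itemB) {
        double s = base.similarity(itemA, itemB);
        if (Double.isNaN(s)) {
            return s; // keep "undefined" as undefined
        }
        return s * numCoRatings(itemA, itemB) / threshold;
    }

    // Assumption, not in the question: cap the weight at 1 so the result
    // stays within the range of the base similarity.
    double weightedCapped(long itemA, long itemB) {
        double s = base.similarity(itemA, itemB);
        if (Double.isNaN(s)) {
            return s;
        }
        double weight = Math.min(numCoRatings(itemA, itemB), threshold) / threshold;
        return s * weight;
    }
}
```

With a threshold of 50, a pair of items that correlates perfectly on only 2 co-ratings would drop from 1.0 to 0.04, which is the damping effect the question is after.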