How to solve product recommendation issue like: User __bought__ XXX also __viewed__ YYY
I am currently learning about recommender systems and have learned something about collaborative filtering (user-based CF and item-based CF). It is obvious how to use these algorithms to solve problems like: 1) User bought XXX also bought YYY 2) User viewed XXX also viewed YYY
My question is: how do I solve problems like: 1) User bought XXX also viewed YYY 2) User viewed XXX also bought YYY?
Update: Just corrected the title to: " User bought XXX also viewed YYY"
While I am not sure this is really "recommendation", I can tell you how you'd approach recommendations across domains in Mahout. You would build two DataModels, one built on user-item purchases and one built on user-item views. You would use the purchase data as the input to a UserSimilarity or ItemSimilarity implementation, but then feed the view data as the input DataModel to the Recommender implementation. You would then be computing something more like what you suggest.
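To make the idea concrete outside Mahout, here is a minimal Python sketch. It is not Mahout's API: it only illustrates computing user similarity from purchases and then drawing the candidates from views. The Jaccard (Tanimoto) coefficient, the function names and the sample data are all assumptions chosen for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard (Tanimoto) similarity between two sets of item ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_views(user, purchases, views, top_n=3):
    """Rank items viewed by users whose *purchases* resemble `user`'s."""
    own_purchases = purchases.get(user, set())
    # Items the user already bought or viewed are not suggested again.
    seen = own_purchases | views.get(user, set())
    scores = defaultdict(float)
    for other, other_purchases in purchases.items():
        if other == user:
            continue
        sim = jaccard(own_purchases, other_purchases)  # similarity: purchase data
        if sim == 0.0:
            continue
        for item in views.get(other, set()):  # candidates: view data
            if item not in seen:
                scores[item] += sim
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical sample data: user -> set of items.
purchases = {
    "alice": {"camera", "tripod"},
    "bob": {"camera", "tripod", "bag"},
    "carol": {"novel"},
}
views = {
    "alice": {"camera"},
    "bob": {"lens", "flash"},
    "carol": {"bookmark"},
}

print(recommend_from_views("alice", purchases, views))
```

Swapping the two dictionaries gives the "viewed XXX also bought YYY" direction: similarity from views, candidates from purchases.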
Say you have two tables, products and sold_products. Each time you sell a product, it gets added to the sold_products table. The two tables are related by product_id, and order_id is used to group the products of one order together in sold_products.
We will assume the product you are looking at has a product_id of 1234.
- Get a list of order_ids from the last 25 orders which contain the product.
SELECT DISTINCT sold_products.order_id FROM sold_products WHERE product_id=1234 LIMIT 25
(Note that without an ORDER BY, LIMIT 25 returns an arbitrary 25 matching orders, not necessarily the latest ones. To get the last 25, sort in descending order on an order date or similar column, if your table has one.)
- From there, put all the ids into a string separated by commas, quoting each one since they are strings
e.g. 'PO1234','PO435','PO3456'....
- Select the products from those orders; I like to rank them by frequency
SELECT products.* FROM sold_products JOIN products ON products.product_id=sold_products.product_id WHERE sold_products.order_id IN ('PO1234','PO435','PO3456'....) AND sold_products.product_id<>1234 GROUP BY products.product_id ORDER BY COUNT(*) DESC
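The two steps above can be tried end to end with a small Python sketch using the standard sqlite3 module and an in-memory database. The table layout, the name column and the sample rows are assumptions made up for illustration. The sketch also folds both steps into one query with a subquery, which SQLite accepts; some databases (MySQL, for instance) reject LIMIT inside an IN subquery, in which case the two-step approach with a comma-separated list is the way to go.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema and sample rows, for illustration only.
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sold_products (order_id TEXT, product_id INTEGER);
    INSERT INTO products VALUES
        (1234, 'camera'), (1, 'tripod'), (2, 'bag'), (3, 'novel');
    INSERT INTO sold_products VALUES
        ('PO1', 1234), ('PO1', 1), ('PO1', 2),
        ('PO2', 1234), ('PO2', 1),
        ('PO3', 3);
""")


def also_bought(conn, product_id, max_orders=25):
    """Products sharing an order with `product_id`, most frequent first."""
    query = """
        SELECT p.product_id, p.name, COUNT(*) AS times_together
        FROM sold_products sp
        JOIN products p ON p.product_id = sp.product_id
        WHERE sp.order_id IN (
            SELECT DISTINCT order_id FROM sold_products
            WHERE product_id = ? LIMIT ?
        )
        AND sp.product_id <> ?
        GROUP BY p.product_id, p.name
        ORDER BY times_together DESC, p.product_id
    """
    # Bound parameters avoid building the id list by string concatenation.
    return conn.execute(query, (product_id, max_orders, product_id)).fetchall()


print(also_bought(conn, 1234))
```

For "bought XXX also viewed YYY", the outer query would read from a table of product views keyed by visitor or session instead of sold_products, assuming you record such data.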
You should refer to Chapter 2 of O'Reilly's 'Programming Collective Intelligence' book. To come up with matching products, i.e. the 'Customers who bought this item also bought...' section, you need to:
- First, collect the preferences of various users
- Then, find similar users
- Then, see which other items they purchased or liked
There are algorithms involved in each of the above steps. More details are given in the book, along with Python code for those algorithms.
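As a rough illustration of those three steps, here is a minimal Python sketch of user-based collaborative filtering. It is not the book's code. The distance-based similarity, the weighted-average scoring, the function names and the sample preferences are assumptions chosen to keep the example short.

```python
from math import sqrt


def similarity(prefs, a, b):
    """Similarity in (0, 1] from Euclidean distance over shared items."""
    shared = set(prefs[a]) & set(prefs[b])
    if not shared:
        return 0.0
    dist = sqrt(sum((prefs[a][i] - prefs[b][i]) ** 2 for i in shared))
    return 1.0 / (1.0 + dist)


def recommend(prefs, user):
    """Predict scores for items `user` lacks, weighting others by similarity."""
    totals, sim_sums = {}, {}
    for other in prefs:
        if other == user:
            continue
        sim = similarity(prefs, user, other)  # step 2: find similar users
        if sim <= 0:
            continue
        for item, score in prefs[other].items():  # step 3: their other items
            if item in prefs[user]:
                continue
            totals[item] = totals.get(item, 0.0) + score * sim
            sim_sums[item] = sim_sums.get(item, 0.0) + sim
    # Similarity-weighted average score per item, best first.
    ranked = [(totals[i] / sim_sums[i], i) for i in totals]
    return sorted(ranked, reverse=True)


# Step 1: collected preferences (hypothetical ratings, user -> item -> score).
prefs = {
    "ann": {"A": 5.0, "B": 3.0},
    "ben": {"A": 5.0, "B": 3.0, "C": 4.0},
    "cat": {"A": 1.0, "D": 2.0},
}

print(recommend(prefs, "ann"))
```

With purchase or view data instead of ratings, the scores could simply be 1.0 for every item a user bought or viewed.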
You would generally need two datasets: transaction ID and product as the first, and visitor ID and products viewed as the second. From these you can arrive at a confidence percentage for any two products being sold (or viewed) together. You can use R (statistical software) and install a package called "arules" to generate these recommendations easily.
Here is some sample R code that you may want to check out:
setwd("C:/Documents and Settings/rp/Desktop/output")
install.packages("arules")  # only needed once
library("arules")
# Each row of the CSV is one (transaction id, item) pair
txn <- read.transactions(file = "Transactions_sample.csv", rm.duplicates = FALSE, format = "single", sep = ",", cols = c(1, 2))
# Keep rules with support >= 0.5 and confidence >= 0.9
basket_rules <- apriori(txn, parameter = list(support = 0.5, confidence = 0.9, target = "rules"))
inspect(basket_rules)
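To show what the support and confidence thresholds mean, here is a minimal Python sketch that computes both for rules between pairs of items. It is a brute-force count over pairs, not the Apriori algorithm that arules implements, and the function name and sample baskets are hypothetical.

```python
from collections import Counter
from itertools import permutations


def pair_rules(transactions, min_support=0.5, min_confidence=0.9):
    """Return (lhs, rhs, support, confidence) for rules lhs -> rhs.

    support    = share of transactions containing both items
    confidence = share of transactions with lhs that also contain rhs
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = set(basket)  # ignore duplicate items within one basket
        item_counts.update(items)
        # Ordered pairs, so A -> B and B -> A are counted separately.
        pair_counts.update(permutations(sorted(items), 2))
    rules = []
    for (lhs, rhs), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[lhs]
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical baskets, one list of items per transaction.
transactions = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["bread", "butter"],
]

for lhs, rhs, support, confidence in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Here bread -> butter is dropped because only 3 of the 4 bread baskets contain butter (confidence 0.75), while butter -> bread passes with confidence 1.0.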
If you really want to understand how it works, you may want to check out the white paper at http://www.tatvic.com/resources named "product purchase pattern analysis", which shows how you can do this simply with your web data.
Further, if you want to use a ready-made API for it, one is available at http://www.liftsuggest.com/how-lift-product-recommendation-works