How to optimize a database-backed recommendation engine
I'm building an online engine for item-to-item movie recommendations. I've done some research, and I think the best approach is to use Pearson correlation and store a table with item1, item2, and correlation fields. The problem is that after each new rating I have to recompute the correlation for, in the worst case, N records (where N is the number of items).
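For reference, this is roughly what I mean by item-to-item Pearson correlation. A minimal sketch in Python, assuming each item's ratings are held in a dict mapping user IDs to ratings (the function name and data layout are just for illustration):

```python
import numpy as np

def pearson_item_similarity(ratings_a, ratings_b):
    """Pearson correlation between two items' ratings.

    ratings_a, ratings_b: dicts mapping user_id -> rating.
    Only users who rated both items are considered.
    Returns 0.0 when fewer than two users overlap.
    """
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    a = np.array([ratings_a[u] for u in common], dtype=float)
    b = np.array([ratings_b[u] for u in common], dtype=float)
    a -= a.mean()          # center each vector on its mean
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float(a @ b / denom) if denom else 0.0
```

Each new rating changes one item's vector, which is why every (item1, item2, correlation) row involving that item has to be refreshed.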
Another thing that I read is the following article, but I haven't figured out a way to implement it.
So what would you suggest to optimize this process? Any other suggestions? Thanks.
The current approach to this kind of "shopping cart" problem is to use Singular Value Decomposition (SVD). The top three participants in the Netflix Prize all used SVD. SVD performs "dimensionality reduction" on the huge products × persons covariance matrix. The good news is that incremental methods exist, so adding a few observations to the dataset does not require recomputing the entire matrix.
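To make the idea concrete, here is a minimal sketch of SVD-based dimensionality reduction with NumPy. Everything in it (the items × users layout, zero-filling missing ratings, the choice of k, and the function names) is an illustrative assumption, not the exact method the Netflix teams used:

```python
import numpy as np

def item_factors(ratings, k=20):
    """Reduce an items x users ratings matrix to k latent dimensions via SVD.

    ratings: 2-D array, rows = items, columns = users (missing ratings
    filled with 0, a common simplification).
    Returns an items x k matrix of latent item factors.
    """
    # Full SVD for clarity; for large matrices use a truncated or
    # randomized variant instead.
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    return U[:, :k] * s[:k]          # scale factors by singular values

def most_similar(factors, item, n=5):
    """Top-n items by cosine similarity in the reduced latent space."""
    norms = np.linalg.norm(factors, axis=1)
    sims = factors @ factors[item] / (norms * norms[item] + 1e-12)
    order = np.argsort(-sims)
    return [i for i in order if i != item][:n]
```

Because similarities are computed in the k-dimensional latent space rather than over raw co-ratings, a new rating changes only one entry of the input matrix, and an incremental SVD update can fold it in without redoing the full factorization.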
There is no optimal solution, but you can find lots of suggestions by browsing the "collaborative-filtering" and "recommendation-engine" tags on Stack Overflow.