Architecture & Essential Components of StumbleUpon's Recommendation Engine

2023-04-06 08:19 问答作者：

I would like to know how stumbleupon recommends articles for its users?.

Is it using a neural network or some sort of machine-learning algorithms or is it actually recommending articles based on what the user 'liked' or is it simply recommending articles based on the tags in the interests area?. With tags I mean, using something开发者_JAVA技巧 like item-based collaborative filtering etc.?

First, i have no inside knowledge of S/U's Recommendation Engine. What i do know, i've learned from following this topic for the last few years and from studying the publicly available sources (including StumbleUpon's own posts on their company Site and on their Blog), and of course, as a user of StumbleUpon.

I haven't found a single source, authoritative or otherwise, that comes anywhere close to saying "here's how the S/U Recommendation Engine works", still given that this is arguably the most successful Recommendation Engine ever--the statistics are insane, S/U accounts for over half of all referrals on the Internet, and substantially more than facebook, despite having a fraction of the registered users that facebook has (800 million versus 15 million); what's more S/U is not really a site with a Recommendation Engine, like say, Amazon.com, instead the Site itself is a Recommendation Engine--there is a substantial volume of discussion and gossip among the fairly small group of people who build Recommendation Engines such that if you sift through this, i think it's possible to reliably discren the types of algorithms used, the data sources supplied to them, and how these are connected in a working data flow.

The description below refers to my Diagram at bottom. Each step in the data flow is indicated by a roman numeral. My description proceeds backwards--beginning with the point at which the URL is delivered to the user, hence in actual use step I occurs last, and step V, first.

salmon-colored ovals => data sources

light blue rectangles => predictive algorithms

I. A Web Page recommended to an S/U user is the last step in a multi-step flow

II. The StumbleUpon Recommendation Engine is supplied with data (web pages) from three distinct sources:

web pages tagged with topic tags matching your pre-determined Interests (topics a user has indicated as interests, and which are available to view/revise by clicking the "Settings" Tab on the upper right-hand corner of the logged-in user page);
socially Endorsed Pages (*pages liked by this user's Friends*); and
peer-Endorsed Pages (*pages liked by similar users*);

III. Those sources in turn are results returned by StumbleUpon predictive algorithms (Similar Users refers to users in the same cluster as determined by a Clustering Algorithm, which is perhaps k-means).

IV. The data used fed to the Clustering Engine to train it, is comprised of web pages annotated with user ratings

V. This data set (web pages rated by StumbleUpon users) is also used to train a Supervised Classifier (e.g., multi-layer perceptron, support-vector machine) The output of this supervised classifier is a class label applied to a web page not yet rated by a user.

The single best source i have found which discussed SU's Recommendation Engine in the context of other Recommender Systems is this BetaBeat Post.

Architecture & Essential Components of StumbleUpon's Recommendation Engine

继续阅读：collaborative-filtering machine-learning recommendation-engine similarity

Architecture & Essential Components of StumbleUpon's Recommendation Engine

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？