User targeting algorithm

I have an application with visitors who come and go.

I'm working with a data provider that gives me information about users, such as their gender, age, location, and personality traits.

Now I'd like to target these users with appropriate content.

In short, I have content and I have users with personality information, and I need to display the content that best matches each user's character, personality, etc.

I am aware that, given a list of content and a user, I will be searching for the best possible content for the user, i.e. A* search.

How would you design and implement such an application?

Which algorithm(s) and data structures would you use? Graphs? An adjacency list? A matrix?


I would suggest solving this problem using Bayesian inference.

Bayesian Classifiers

As the problem is currently stated, the only classification of the content that is available is the distribution of the users who have visited it and the characteristics of those users. The joint probability distribution across all user-characteristic dimensions for all users is the classifier for that content.

So how does one use the above information? Given content A with user access distribution B for all users and a target user characteristic profile C, one can compute the probability that the latter user would be interested in content A. If one performs this computation against all content relative to user profile C, one gets a list of interest probability values for all of the content. Sort that list by the probability values to identify the best possible content for the target user.
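To make the ranking step above concrete, here is a minimal sketch in Python. It is not a full Bayesian network: it assumes a naive Bayes model, meaning the user characteristics are treated as conditionally independent given the content, which is a simplification of the joint distribution described above. The function names (train, rank_content), the shape of the visit log (pairs of content id and profile dictionary), and the add-one smoothing are all my own assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def train(visits):
    """Count visits per content item and attribute values per content item.

    visits: iterable of (content_id, profile) pairs, one per observed visit,
    where profile is a dict such as {"gender": "m", "age": "18-25"}.
    (This log format is an assumption made for the sketch.)
    """
    content_counts = Counter()
    attr_counts = defaultdict(Counter)  # (content_id, attribute) -> value counts
    values_seen = defaultdict(set)      # attribute -> distinct values observed
    for content_id, profile in visits:
        content_counts[content_id] += 1
        for attr, value in profile.items():
            attr_counts[(content_id, attr)][value] += 1
            values_seen[attr].add(value)
    return content_counts, attr_counts, values_seen


def rank_content(model, profile):
    """Return [(content_id, probability), ...] sorted from best to worst.

    Uses Bayes' rule with the naive independence assumption:
    P(content | profile) is proportional to
    P(content) * product over attributes of P(value | content).
    """
    content_counts, attr_counts, values_seen = model
    total = sum(content_counts.values())
    log_scores = {}
    for content_id, n in content_counts.items():
        log_p = math.log(n / total)  # prior: share of all visits
        for attr, value in profile.items():
            k = len(values_seen[attr] | {value})  # number of possible values
            count = attr_counts[(content_id, attr)][value]
            log_p += math.log((count + 1) / (n + k))  # add-one smoothing
        log_scores[content_id] = log_p
    # Normalize in log space so the scores sum to 1 across all content.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(weights.values())
    ranked = [(c, w / z) for c, w in weights.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Calling rank_content(train(visits), target_profile) gives the sorted list of interest probabilities; the first entry is the best candidate for that user. If some characteristics are strongly correlated, the independence assumption will distort the probabilities, which is one reason to move to a Bayesian network.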

In many cases, only a subset of user characteristic parameters may be predictive of the value of a given content item to users. This is a common situation for Bayesian classifiers in general and has led to the development of Bayesian networks, which are structured graphs of key variables and their conditional dependencies. Such networks can be modeled via Bayesian inference methods as well.

Bayesian Network Software

The WEKA Data Mining software is an open-source Java library which implements many common classification methods including Bayesian network classifiers, and it is well worth trying out. I can't recommend any specific C# equivalent packages, but a quick web search identified at least one commercial Bayesian package for .NET, Bayes Server.

Recommended Reading

There is a pretty large body of literature surrounding Bayesian classifiers, and it is a very sound technique that is used in spam filtering, drug discovery, etc. Two books that I can recommend for this are listed below. Bolstad's book is for beginners, while Pearl's book is more advanced.

Bolstad, William M. (2007). Introduction to Bayesian Statistics, Second Edition, John Wiley.

Judea Pearl (2000). Causality: Models, Reasoning, and Inference, Cambridge University Press.


Very interesting question!

You're talking about the best possible content, but you didn't mention how it is measured. I assume that by content you mean some form of advertisement, and that “best” means most effective, i.e. having the highest CTR (click-through rate).

So you have a function:

f(gender, age, location, personality, ..., advertisement) -> CTR

On each visit the gender, age, location, etc. are fixed. By fixed I mean: you already have this visitor, and you can't vary his age. The one parameter you can change is the advertisement. Your goal is to maximize CTR.

By varying advertisements you can gather CTR statistics for different combinations. Once you have some minimal initial knowledge, you can try optimization methods, particularly nonlinear programming, to find the optimal advertisement for a given gender, age, location, etc. Continue to gather CTR statistics to make subsequent decisions more and more precise.
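As a rough illustration of the gather-then-decide loop above, here is a small Python sketch. Note that it does not use nonlinear programming: I've swapped in a much simpler epsilon-greedy rule over per-segment CTR counts, just to show the shape of the loop. The class name CtrSelector, the idea of a "segment" as a tuple of visitor attributes, and the epsilon value are all assumptions of mine, not anything from a specific library.

```python
import random
from collections import defaultdict


class CtrSelector:
    """Track clicks per (segment, ad) and pick ads epsilon-greedily.

    A segment is any hashable description of the fixed visitor
    attributes, e.g. ("f", "18-25", "NYC"). (Hypothetical design.)
    """

    def __init__(self, ads, epsilon=0.1, rng=None):
        self.ads = list(ads)
        self.epsilon = epsilon          # fraction of visits spent exploring
        self.rng = rng or random.Random()
        self.shows = defaultdict(int)   # (segment, ad) -> impressions
        self.clicks = defaultdict(int)  # (segment, ad) -> clicks

    def estimated_ctr(self, segment, ad):
        # Smoothed estimate: a pair with no data yet starts at 0.5.
        key = (segment, ad)
        return (self.clicks[key] + 1) / (self.shows[key] + 2)

    def choose(self, segment):
        # Explore: occasionally show a random ad to keep gathering statistics.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ads)
        # Exploit: otherwise show the ad with the best estimated CTR so far.
        return max(self.ads, key=lambda ad: self.estimated_ctr(segment, ad))

    def record(self, segment, ad, clicked):
        # Feed the outcome back so later decisions become more precise.
        self.shows[(segment, ad)] += 1
        if clicked:
            self.clicks[(segment, ad)] += 1
```

On each visit you would call choose(segment), display the returned ad, and then call record(segment, ad, clicked) with the outcome. With many attribute combinations the per-segment counts get sparse, which is where a model that generalizes across segments becomes more attractive than a plain lookup table.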

P.S. There was a startup showcase on TechCrunch. They did a similar thing and had fantastic results. So if you succeed, think about starting your own business ;)
