
What is data mining, and how can I apply it to my website?

I'm preparing my graduation project in computer science. I built this website and it runs perfectly, but my supervisor has asked me to apply data mining to it, and I don't understand what I should do. The website is a social network: each user has a profile and a blog, and can download some e-books that are available only to registered users. The website also has a music server; a registered user can download a song or add it as a favorite on their profile page. The website also shows ads (I used the OpenX script). Those are most of the services where I could perform data mining. The website is www.sy-stu.com.

I need ideas. Also, what is the best way to present this in the interview?


You can ask your professor what he intended by asking for data mining. Data mining algorithms can perform various tasks, so you first need to define what you want to accomplish, and then look for suitable algorithms and check what is technically possible.

Some ideas that came to mind for using data mining in your project:

  1. You can use data mining to predict which songs (e-books, etc.) a user is likely to favorite, based on other people's favorites (find similarities; association rules would probably be a good algorithm for this).
  2. You can use a clustering algorithm to group users based on some parameters, and suggest that they connect with other people in the same group (if your site has something like this).
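To make the first idea concrete, here is a minimal sketch of pairwise association rules over favorites. The user names, song names, and thresholds are made up for illustration, and it only looks at pairs of items; a real project would more likely read the favorites from the site's database and use a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: user -> set of favorited songs (not from the real site).
favorites = {
    "alice": {"song_a", "song_b", "song_c"},
    "bob": {"song_a", "song_b"},
    "carol": {"song_a", "song_c"},
    "dave": {"song_b", "song_c"},
    "erin": {"song_a", "song_b", "song_d"},
}


def pair_rules(favorites, min_support=0.4, min_confidence=0.6):
    """Return rules (x, y, support, confidence) meaning 'users who like x also like y'."""
    n_users = len(favorites)
    item_counts = Counter()
    pair_counts = Counter()
    for items in favorites.values():
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n_users  # share of users who favorited both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # share of x's fans who also like y
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


def recommend(user, favorites, rules):
    """Suggest items implied by the user's favorites, strongest rule first."""
    owned = favorites[user]
    scores = {}
    for x, y, _support, confidence in rules:
        if x in owned and y not in owned:
            scores[y] = max(scores.get(y, 0.0), confidence)
    return sorted(scores, key=lambda item: (-scores[item], item))


rules = pair_rules(favorites)
print(recommend("carol", favorites, rules))
```

With this toy data, carol has favorited `song_a` and `song_c`, and both rules point her to `song_b`. The same approach would apply to e-book downloads by swapping the input data.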

Good luck!:)


Firstly, ask for clarification from your supervisor. Don't say 'What do you mean?', but ask 'Are you expecting something like this?' because it shows that you've at least thought about it.

If you can't think of anything, or your supervisor is vague, perform some simple data retrieval and analysis, e.g.

  • the most active members
  • the most / least popular songs and books
  • the number of ads clicked, etc.
  • the most popular website features

Just elementary analysis should suffice; you aren't doing a statistics degree. Work out the most songs downloaded in a day or by a single user, the average number of songs per user, how many users visit each day, and how many sign up and never visit.

The purpose is to demonstrate that your website is logging all activity, so that when you are asked 'how many books did the 20 most active users download in June' you will be able to work out the answer.
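As a rough illustration of answering that kind of question, here is a small sketch over an in-memory activity log. The log layout `(user, action, item_type, day)`, the sample rows, and the year are all assumptions for the example; on the real site the same logic would probably be a query against whatever table records user activity.

```python
from collections import Counter
from datetime import date

# Hypothetical activity log rows: (user, action, item_type, day).
log = [
    ("alice", "download", "book", date(2010, 6, 1)),
    ("alice", "download", "book", date(2010, 6, 3)),
    ("alice", "favorite", "song", date(2010, 6, 3)),
    ("alice", "download", "song", date(2010, 7, 2)),
    ("bob", "download", "book", date(2010, 6, 5)),
    ("bob", "login", None, date(2010, 6, 6)),
    ("bob", "download", "book", date(2010, 7, 1)),
    ("carol", "download", "book", date(2010, 6, 9)),
]


def most_active_users(log, n):
    """Return the n users with the most logged events (ties broken by name)."""
    counts = Counter(user for user, _action, _kind, _day in log)
    return sorted(counts, key=lambda user: (-counts[user], user))[:n]


def downloads_by_top_users(log, item_type, year, month, top_n):
    """Count downloads of item_type in the given month by the top_n most active users."""
    top = set(most_active_users(log, top_n))
    return sum(
        1
        for user, action, kind, day in log
        if user in top
        and action == "download"
        and kind == item_type
        and day.year == year
        and day.month == month
    )


print(most_active_users(log, 2))
print(downloads_by_top_users(log, "book", 2010, 6, top_n=2))
```

Here "most active" simply means "most logged events"; you could just as well define it by logins or downloads, as long as you can explain the choice in the interview.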

The alternative is a website that just runs and you don't have any knowledge of how your users are behaving and what they are doing, which means you aren't able to focus on things that they find important.


I don't know exactly what kind of data you are trying to mine, but have you checked out Google Analytics? It is very easy to set up: once you register, all you need to do is include the JavaScript snippet it provides in your web pages. Google Analytics will give you plenty of statistics about your site and its visits. Is that what you need? The data it produces is also very easy to read, and I reckon it would be suitable to present.
