What are some challenging problems to adapt to the MapReduce framework?
I have developed around 20 MapReduce jobs, including the PageRank algorithm. I have not found any challenging problems online to adapt to the MapReduce framework. I would like to improve my skills. Is there such a guide?
What you are looking for is a data-intensive programming task. A similar question has already been posted on StackOverflow. I thought of suggesting this project because the Wikipedia corpus is easily available, but as you can see, it is already in progress.
Run a Squid reverse proxy and collect its logs over a period of time. Then use those logs to derive meaningful interpretations and store the results in a suitable database for querying. This could be a good project to do.
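To make the log idea concrete, here is a minimal sketch of one such job in plain Python, imitating the map, shuffle, and reduce phases in memory. It assumes Squid's native access.log layout (timestamp, elapsed, client, action/code, bytes, method, URL, ...); the function names `map_line`, `reduce_url`, and `run_job` are made up for illustration and are not part of Hadoop or Squid. If your proxy uses a custom logformat, the field positions will differ.

```python
from itertools import groupby
from operator import itemgetter


def map_line(line):
    """Map step: emit (url, bytes) for one access-log line.

    Assumes Squid's native log layout, where field 4 is the reply
    size in bytes and field 6 is the requested URL.
    """
    fields = line.split()
    if len(fields) < 7:
        return  # skip malformed or truncated lines
    try:
        size = int(fields[4])
    except ValueError:
        return  # skip lines whose size field is not a number
    yield fields[6], size


def reduce_url(url, sizes):
    """Reduce step: request count and total bytes for one URL."""
    sizes = list(sizes)
    return url, len(sizes), sum(sizes)


def run_job(lines):
    """Run map, then sort (a stand-in for the shuffle), then reduce."""
    pairs = sorted(kv for line in lines for kv in map_line(line))
    return [
        reduce_url(url, (size for _, size in group))
        for url, group in groupby(pairs, key=itemgetter(0))
    ]
```

Calling `run_job(open("access.log"))` would return one `(url, hits, total_bytes)` tuple per URL, which could then be inserted into whatever database you pick. On a real cluster the same two functions would map onto a mapper and a reducer, and the sort would be done by the framework.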