Sampling on Yahoo! Answers
I wonder what the best way is to sample, say, 1000 questions completely at random from Yahoo! Answers. I want complete randomness, ignoring category, date of posting, etc. Doing this manually may introduce bias, so could anyone give some suggestions, such as using the Yahoo! Answers API or something similar? Thanks a lot.
I do not know if this is a correct solution from a formal point of view, but I would use Yahoo BOSS search to retrieve 4000 questions and then randomly pick 1000 of them. Using a search engine lets you retrieve the most important (highly ranked/linked) questions. You can play around with the search queries to get questions of all kinds, from the most popular to the worst ones. There is also the Yahoo! Answers API, which provides search functionality, but I have not used it, so I cannot say how good it is.
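The "retrieve 4000, then randomly pick 1000" step could look roughly like the sketch below. It assumes the questions have already been fetched into a plain Python list by whatever means; the `sample_questions` helper and the placeholder `pool` are made up for illustration, and no real Yahoo API call is shown. Note that the draw is uniform only over the retrieved pool, so any bias in how the pool was collected carries over to the sample.

```python
import random


def sample_questions(pool, k=1000, seed=None):
    """Return k distinct items drawn uniformly at random from pool.

    `pool` is assumed to be a list of question IDs or URLs that were
    already retrieved (hypothetical input, not an API response format).
    """
    # Search results collected from several queries may overlap,
    # so drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(pool))
    if len(unique) < k:
        raise ValueError(f"only {len(unique)} unique questions, need {k}")
    # A dedicated generator lets a seed make the draw reproducible.
    rng = random.Random(seed)
    # random.sample draws without replacement.
    return rng.sample(unique, k)


# Placeholder pool standing in for the 4000 retrieved questions.
pool = [f"question-{i}" for i in range(4000)]
picked = sample_questions(pool, k=1000, seed=42)
print(len(picked))
```

Passing a fixed `seed` is optional; it only helps if you want to be able to reproduce the same sample later.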