
What is a good first-implementation for learning machine learning? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.

Want to improve this question? Update the question so it focuses on one problem only by editing this post.

Closed 8 years ago.


I find that new topics sink in best when I have an easy implementation to code up to get the idea. This is how I learned genetic algorithms and genetic programming. What would be some good introductory programs to write to get started with machine learning?

Preferably, any referenced resources should be accessible online so the community can benefit.


What language(s) will you develop in? If you are flexible, I recommend MATLAB, Python, and R as good candidates; these are some of the more common languages used to develop and evaluate ML algorithms. They facilitate rapid algorithm development and evaluation, data manipulation, and visualization. Most of the popular ML algorithms are also available as libraries (with source).

I'd start by focusing on basic classification and/or clustering exercises in R². It's easier to visualize, and it's usually sufficient for exploring issues in ML, like risk, class imbalance, noisy labels, online vs. offline training, etc. Create a data set from everyday life, or from a problem you are interested in. Or use a classic, like the Iris data set, so you can compare your progress against the published literature. You can find the Iris data set at:

  • http://en.wikipedia.org/wiki/Iris_flower_data_set , or
  • http://archive.ics.uci.edu/ml/datasets/Iris

One of its nice features is that it has one class, 'setosa', that is easily linearly separable from the others.
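If you go the Python route, here is a minimal sketch (assuming scikit-learn and matplotlib are installed) that plots two of the Iris features; the 'setosa' cluster sits cleanly apart from the other two classes:

    # Minimal sketch: visualize two Iris features to see that 'setosa'
    # is linearly separable from the other classes.
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris

    iris = load_iris()
    X, y = iris.data, iris.target

    # Petal length vs. petal width (columns 2 and 3) separate setosa cleanly.
    for class_index, class_name in enumerate(iris.target_names):
        mask = y == class_index
        plt.scatter(X[mask, 2], X[mask, 3], label=class_name)

    plt.xlabel(iris.feature_names[2])
    plt.ylabel(iris.feature_names[3])
    plt.legend()
    plt.show()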

Once you pick a couple of interesting data sets, begin by implementing some standard classifiers and examining their performance. This is a good short list of classifiers to learn:

  • k-nearest neighbors
  • linear discriminant analysis
  • decision trees (e.g., C4.5)
  • support vector machines (e.g., via LibSVM)
  • boosting (with stumps)
  • naive Bayes classifier

With the Iris data set and one of the languages I mentioned, you can quickly run a mini-study with any of these classifiers (minutes to hours, depending on your speed).
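For example, in Python a rough sketch of such a mini-study might look like the following (assuming scikit-learn is installed; the classifier choices mirror the list above):

    # Rough sketch of a mini-study: cross-validate several standard
    # classifiers on the Iris data set and compare their accuracy.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.svm import SVC
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)

    classifiers = {
        "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
        "LDA": LinearDiscriminantAnalysis(),
        "decision tree": DecisionTreeClassifier(),
        "SVM": SVC(),
        "boosted stumps": AdaBoostClassifier(),  # default base learner is a stump
        "naive Bayes": GaussianNB(),
    }

    for name, clf in classifiers.items():
        scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
        print(f"{name:20s} mean accuracy = {scores.mean():.3f}")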

Edit: You can Google "Iris data classification" to find lots of examples. Here is a classification demo from MathWorks using the Iris data set:

http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/classdemo.html


I think you could write a naive Bayes classifier for junk email (spam) filtering. You can get a lot of useful background from this book:

http://nlp.stanford.edu/IR-book/information-retrieval-book.html
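As a starting point, a from-scratch sketch in Python might look like this (the tiny training corpus is made up purely for illustration):

    # Minimal sketch of a multinomial naive Bayes spam filter with
    # add-one (Laplace) smoothing; the toy corpus below is illustrative only.
    import math
    from collections import Counter, defaultdict

    train = [
        ("win money now claim your prize", "spam"),
        ("cheap meds win big money", "spam"),
        ("meeting schedule for next week", "ham"),
        ("lunch with the project team tomorrow", "ham"),
    ]

    word_counts = defaultdict(Counter)   # word frequencies per class
    class_counts = Counter()             # number of documents per class
    vocab = set()
    for text, label in train:
        words = text.split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)

    def predict(text):
        """Return the class with the highest log posterior probability."""
        words = text.split()
        best_label, best_score = None, float("-inf")
        for label in class_counts:
            score = math.log(class_counts[label] / sum(class_counts.values()))  # log prior
            total = sum(word_counts[label].values())
            for w in words:
                # log likelihood with add-one smoothing over the vocabulary
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    print(predict("claim your free money"))  # -> "spam"
    print(predict("team meeting tomorrow"))  # -> "ham"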


A decision tree. It is frequently used for classification tasks and has many variants. Tom Mitchell's Machine Learning book is a good reference for implementing one.
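To give a flavor, here is a small Python sketch of the entropy and information-gain computation at the heart of ID3-style tree induction (the toy data is made up for illustration):

    # Sketch of entropy and information gain, the splitting criterion
    # used by ID3/C4.5-style decision-tree learners.
    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy of a list of class labels."""
        counts = Counter(labels)
        total = len(labels)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def information_gain(examples, attribute_index, labels):
        """Entropy reduction from splitting the examples on one attribute."""
        remainder = 0.0
        for value in set(ex[attribute_index] for ex in examples):
            subset = [lab for ex, lab in zip(examples, labels)
                      if ex[attribute_index] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    # Toy example: attribute 0 is "outlook", labels are "play tennis?"
    examples = [("sunny",), ("sunny",), ("overcast",), ("rain",)]
    labels = ["no", "no", "yes", "yes"]
    print(information_gain(examples, 0, labels))  # 1.0: the split separates the classes perfectly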


Neural nets may be the easiest thing to implement first, and they're thoroughly covered throughout the literature.
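The simplest possible starting point is a single perceptron; here is a minimal pure-Python sketch that learns the AND function with the perceptron update rule:

    # Minimal sketch: a single perceptron trained with the perceptron
    # learning rule to compute logical AND (no libraries needed).
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

    weights = [0.0, 0.0]
    bias = 0.0
    learning_rate = 0.1

    for epoch in range(20):
        for (x1, x2), target in data:
            activation = weights[0] * x1 + weights[1] * x2 + bias
            output = 1 if activation > 0 else 0
            error = target - output
            # Nudge the weights in the direction that reduces the error.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error

    for (x1, x2), target in data:
        output = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
        print(f"AND({x1}, {x2}) -> {output} (expected {target})")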


There are these things called books; are you familiar with those? When I was exploring AI two decades ago, there were many books. I guess now that the internet exists, books are archaic, but you can probably find some in an ancient library.
