Wondering if a Bayes classifier is the right approach?
I'm wondering if a Bayes classifier makes sense for an application where the same phrase, "served cold" (for example), is "good" when associated with some things (beer, soda) but "bad" when associated with other things (steak, pizza, burger).
Specifically, I'm wondering whether training a Bayes classifier that "beer cold" and "soda cold" are "good" cancels out training it that "steak served cold" and "burger served cold" are "bad".
Or, can Bayes (correctly) be trained that "served cold" might be "good" or "bad" depending on what it is associated with?
I found a lot of good info on Bayes, here and elsewhere, but was unable to determine whether it's suitable for this type of application, where the answer to a phrase being good or bad is "it depends".
A Naive Bayes classifier assumes independence between attributes. For example, assume you have the following data:
apple fruit red BAD
apple fruit green BAD
banana fruit yellow GOOD
tomato vegetable red GOOD
Independence means that the attributes (name, type, color) are independent; for example, that "apple" could appear with either "fruit" or "vegetable". In this case the attributes "name" and "type" are dependent, so a Naive Bayes classifier is too naive (it would likely classify "apple fruit yellow" as BAD because it's an apple AND it's a fruit -- but aren't all apples fruits?).
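To see how the independence assumption plays out numerically, here is a minimal sketch (the names `DATASET` and `classify` are my own, not from any library) of a Naive Bayes classifier with add-one smoothing over the toy data above:

```ruby
# Toy training data from above: each example is [name, type, color] plus a class.
DATASET = [
  [%w[apple fruit red],      :bad],
  [%w[apple fruit green],    :bad],
  [%w[banana fruit yellow],  :good],
  [%w[tomato vegetable red], :good]
]

# Naive Bayes: score each class as prior * product of per-attribute
# likelihoods, treating the attributes as independent given the class.
def classify(attrs)
  classes = DATASET.map(&:last).uniq
  scores = classes.map do |klass|
    examples = DATASET.select { |_, c| c == klass }
    prior = examples.size.to_f / DATASET.size
    likelihood = attrs.each_with_index.reduce(1.0) do |prod, (value, i)|
      vocab = DATASET.map { |ex, _| ex[i] }.uniq.size       # distinct values of attribute i
      count = examples.count { |ex, _| ex[i] == value }     # matches within this class
      prod * (count + 1.0) / (examples.size + vocab)        # add-one (Laplace) smoothing
    end
    [klass, prior * likelihood]
  end
  scores.max_by(&:last).first
end

classify(%w[apple fruit yellow])  # => :bad, dragged down by "apple" and "fruit"
```

Working through the numbers: BAD scores 0.5 * 3/5 * 3/4 * 1/5 = 0.045 while GOOD scores 0.5 * 1/5 * 1/2 * 2/5 = 0.02, so the yellow apple comes out BAD even though "yellow" only ever appeared with GOOD.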
To answer your original question, a Naive Bayes classifier assumes that the class (GOOD or BAD) depends upon each attribute independently, which isn't the case -- I like my pizza hot and my soda cold.
EDIT: If you're looking for a classifier that has some utility even though it can, in theory, produce numerous Type I and Type II errors, Naive Bayes is such a classifier. Naive Bayes is better than nothing, but there's measurable value in using a less naive classifier.
I wouldn't dismiss Bayes as quickly as Daniel suggests. The quality (performance, in math-speak) of a Bayes classifier depends above all on the amount and quality of training data, and on the assumptions you make when you develop your algorithm.
To give you a short example: if you feed it only {'beer cold' => :good, 'pizza cold' => :bad}, the word 'cold' won't actually affect classification. It will just decide that all beers are good and all pizzas are bad (see how smart it is? :))
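A minimal sketch of that situation (the names `PHRASES` and `score` are hypothetical, not from any library): with one 'cold' in each class, the smoothed likelihoods for 'cold' are identical, so only 'beer' and 'pizza' move the score.

```ruby
# The two-phrase training set from above.
PHRASES = { 'beer cold' => :good, 'pizza cold' => :bad }

# Bag-of-words counts for one class.
def word_counts(klass)
  PHRASES.select { |_, c| c == klass }.keys.flat_map(&:split).tally
end

# Unnormalized Naive Bayes likelihood of a phrase given a class,
# with add-one smoothing over the whole vocabulary.
def score(phrase, klass)
  counts = word_counts(klass)
  total  = counts.values.sum
  vocab  = PHRASES.keys.flat_map(&:split).uniq.size
  phrase.split.reduce(1.0) do |prod, word|
    prod * (counts.fetch(word, 0) + 1.0) / (total + vocab)
  end
end

score('cold', :good) == score('cold', :bad)                         # => true: 'cold' is neutral
score('beer served cold', :good) > score('beer served cold', :bad)  # => true: 'beer' decides it
```

In other words, with this training set 'cold' cancels itself out exactly, and the classifier is effectively keying on the nouns.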
Anyway, this answer is too short to explain everything in detail, so I'd recommend reading Paul Graham's essay on how he developed his spam filter -- note that he built his own algorithm based on Bayes rather than using an off-the-shelf classifier. In my (so far short) experience, it seems you're better off following him in developing a version of the algorithm specific to the problem at hand, so you have control over the various domain-specific assumptions.
You can follow my attempts (in ruby) here if you are interested: http://arubyguy.com/2011/03/03/bayes-classification-update/