Cross Entropy for Language modelling

I'm currently working on a classification task using language modelling. The first part of the project involved using n-gram language models together with C5.0 to classify documents. The final part of the project requires me to use cross entropy to model each class and classify test cases against these models.

Does anyone have experience in using cross entropy, or links to information about how to use a cross-entropy model for classifying data? Any information at all would be great! Thanks


You can find theoretical background on using cross-entropy with language models in various textbooks, e.g. "Speech and Language Processing" by Jurafsky & Martin, pages 116-118 in the 2nd edition. As for concrete usage: most language modeling tools do not measure cross-entropy directly but rather perplexity, which is the exponential of the cross-entropy. The perplexity, in turn, can be used to classify documents. See, e.g., the documentation for the 'evallm' command in SLM, the Carnegie Mellon University language modeling toolkit (http://www.speech.cs.cmu.edu/SLM/toolkit_documentation.html)
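To make the idea concrete, here is a minimal sketch (not tied to the SLM toolkit; all class names and toy corpora are made up for illustration) of classifying a document by computing its cross-entropy under a smoothed unigram model per class and picking the class with the lowest value, which is equivalent to picking the lowest perplexity:

```python
import math
from collections import Counter

def train_unigram(tokens, vocab):
    # Add-one (Laplace) smoothed unigram probabilities over a shared vocabulary.
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def cross_entropy(model, tokens):
    # H = -(1/N) * sum over tokens of log2 p(w); perplexity is then 2 ** H.
    return -sum(math.log2(model[w]) for w in tokens) / len(tokens)

def classify(doc_tokens, models):
    # Assign the class whose model gives the document the lowest
    # cross-entropy (equivalently, the lowest perplexity).
    return min(models, key=lambda c: cross_entropy(models[c], doc_tokens))

# Hypothetical toy training corpora for two classes.
sports = "ball goal team ball win team goal ball".split()
finance = "stock market price stock fund market price stock".split()
vocab = set(sports) | set(finance)

models = {"sports": train_unigram(sports, vocab),
          "finance": train_unigram(finance, vocab)}

test = "ball team goal".split()
label = classify(test, models)
print(label, 2 ** cross_entropy(models[label], test))
```

The same decision rule carries over to n-gram models: whichever class model "predicts" the test document best (lowest perplexity) wins.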

good luck :)
