I need a library for large-scale naïve Bayes, with millions of training examples and 100k+ binary features. It must be an online version (updatable after training). I also need top-k o
I need to generate a vector of unigrams, i.e. a vector of all the unique words which appear in a specific text field that I have stored as part of a broader JSON object in MongoDB.
I'm looking to build an algorithm that can join together sentence parts. So, for example, it would know that
What are the statistical engines that yield better results than the OpenNLP suite of tools, if any? What I'm looking for is an engine that picks keywords from texts and provides stemmi
I have been looking at the nlp tag on SO for the past couple of hours and am confident I did not miss anything, but if I did, please do point me to the question.
I would like to clarify the relationship between latent Dirichlet allocation (LDA) and the generic task of document clustering.
Closed. This question is opinion-based. It is not currently accepting answers. Want to improve this question? Update the question so it can be answered with facts and citation
Here is an online programming contest we are planning to have. What could be possible approaches to solving the same?
I have some ideas to do with natural language processing. I will need some grammars of the S -> NP VP
I'm having some trouble understanding the changes made to the coref resolver in the last version of the Stanford NLP tools.