Some languages, particularly Slavic languages, change the endings of people's names according to the grammatical context. (For those of you who know grammar or studied languages that do this to words
I know this is a long shot, but does anyone know of a dataset of English words that has stress information by syllable? Something as simple as the following would be fantastic:
I have a large dataset (c. 40G) that I want to use for some NLP (largely embarrassingly parallel) over a couple of computers in the lab, to which I do not have root access, and only 1G of
I have a data set with multiple layers of annotation over the underlying text, such as part-of-speech tags, chunks from a shallow parser, named entities, and others from various natural language processing (NL
I have over 1000 surveys, many of which contain open-ended replies. I would like to be able to 'parse' in all the words and get a ranking of the most used words (disregarding commo
I'm trying to analyze some UTF-8 encoded documents in a way that recognizes different language characters. For my approach to work I need to ignore non-language characters, such as control char
I am parsing using a pretty large grammar (1.1 GB, it's data-oriented parsing). The parser I use (bitpar) is said to be optimized for highly ambiguous grammars. I'm getting this error:
The problem: Given a set of hand-categorized strings (or a set of ordered vectors of strings), generate a categorize function to categorize more input. In my case, that data (or most of it) is not natu
I am working on a feature that applies (grammatical) language segmentation rules to Latin-based languages (currently English).
I don't have time to read or digest long intricate discussions on theoretical concepts around NLP (or go get my PhD). That said, I have read a few and it's a damn interesting field. The problem is I