English dictionary as txt or xml file with support of synonyms [closed]
Can someone point me to where I can download an English dictionary as a txt or xml file? I am building a simple app for myself and am looking for something that I could start using immediately without learning a complex API.
Support for synonyms would be great; that is, it should be easy to retrieve all the synonyms for a particular word.
It would be absolutely fantastic if the dictionary listed both the British and American spellings of words where they differ.
Even a small dictionary (a few thousand words) would be OK; I only need it for a small project.
I would even be willing to buy one if the price is reasonable and the dictionary is easy to use. Simple XML would be great.
Any pointers, please?
WordNet is what you want. It's big, containing over a hundred thousand entries, and it's freely available.
However, it's not stored as XML. To access the data, you'll want to use one of the existing WordNet APIs for your language of choice.
Using the APIs is generally pretty straightforward, so I don't think you have to worry much about "learning (a) complex API". For example, borrowing from the WordNet HOWTO for the Python-based Natural Language Toolkit (NLTK):
>>> from nltk.corpus import wordnet
>>>
>>> # Get all synsets for 'dog'
>>> # This is essentially all senses of the word in the db
>>> wordnet.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'),
Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'),
Synset('andiron.n.01'), Synset('chase.v.01')]
>>> # Get the definition and usage for the first synset
>>> # (in NLTK 3 and later, definition(), examples() and lemmas() are methods)
>>> wordnet.synset('dog.n.01').definition()
'a member of the genus Canis (probably descended from the common
wolf) that has been domesticated by man since prehistoric times;
occurs in many breeds'
>>> wordnet.synset('dog.n.01').examples()
['the dog barked all night']
>>> # Get antonyms for 'good'
>>> wordnet.synset('good.a.01').lemmas()[0].antonyms()
[Lemma('bad.a.01.bad')]
>>> # Get synonyms for the first noun sense of 'dog'
>>> wordnet.synset('dog.n.01').lemmas()
[Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'),
Lemma('dog.n.01.Canis_familiaris')]
>>> # Get synonyms for all senses of 'dog'
>>> for synset in wordnet.synsets('dog'): print(synset.lemmas())
[Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'),
Lemma('dog.n.01.Canis_familiaris')]
...
[Lemma('frank.n.02.frank'), Lemma('frank.n.02.frankfurter'),
...
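Since the question asks for all the synonyms of a word in one go, the per-sense lemma lists from the session above can be flattened into a single list. This is only a rough sketch: `lemma_names` and `synonyms` are helper names made up here, not part of NLTK, and the code assumes NLTK 3 or later, where `lemmas()` and `name()` are methods.

```python
def lemma_names(synsets):
    """Return the sorted, de-duplicated lemma names across the given synsets.

    Works on anything shaped like NLTK's Synset and Lemma objects, i.e.
    each synset has .lemmas() and each lemma has .name().
    """
    names = set()
    for synset in synsets:
        for lemma in synset.lemmas():
            # WordNet joins multi-word entries with underscores.
            names.add(lemma.name().replace("_", " "))
    return sorted(names)


def synonyms(word):
    """Look up `word` in WordNet via NLTK and return all of its synonyms."""
    # Imported here so lemma_names() stays usable without NLTK installed.
    from nltk.corpus import wordnet
    return lemma_names(wordnet.synsets(word))


# Example (needs NLTK plus its WordNet data): synonyms("dog")
```

Calling `synonyms("dog")` needs the WordNet corpus on disk, which NLTK can fetch with `nltk.download('wordnet')`. Note that this mixes all senses together, so the result for 'dog' will include words like 'frank' and 'pawl'; filter the synsets first if you only want one sense.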
While there is an American English bias in WordNet, it supports British spellings and usage. For example, you can look up 'colour' and one of the synsets for 'lift' is 'elevator.n.01'.
Notes on XML
If having the data represented as XML is essential, you could easily use one of the APIs to access the WordNet database and convert it into XML, e.g. see Thinking XML: Querying WordNet as XML.
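As a rough illustration of that conversion, here is a minimal sketch using only Python's standard library. It assumes you have already pulled the synonyms out through one of the APIs into a plain dict; the function name `thesaurus_to_xml` and the element and attribute names (`thesaurus`, `entry`, `word`, `synonym`) are invented for this example and are not a standard schema.

```python
import xml.etree.ElementTree as ET


def thesaurus_to_xml(entries):
    """Build an XML string from a {word: [synonym, ...]} mapping.

    The element and attribute names used here are made up for this
    sketch; rename them to whatever suits your app.
    """
    root = ET.Element("thesaurus")
    for word in sorted(entries):
        entry = ET.SubElement(root, "entry", {"word": word})
        for synonym in entries[word]:
            ET.SubElement(entry, "synonym").text = synonym
    return ET.tostring(root, encoding="unicode")


# Sample data taken from the 'dog.n.01' lemmas shown earlier.
sample = {"dog": ["domestic dog", "Canis familiaris"]}
xml_text = thesaurus_to_xml(sample)
print(xml_text)
```

The resulting string can be written to a file once and then read back with any XML parser, so the app itself never has to touch WordNet.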
I know this question is quite old, but I had trouble finding this as a txt file myself. If anyone is looking for a synonyms and antonyms database as a txt file, a simple yet very detailed one is at https://ia801407.us.archive.org/10/items/synonymsantonyms00ordwiala/synonymsantonyms00ordwiala_djvu.txt .
I have used Roget's thesaurus in the past. It has the synonymy information in plain text files. There is also some Java code to help you parse the text.
These pages provide links to a number of thesauri and lexical resources, some of which are freely downloadable.
http://www.w3.org/2001/sw/Europe/reports/thes/thes_links.html
http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/lex.html
Try WordNet.