How to get synonyms ordered by their occurrence probability from WordNet

I am searching WordNet for synonyms for a big list of words. The way I have done it, when a word has more than one synonym, the results are returned in alphabetical order. What I need is to have them ordered by their probability of occurrence, so that I can take just the top synonym.

I used the WordNet Prolog database and Syns2Index to convert it into a Lucene index for querying synonyms. Is there a way to get the synonyms ordered by probability with this setup, or should I use another approach?

Speed is not important; this synonym lookup will not be done online.


In case someone stumbles upon this thread, this was the way to go (at least for what I needed):

http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/impl/file/ReferenceSynset.html#getTagCount%28java.lang.String%29

The getTagCount method lets you find the most likely synset for every word. The remaining problem is that the synset with the highest probability can itself contain several words, but I guess there is no way to avoid this.
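
A minimal Java sketch of that selection, assuming each candidate synset has already been paired with the tag count reported for the looked-up word. SenseCandidate and TopSense are hypothetical stand-ins so the example runs without JAWS on the classpath, and the counts are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one synset of the looked-up word: its word forms
// plus the tag count that a getTagCount-style call would report for that word.
final class SenseCandidate {
    final List<String> wordForms;
    final int tagCount;

    SenseCandidate(List<String> wordForms, int tagCount) {
        this.wordForms = wordForms;
        this.tagCount = tagCount;
    }
}

final class TopSense {
    // Returns the candidate with the highest tag count, or null for an empty list.
    // On ties the earliest candidate wins.
    static SenseCandidate mostFrequent(List<SenseCandidate> candidates) {
        SenseCandidate best = null;
        for (SenseCandidate c : candidates) {
            if (best == null || c.tagCount > best.tagCount) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up counts, for illustration only.
        List<SenseCandidate> senses = Arrays.asList(
            new SenseCandidate(Arrays.asList("large", "big"), 12),
            new SenseCandidate(Arrays.asList("bad", "big"), 3));
        System.out.println(mostFrequent(senses).wordForms);
    }
}
```

The winning candidate may still carry several word forms, which is exactly the leftover ambiguity described above.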


I think you should add another step (given that speed is not important).

From the Lucene index, build another dictionary in which each word is mapped to a small object holding the single synonym whose meaning has the highest probability of appearance, along with that meaning and its probability. That is, given this code:

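import java.util.HashMap;
import java.util.Map;

class Synonym {
    public String name;         // the chosen synonym
    public double probability;  // probability of appearance of its meaning
    public String meaning;      // the meaning (sense) the synonym belongs to
}

// One entry per word, holding only its best synonym.
Map<String, Synonym> m = new HashMap<String, Synonym>();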

... you just have to fill it from the Lucene index.
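
Filling the map could look roughly like this. The SynonymTable class, its offer method, and the sample values are hypothetical; reading candidates out of the Lucene index and estimating their probabilities is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Same shape as the Synonym holder above, with a constructor for convenience.
final class Synonym {
    final String name;
    final double probability;
    final String meaning;

    Synonym(String name, double probability, String meaning) {
        this.name = name;
        this.probability = probability;
        this.meaning = meaning;
    }
}

final class SynonymTable {
    private final Map<String, Synonym> best = new HashMap<String, Synonym>();

    // Keeps the candidate only if it beats the entry currently stored for the word.
    void offer(String word, Synonym candidate) {
        Synonym current = best.get(word);
        if (current == null || candidate.probability > current.probability) {
            best.put(word, candidate);
        }
    }

    // Returns the single retained synonym, or null if the word was never offered.
    Synonym lookup(String word) {
        return best.get(word);
    }

    public static void main(String[] args) {
        SynonymTable table = new SynonymTable();
        // Made-up probabilities, for illustration only.
        table.offer("big", new Synonym("bad", 0.2, "intense or severe"));
        table.offer("big", new Synonym("large", 0.7, "above average in size"));
        System.out.println(table.lookup("big").name);
    }
}
```

Since the lookup is done offline, one pass over all candidates per word is enough and nothing needs to be sorted.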
