
Extracting synonymous terms from wordnet using synonyms()

Suppose I pull the synonyms of "help" from wordnet with the synonyms() function and get the following:

Str = synonyms("help")    
Str
[1] "c(\"aid\", \"assist\", \"assistance\", \"help\")"     
[2] "c(\"aid\", \"assistance\", \"help\")"                 
[3] "c(\"assistant\", \"helper\", \"help\", \"supporter\")"
[4] "c(\"avail\", \"help\", \"service\")"  

Then I can get a single character vector using

unique(unlist(lapply(parse(text=Str),eval)))

which looks like this:

[1] "aid"        "assist"     "assistance" "help"       "assistant"  "helper"     "supporter" 
[8] "avail"      "service"

The above process was suggested by Gabor Grothendieck. The solution works well, but I still can't figure out why changing the query term to "company", "boy", or some other word produces an error message.

One possible reason is that the sixth synonym of "company" (see below) is a single term and does not follow the format "c(\"company\")".

synonyms("company")

[1] "c(\"caller\", \"company\")"                                    
[2] "c(\"company\", \"companionship\", \"fellowship\", \"society\")"
[3] "c(\"company\", \"troupe\")"                                    
[4] "c(\"party\", \"company\")"                                     
[5] "c(\"ship's company\", \"company\")"                            
[6] "company"

Could someone kindly help me solve this problem? Many thanks.


You can solve this by creating a little helper function that uses R's try mechanism to catch errors. In this case, if the eval produces an error, then return the original string, else return the result of eval:

Create a helper function:

evalOrValue <- function(expr, ...){
  z <- try(eval(expr, ...), TRUE)
  if(inherits(z, "try-error")) as.character(expr) else unlist(z)
}

unique(unlist(sapply(parse(text=Str), evalOrValue)))

Produces:

[1] "caller"         "company"        "companionship" 
[4] "fellowship"     "society"        "troupe"        
[7] "party"          "ship's company"

I reproduced your data and used dput to post it here:

Str <- c("c(\"caller\", \"company\")", "c(\"company\", \"companionship\", \"fellowship\", \"society\")", 
"c(\"company\", \"troupe\")", "c(\"party\", \"company\")", "c(\"ship's company\", \"company\")", 
"company")
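
As an alternative sketch that avoids parse and eval entirely, the quoted terms can be pulled out with a regular expression. extractTerms is a hypothetical helper name, not part of wordnet, and it assumes that no term contains an escaped double quote. One reason to consider it: with the eval-based approach, a bare word that happens to name an existing R object (help, for example) evaluates without error and returns that object rather than the string.

```r
# Hypothetical helper (not part of wordnet): extract terms without eval().
# Assumption: terms never contain an escaped double quote.
extractTerms <- function(x) {
  # Find every double-quoted substring in each element, e.g. "\"caller\"".
  quoted <- regmatches(x, gregexpr('"[^"]*"', x))
  terms <- mapply(function(q, raw) {
    # No quoted substring: the element is already a bare term ("company").
    if (length(q) == 0L) raw else gsub('"', "", q, fixed = TRUE)
  }, quoted, x, SIMPLIFY = FALSE, USE.NAMES = FALSE)
  unique(unlist(terms))
}

Str <- c("c(\"caller\", \"company\")",
         "c(\"company\", \"companionship\", \"fellowship\", \"society\")",
         "c(\"company\", \"troupe\")", "c(\"party\", \"company\")",
         "c(\"ship's company\", \"company\")", "company")

extractTerms(Str)
```

Because nothing is evaluated, the result depends only on the text of each string and not on which objects exist in the workspace.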


Those synonyms are in a form that looks like an expression, so you should be able to parse them as you illustrated. However, when I execute your original code above, I get an error from the synonyms call because you included no part-of-speech argument.

> synonyms("help")
Error in charmatch(x, WN_synset_types) : 
  argument "pos" is missing, with no default

Observe that synonyms calls getSynonyms, and that getSynonyms already wraps its result in unique, so all of the pre-processing you are doing is no longer needed if you update:

> synonyms("company", "NOUN")
[1] "caller"         "companionship"  "company"       
[4] "fellowship"     "party"          "ship's company"
[7] "society"        "troupe"        
> synonyms
function (word, pos) 
{
    filter <- getTermFilter("ExactMatchFilter", word, TRUE)
    terms <- getIndexTerms(pos, 1L, filter)
    if (is.null(terms)) 
        character()
    else getSynonyms(terms[[1L]])
}
<environment: namespace:wordnet>

> getSynonyms
function (indexterm) 
{
    synsets <- .jcall(indexterm, "[Lcom/nexagis/jawbone/Synset;", 
        "getSynsets")
    sort(unique(unlist(lapply(synsets, getWord))))
}
<environment: namespace:wordnet>
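
Since the updated synonyms requires a part of speech, a small wrapper can query several of them and merge the results. This is only a sketch: allSynonyms is a hypothetical name, the lookup function is passed in as an argument so the example runs without a WordNet installation, and the four pos values are assumed to be the ones the installed wordnet version accepts. A stub stands in for synonyms here.

```r
# Hypothetical wrapper (not part of wordnet): merge synonyms across parts of speech.
# 'lookup' is any function(word, pos) returning a character vector,
# e.g. wordnet::synonyms when the package and a dictionary are available.
allSynonyms <- function(word, lookup,
                        pos = c("NOUN", "VERB", "ADJECTIVE", "ADVERB")) {
  found <- lapply(pos, function(p) {
    # Treat a failed lookup for one part of speech as "no synonyms".
    tryCatch(lookup(word, p), error = function(e) character())
  })
  sort(unique(as.character(unlist(found))))
}

# Stub used only for illustration; its return values are made up.
stubLookup <- function(word, pos) {
  switch(pos,
         NOUN = c("aid", "help"),
         VERB = c("assist", "help"),
         stop("no entry for this part of speech"))
}

allSynonyms("help", stubLookup)
# With the real package this would be: allSynonyms("help", wordnet::synonyms)
```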