Extracting synonymous terms from wordnet using synonyms()
Suppose I pull the synonyms of "help" with the synonyms() function from wordnet and get the following:
Str = synonyms("help")
Str
[1] "c(\"aid\", \"assist\", \"assistance\", \"help\")"
[2] "c(\"aid\", \"assistance\", \"help\")"
[3] "c(\"assistant\", \"helper\", \"help\", \"supporter\")"
[4] "c(\"avail\", \"help\", \"service\")"
Then I can get a single character vector of unique terms using
unique(unlist(lapply(parse(text=Str),eval)))
which ends up looking like this:
[1] "aid" "assist" "assistance" "help" "assistant" "helper" "supporter"
[8] "avail" "service"
The above process was suggested by Gabor Grothendieck. The solution is good, but I still can't figure out why changing the query term to "company", "boy", or some other word produces an error message.
One possible reason may be that the sixth synonym of "company" (please see below) is a single term and does not follow the format "c(\"company\")".
synonyms("company")
[1] "c(\"caller\", \"company\")"
[2] "c(\"company\", \"companionship\", \"fellowship\", \"society\")"
[3] "c(\"company\", \"troupe\")"
[4] "c(\"party\", \"company\")"
[5] "c(\"ship's company\", \"company\")"
[6] "company"
Could someone kindly help me solve this problem? Many thanks.
You can solve this by creating a small helper function that uses R's try mechanism to catch errors. In this case, if eval produces an error, return the original string; otherwise return the result of eval.
Create a helper function:
evalOrValue <- function(expr, ...){
z <- try(eval(expr, ...), TRUE)
if(inherits(z, "try-error")) as.character(expr) else unlist(z)
}
unique(unlist(sapply(parse(text=Str), evalOrValue)))
Produces:
[1] "caller" "company" "companionship"
[4] "fellowship" "society" "troupe"
[7] "party" "ship's company"
I reproduced your data and then used dput to reproduce it here:
Str <- c("c(\"caller\", \"company\")", "c(\"company\", \"companionship\", \"fellowship\", \"society\")",
"c(\"company\", \"troupe\")", "c(\"party\", \"company\")", "c(\"ship's company\", \"company\")",
"company")
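One caveat with evaluating the parsed strings: a bare word that happens to name an existing R object (for example "help") would evaluate to that object instead of raising an error, so try would not catch it. As a minimal sketch of an alternative that avoids eval altogether, the quoted terms can be pulled out with a regular expression. The helper name extractTerms is hypothetical and not part of wordnet, and it assumes the terms themselves contain no double quotes.

```r
# Same data as above: the sixth element is a bare word, not a c(...) call.
Str <- c("c(\"caller\", \"company\")",
         "c(\"company\", \"companionship\", \"fellowship\", \"society\")",
         "c(\"company\", \"troupe\")", "c(\"party\", \"company\")",
         "c(\"ship's company\", \"company\")", "company")

# Hypothetical helper: extract the double-quoted terms from each element.
# An element that contains no quoted terms is kept unchanged.
extractTerms <- function(x) {
  quoted <- regmatches(x, gregexpr("\"[^\"]*\"", x))
  terms <- Map(function(s, q) {
    if (length(q) > 0) gsub("\"", "", q, fixed = TRUE) else s
  }, x, quoted)
  unique(unlist(terms, use.names = FALSE))
}

result <- extractTerms(Str)
print(result)
```

Because nothing is evaluated, a bare element such as "help" is returned as the string "help" rather than as the help function.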
Those synonyms are in a form that looks like an expression, so you should be able to parse them as you illustrated. However, when I execute your original code above I get an error from the synonyms call, because you included no part-of-speech argument.
> synonyms("help")
Error in charmatch(x, WN_synset_types) :
argument "pos" is missing, with no default
Observe that the code of synonyms uses getSynonyms, and that its code has a unique wrapped around it, so all of the pre-processing you are doing is no longer needed (if you update):
> synonyms("company", "NOUN")
[1] "caller" "companionship" "company"
[4] "fellowship" "party" "ship's company"
[7] "society" "troupe"
> synonyms
function (word, pos)
{
filter <- getTermFilter("ExactMatchFilter", word, TRUE)
terms <- getIndexTerms(pos, 1L, filter)
if (is.null(terms))
character()
else getSynonyms(terms[[1L]])
}
<environment: namespace:wordnet>
> getSynonyms
function (indexterm)
{
synsets <- .jcall(indexterm, "[Lcom/nexagis/jawbone/Synset;",
"getSynsets")
sort(unique(unlist(lapply(synsets, getWord))))
}
<environment: namespace:wordnet>