Checking Multiple Conditions

I have a data frame and want to know if a certain string is present. Specifically, I want to know if any of the values in df[,1] contain anything from inscompany.

df = data.frame(company=c("KMart", "Shelter"), var2=c(5,7))
if( df[,1] == inscompany ) print("YES")
inscompany <- c("21st Century Auto Insurance", "AAA Auto Insurance", "AARP Auto Insurance",
        "Allstate Auto Insurance", "American Family Auto Insurance", "Eastwood Auto Insurance",
        "Erie Auto Insurance", "Farmers Auto Insurance", "GMAC Auto Insurance", "Hartford Auto Insurance",
        "Infinity Auto Insurance", "Mercury Auto Insurance", "Nationwide Auto Insurance", "Progressive Auto Insurance",
        "Shelter Insurance Company", "Titan Auto Insurance", "Travelers Auto Insurance", "USAA Auto Insurance")

I get an error message that it can only check the first value of inscompany to df[,1].

Help!


You want %in%. Here is an example:

R> chk <- c("A", "B", "Z")    # some text
R> chk %in% LETTERS[1:13]     # check for presence in first half of alphabet
[1]  TRUE  TRUE FALSE
R> 

The match() function is related; see the help page for details.
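
As a rough sketch of how %in% behaves on the data from the question: the inscompany vector below is an abbreviated subset chosen for illustration, and stringsAsFactors = FALSE is my assumption so that the column is plain character. Note that %in% compares whole strings, so "Shelter" is not counted as a match for "Shelter Insurance Company".

```r
# Abbreviated version of the question's data (subset chosen for illustration)
df <- data.frame(company = c("KMart", "Shelter"), var2 = c(5, 7),
                 stringsAsFactors = FALSE)
inscompany <- c("AAA Auto Insurance", "Shelter Insurance Company",
                "USAA Auto Insurance")

# Exact, whole-string membership test: one logical per row of df
exact <- df$company %in% inscompany

# if() needs a single TRUE/FALSE, so collapse the vector with any()
if (any(exact)) print("YES") else print("NO")
```

With this subset neither company name appears verbatim in inscompany, so the else branch runs.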


I think match and %in% won't work for partial matching, since both compare whole strings. grepl returns a logical (TRUE/FALSE) result for each target string, depending on whether it contains the pattern. I used ^ to anchor the match at the beginning of the string (you may not need that). any and sapply are needed to scale up to the many-to-many match. If you just want to know whether any of the strings match at all, wrap the whole thing in one more any.

 sapply(df$company,function(x) any(grepl(paste("^",x,sep=""),inscompany)))
[1] FALSE  TRUE
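
To get the single TRUE/FALSE that if() expects, the per-company result can be collapsed with an outer any. The following is only a sketch under a few assumptions: inscompany is an abbreviated subset of the question's vector, the column is created as character, and I swapped grepl for base R's startsWith (available since R 3.3.0), which does a literal prefix test and so avoids surprises if a company name contains regex metacharacters.

```r
# Abbreviated version of the question's data (subset chosen for illustration)
df <- data.frame(company = c("KMart", "Shelter"), var2 = c(5, 7),
                 stringsAsFactors = FALSE)
inscompany <- c("AAA Auto Insurance", "Shelter Insurance Company",
                "USAA Auto Insurance")

# For each company, does any entry of inscompany start with that name?
hits <- sapply(df$company,
               function(x) any(startsWith(inscompany, x)),
               USE.NAMES = FALSE)

# Collapse to one logical so it can be used directly in if()
any_hit <- any(hits)
if (any_hit) print("YES")
```

If a match anywhere in the string is wanted rather than only at the start, grepl(x, inscompany, fixed = TRUE) could be used in place of startsWith.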