
Subset rows based on values of columns of unknown names and number of columns

I am sure this is a very basic question, but I am frustrated after searching for a way to subset (get the row numbers of) a data frame or matrix that can have any number of columns, with column names that change all the time. I would like to find only the rows (indexes) of the data frame for which any of the columns is greater than 0. Since the column names and the number of columns are unknown, I do not know how to do this...

An example:

# these are the terms I am looking in
terms <- c("beats", "revs", "revenue", "earnings")
# dict <- Dictionary(terms)
# dictStudy <- inspect(DocumentTermMatrix(mydata.corpus.tmp, list(dictionary = dict)))

dictStudy <- data.frame(beats=c(0, 0, 0, 1, 0, 2), revs=c(0, 0, 0, 1, 0, 1), revenue=c(0, 0, 0, 0, 0, 0), earnings=c(1, 0, 0, 1, 0, 1)) 
ss <- expression(terms > 0)
dictStudy.matching <- subset(dictStudy, eval(ss))

I was hoping that expression and eval would save me, but I cannot figure this out.

How do I find only the rows in a data frame that have any column > 0?


I'm assuming you mean you want the rows where at least one element of that row is greater than zero (i.e. any of the columns are greater than zero).

> which(apply(dictStudy,1,function(x) any(x > 0)))
[1] 1 4 6
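
A vectorized variant of the same idea, as a minimal sketch: comparing the whole data frame with 0 gives a logical matrix, and rowSums counts the TRUE values per row. It assumes all columns are numeric; the names hits and idx are only illustrative.

```r
# Example data from the question
dictStudy <- data.frame(beats    = c(0, 0, 0, 1, 0, 2),
                        revs     = c(0, 0, 0, 1, 0, 1),
                        revenue  = c(0, 0, 0, 0, 0, 0),
                        earnings = c(1, 0, 0, 1, 0, 1))

# TRUE for every row with at least one value greater than 0
hits <- rowSums(dictStudy > 0) > 0

# Row numbers of the matching rows (unname drops any row-name labels)
idx <- unname(which(hits))

# The matching rows themselves; drop = FALSE keeps a data frame
dictStudy.matching <- dictStudy[hits, , drop = FALSE]
```

Here idx holds 1, 4 and 6, the same rows as the apply version. Note that rowSums returns NA for a row containing NA unless na.rm = TRUE is passed.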

As Tommy points out below, this assumes that all your columns are in fact numeric. You could sidestep this by subsetting your data frame to pull out only those columns that are numeric (drop = FALSE keeps the result a data frame even if only one numeric column remains):

> which(apply(dictStudy[, sapply(dictStudy, is.numeric), drop = FALSE], 1, function(x) any(x > 0)))
[1] 1 4 6
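
To see why the numeric filter matters, here is a small sketch with a hypothetical data frame (mixed, with an invented label column) that is not part of the question. Without the filter, apply would coerce every value to character and the comparison with 0 would become a string comparison, which is rarely what is intended.

```r
# Hypothetical data: one character column, one numeric column
mixed <- data.frame(label = c("a", "b", "c"),
                    beats = c(0, 1, 0),
                    stringsAsFactors = FALSE)

# Keep only numeric columns; drop = FALSE avoids collapsing the
# single remaining column into a plain vector, which apply() cannot handle
num <- mixed[, vapply(mixed, is.numeric, logical(1)), drop = FALSE]

# Row numbers where any numeric column is greater than 0
idx <- unname(which(apply(num, 1, function(x) any(x > 0))))
```

With this data only the second row has a positive value, so idx is 2.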