How to remove duplicated rows by a column in an R matrix

I am trying to remove rows of an R matrix that are duplicated in one column (e.g. the 1st column). How can I keep just one row per distinct value in that column? I've used

x_1 <- x[unique(x[,1]),]

While the size is correct, all of the values are NA. So instead, I tried

x_1 <- x[-duplicated(x[,1]),]

But the dimensions were incorrect.


I think you're confused about how subsetting works in R. unique(x[,1]) returns the set of unique values in the first column. If you then subset using those values, R treats them as row positions, not as values to match. So you're likely getting NAs because the values refer to rows that don't exist in the matrix.
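A small sketch of that effect, using a made-up matrix (the values are assumptions chosen for illustration, not the asker's data):

```r
# Hypothetical 3 x 2 matrix; the first column holds 1, 1, 2.
x <- matrix(c(1, 1, 2,
              10, 20, 30), nrow = 3)

vals <- unique(x[, 1])   # the distinct values: 1 and 2
by_value <- x[vals, ]    # selects rows 1 and 2 BY POSITION, not by value

# Both selected rows still carry the duplicated value 1 in column 1.
print(by_value)

# With a data frame, a position past the last row gives a row of NAs.
df <- as.data.frame(x)
out_of_range <- df[5, ]
print(out_of_range)
```

On a plain matrix, an out-of-range row number raises a "subscript out of bounds" error, so the NAs in the question may mean x was really a data frame. That is a guess, since the original data isn't shown.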

Your other attempt runs afoul of the fact that duplicated returns a logical (boolean) vector, not a vector of indices. Putting a minus sign in front of it converts it to a vector of 0's and -1's, which R again interprets as row indices: the zeros are ignored and the -1 drops the first row, so the result has the wrong dimensions.
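To make the difference concrete, here is a minimal sketch on a made-up matrix (the values are assumptions for illustration only):

```r
# Hypothetical 4 x 2 matrix; the first column holds 1, 1, 2, 2.
x <- matrix(c(1, 1, 2, 2,
              10, 20, 30, 40), nrow = 4)

dup <- duplicated(x[, 1])   # FALSE TRUE FALSE TRUE
neg <- -dup                 # unary minus coerces to integer: 0 -1 0 -1

wrong <- x[neg, ]    # zeros are ignored, -1 drops row 1: rows 2, 3, 4 remain
right <- x[!dup, ]   # logical negation keeps rows 1 and 3

print(dim(wrong))
print(dim(right))
```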

Try replacing the '-' in front of duplicated with '!', the logical negation operator. Something like this:

m <- matrix(runif(100),10,10)
m[c(2,5,9),1] <- 1
m[!duplicated(m[,1]),]


As you need the indices of the unique rows, use duplicated as you tried. The problem was using - instead of !, so try:

x[!duplicated(x[,1]),]
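
One edge case worth keeping in mind: if only a single row survives, R drops the result to a plain vector unless drop = FALSE is given. The sketch below wraps the idea in a helper; the name unique_by_column and the example values are hypothetical, not part of the original answers.

```r
# Hypothetical helper: keep one row per distinct value in column `col`.
# from_last = TRUE keeps the last occurrence instead of the first.
unique_by_column <- function(x, col = 1, from_last = FALSE) {
  keep <- !duplicated(x[, col], fromLast = from_last)
  x[keep, , drop = FALSE]   # drop = FALSE keeps the result a matrix
}

# Made-up matrix whose first column is all 5s.
x <- matrix(c(5, 5, 5,
              1, 2, 3), nrow = 3)

as_vector <- x[!duplicated(x[, 1]), ]             # collapses to a vector
first_row <- unique_by_column(x)                  # 1 x 2 matrix, first occurrence
last_row  <- unique_by_column(x, from_last = TRUE) # 1 x 2 matrix, last occurrence

print(first_row)
print(last_row)
```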