Randomly selecting values from an existing matrix after adding a vector (in R)

2023-03-23 06:01 Q&A author:

Thank you so much for your help in advance!

I am trying to modify an existing matrix so that, when a new row is added to the matrix, values are removed from the rows that were already there.

For example, I have the matrix:

     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    1    1

I want to add another vector, I.vec, which contains two 1s (I.vec=c(0,1,1,0)). This is easy enough to do: I just rbind it to the matrix. Now, for every column where I.vec is equal to 1, I want to randomly select a value from the other rows and make it zero. Ideally, this would end up with a matrix like:

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    1    0    0    0
[4,]    0    0    1    1
[5,]    0    1    1    0

But each time I run the iteration, I want it to randomly sample again.

So this is what I have tried:

mat1<-matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1),byrow=T, nrow=4)
I.vec<-c(0,1,1,0)
mat.I<-rbind(mat1,I.vec)
mat.I.r<-mat.I
d1<-mat.I[,which(mat.I[5,]==1)]
mat.I.r[sample(which(d1[1:4]==1),1),which(mat.I[5,]==1)]<-0

But this only deletes one of the two values I would like to delete. I have also tried variations on subsetting the matrix, but I have not been successful.

Thank you again!

There is a little bit of ambiguity in the description from the OP, so two solutions are suggested:

Assuming that only existing `1`s in relevant columns can be set to `0`

I'll just alter the original function (see below). The only change is to the line defining rows, which now reads as follows (the original version had a bug, which this revision fixes):

rows <- sapply(seq_along(cols), 
               function(x, mat, cols) {
                   ones <- which(mat[,cols[x]] == 1L)
                   out <- if(length(ones) == 1L) {
                              ones
                          } else {
                              sample(ones, 1)
                          }
                   out
               }, mat = mat, cols = cols)

Basically, for each column in which we need to swap a 1 for a 0, this works out which rows of the column contain 1s and samples one of them.

Edit: We have to handle the case where there is only a single 1 in a column. If we just sample from a length 1 vector n, R's sample() treats it as a request to sample from seq_len(n), not from the one-element set containing n. We now handle this with an if/else statement.
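The length 1 behaviour described above is easy to check directly. The following is a minimal sketch; the helper name safe_sample1 is hypothetical and not part of the original answer, and it assumes its argument has at least one element.

```r
## sample() on a single number n draws from 1:n, not from the set {n}
set.seed(1)
draws <- replicate(200, sample(4, 1))
range(draws)            # values lie within 1:4, they are not all 4

## hypothetical helper: sample a position and index into x,
## so a length 1 x is returned unchanged
safe_sample1 <- function(x) x[sample.int(length(x), 1)]

safe_sample1(4)         # always 4
safe_sample1(c(3, 4))   # 3 or 4, chosen at random
```

Indexing by a sampled position is an alternative to the if/else guard; either way the point is that a single candidate row must never reach sample() as a bare number.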

We have to do this individually for each column so we get the correct rows. I suppose we could do some nice manipulation to avoid repeated calls to which() and sample(), but how escapes me at the moment, because we do have to handle the case where there is only one 1 in the column. Here's the finished function (updated to handle the length 1 sample bug in the original):

foo <- function(mat, vec) {
    nr <- nrow(mat)
    nc <- ncol(mat)

    cols <- which(vec == 1L)
    rows <- sapply(seq_along(cols), 
                   function(x, mat, cols) {
                       ones <- which(mat[,cols[x]] == 1L)
                       out <- if(length(ones) == 1L) {
                                  ones
                              } else {
                                  sample(ones, 1)
                              }
                       out
                   }, mat = mat, cols = cols)

    ind <- (nr*(cols-1)) + rows
    mat[ind] <- 0

    mat <- rbind(mat, vec)
    rownames(mat) <- NULL

    mat
}

and here it is in action (mat1 and ivec are defined in the original answer below):

> set.seed(2)
> foo(mat1, ivec)
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    1    1    0

and it works when there is only one 1 in a column we want to do a swap in (column 4 here, whose single 1 is in row 4 and is therefore always the one set to 0):

> foo(mat1, c(0,0,1,1))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    0
[5,]    0    0    1    1
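The function above calls which() and sample() once per selected column. One possible way to avoid that, which is not part of the original answer, is base R's max.col() with ties.method = "random". The sketch below assumes the matrix holds only 0s and 1s; the name foo_maxcol is hypothetical.

```r
mat1 <- matrix(c(1,1,0,0, 0,1,0,0, 1,0,1,0, 0,0,1,1), byrow = TRUE, nrow = 4)
ivec <- c(0, 1, 1, 0)

## hypothetical variant of foo(): one max.col() call replaces the sapply() loop
foo_maxcol <- function(mat, vec) {
    cols <- which(vec == 1L)
    if (length(cols) > 0L) {
        ## each row of sel is one selected column of mat
        sel <- t(mat[, cols, drop = FALSE])
        ## in a 0/1 matrix the row maximum is 1 and ties are broken at random,
        ## so this picks, per selected column, a random row of mat holding a 1
        ## (an all-zero column yields some row holding 0, which is harmless)
        rows <- max.col(sel, ties.method = "random")
        mat[cbind(rows, cols)] <- 0
    }
    mat <- rbind(mat, vec)
    rownames(mat) <- NULL
    mat
}

foo_maxcol(mat1, ivec)
```

Because max.col() returns a position rather than passing candidates to sample(), a column with a single 1 needs no special case here.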

Original Answer: Assuming any value in a relevant column can be set to zero

Here is a vectorised answer, where we treat the matrix as a vector when doing the replacement. Using the example data:

mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
ivec <- c(0,1,1,0)

## Set a seed to make reproducible
set.seed(2)

## number of rows and columns of our matrix
nr <- nrow(mat1)
nc <- ncol(mat1)

## which of ivec are 1L
cols <- which(ivec == 1L)

## sample length(cols) row indices, with replacement
## so same row can be drawn more than once
rows <- sample(seq_len(nr), length(cols), replace = TRUE)

## Compute the index of each rows cols combination
## if we treated mat1 as a vector
ind <- (nr*(cols-1)) + rows
## ind should be of length length(cols)

## copy for illustration
mat2 <- mat1

## replace the indices we want with 0, note sub-setting as a vector
mat2[ind] <- 0

## bind on ivec
mat2 <- rbind(mat2, ivec)

This gives us:

> mat2
     [,1] [,2] [,3] [,4]
        1    0    0    0
        0    1    0    0
        1    0    0    0
        0    0    1    1
ivec    0    1    1    0
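The index arithmetic used above, (nr*(cols-1)) + rows, relies on R storing matrices column by column. As a small check, the sketch below fixes the row choices at 1 and 3 (the rows zeroed in the output just shown) instead of sampling them, and compares the linear index with the equivalent two-column (row, column) index matrix.

```r
mat1 <- matrix(c(1,1,0,0, 0,1,0,0, 1,0,1,0, 0,0,1,1), byrow = TRUE, nrow = 4)
nr   <- nrow(mat1)
cols <- c(2, 3)          # columns where ivec is 1
rows <- c(1, 3)          # row choices fixed here for illustration only

## linear (vector) index of each (row, col) pair in column-major storage
ind <- (nr * (cols - 1)) + rows
ind                      # 5 11

## the same cells addressed with a two-column (row, col) index matrix
identical(mat1[ind], mat1[cbind(rows, cols)])   # TRUE
```

Either form can be used on the left-hand side of an assignment; the cbind() form avoids having to carry nr around.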

If I were doing this more than once or twice, I'd wrap this in a function:

foo <- function(mat, vec) {
    nr <- nrow(mat)
    nc <- ncol(mat)

    cols <- which(vec == 1L)
    rows <- sample(seq_len(nr), length(cols), replace = TRUE)

    ind <- (nr*(cols-1)) + rows
    mat[ind] <- 0

    mat <- rbind(mat, vec)
    rownames(mat) <- NULL

    mat
}

Which gives:

> foo(mat1, ivec)
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    1    1    0

If you wanted to do this for multiple ivecs, growing mat1 each time, then you probably don't want to do that in a loop as growing objects is slow (it involves copies etc). But you could just modify the definition of ind to include the extra n rows you bind on for the n ivecs.
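One way to read that suggestion is to allocate the final matrix once and fill in the new rows, so nothing grows inside the loop. The sketch below is only an illustration under that reading: foo_many and its vecs argument (a list of 0/1 vectors) are hypothetical names, and it keeps the assumption of this answer that any value in a relevant column may be set to zero.

```r
mat1 <- matrix(c(1,1,0,0, 0,1,0,0, 1,0,1,0, 0,0,1,1), byrow = TRUE, nrow = 4)

## hypothetical helper: apply several vectors in turn without repeated rbind()
foo_many <- function(mat, vecs) {
    nr0 <- nrow(mat)
    n   <- length(vecs)
    ## allocate the final matrix once; the extra rows start as zeros
    out <- rbind(mat, matrix(0, nrow = n, ncol = ncol(mat)))
    nr  <- nrow(out)                   # total rows, used for linear indexing
    for (k in seq_len(n)) {
        cols   <- which(vecs[[k]] == 1L)
        filled <- nr0 + k - 1L         # rows that already hold data
        ## one row per selected column, drawn only from the filled rows
        rows   <- sample.int(filled, length(cols), replace = TRUE)
        out[(nr * (cols - 1)) + rows] <- 0
        out[nr0 + k, ] <- vecs[[k]]
    }
    out
}

foo_many(mat1, list(c(0, 1, 1, 0), c(1, 0, 0, 1)))
```

The loop over the vectors remains, because each draw depends on the rows added before it, but the matrix itself is created only once.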

You could try something like this. Having 'nrow' in there will allow you to run it multiple times with other 'I.vec's. I tried to do this in a single line with 'apply' but couldn't get a matrix to come out again.

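mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
I.vec <- c(0,1,1,0)
mat.I.r <- rbind(mat1, I.vec)
last <- nrow(mat.I.r)

for (i in seq_len(ncol(mat.I.r))) {
  ## only touch columns where the newly added row holds a 1
  if (mat.I.r[last, i] == 1) {
    ones <- which(mat.I.r[seq_len(last - 1), i] == 1)
    ## index into ones so that a single candidate row is not passed to
    ## sample() as a number n (which would draw from 1:n); skip empty columns
    if (length(ones) > 0) mat.I.r[ones[sample.int(length(ones), 1)], i] <- 0
  }
}
mat.I.r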
