vector binding in R

2023-01-13 19:10 问答作者：

I would like to implement a simulation program, which requires the following structure:

It has a for loop, the program will generate an vector in each iteration. I need each generated vector is appended 开发者_高级运维to the existing vector.

I do not how how to do this in R. Thanks for the help.

These answers work, but they all require a call to a non-deterministic function like sample() in the loop. This is not loop-invariant code (it is random each time), but it can still be moved out of the for loop. The trick is to use the n argument and generate all the random numbers you need beforehand (if your problem allows this; some may not, but many do). Now you make one call rather than n calls, which matters if your n is large. Here is a quick example random walk (but many problems can be phrased this way). Also, full disclosure: I haven't had any coffee today, so please point out if you see an error :-)

steps <- 30
n <- 100
directions <- c(-1, 1)

results <- vector('list', n)

for (i in seq_len(n)) {
  walk <- numeric(steps)
  for (s in seq_len(steps)) {
    walk[s] <- sample(directions, 1)
  }
  results[[i]] <- sum(walk)
}

We can rewrite this with one call to sample():

all.steps <- sample(directions, n*steps, replace=TRUE)
dim(all.steps) <- c(n, steps)
walks <- apply(all.steps, 1, sum)

Proof of speed increase (n=10000):

> system.time({
+ for (i in seq_len(n)) {
+   walk <- numeric(steps)
+   for (s in seq_len(steps)) {
+     walk[s] <- sample(directions, 1)
+   }
+   results[[i]] <- sum(walk)
+ }})
   user  system elapsed 
  4.231   0.332   4.758 

> system.time({
+ all.steps <- sample(directions, n*steps, replace=TRUE)
+ dim(all.steps) <- c(n, steps)
+ walks <- apply(all.steps, 1, sum)
+ })
   user  system elapsed 
  0.010   0.001   0.012

If your simulation needs just one random variable per simulation function call, use sapply(), or better yet the multicore package's mclapply(). Revolution Analytics's foreach package may be of use here too. Also, JD Long has a great presentation and post about simulating stuff in R on Hadoop via Amazon's EMR here (I can't find the video, but I'm sure someone will know).

Take home points:

Preallocate with numeric(n) or vector('list', n)
Push invariant code out of for loops. Cleverly push stochastic functions out of code with their n argument.
Try hard for sapply() or lapply(), or better yet mclapply.
Don't use x <- c(x, rnorm(100)). Every time you do this, a member of R-core kills a puppy.

Probably the best thing you can do is preallocate a list of length n (n is number of iterations) and flatten out the list after you're done.

n <- 10
start <- vector("list", n)
for (i in 1:n) {
  a[[i]] <- sample(10)
}
start <- unlist(start)

You could do it the old nasty way. This may be slow for larger vectors.

start <- c()
for (i in 1:n) {
  add <- sample(10)
  start <- c(start, add)
}

x <- rnorm(100)
for (i in 100) {
   x <- c(x, rnorm(100))
}

This link should be useful: http://www.milbo.users.sonic.net/ra/

Assuming your simulation function -- call it func -- returns a vector with the same length each time, you can store the results in the columns of a pre-allocated matrix:

sim1 <- function(reps, func) {
  first <- func()
  result <- matrix(first, nrow=length(first), ncol=reps)
  for (i in seq.int(from=2, to=reps - 1)) {
    result[, i] <- func()
  }
  return(as.vector(result))
}

Or you could express it as follows using replicate:

sim2 <- function(reps, func) {
  return(as.vector(replicate(reps, func(), simplify=TRUE)))
}

> sim2(3, function() 1:3)
[1] 1 2 3 1 2 3 1 2 3

继续阅读：r

vector binding in R

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？