For-loop vs while loop in R

Posted: 2023-01-25 06:53

I have noticed a curious thing whilst working in R. When I have a simple program that computes squares from 1 to N implemented using for-loop and while-loop the behaviour is not the same. (I don't care about vectorisation in this case or apply functions).

fn1 <- function (N) 
{
    for(i in 1:N) {
        y <- i*i
    }
}

AND

fn2 <- function (N) 
{
    i=1
    while(i <= N) {
        y <- i*i
        i <- i + 1
    }
}

The results are:

system.time(fn1(60000))
   user  system elapsed 
  2.500   0.012   2.493 
There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: In i * i : NAs produced by integer overflow
.
.
.

system.time(fn2(60000))
   user  system elapsed 
  0.138   0.000   0.137

Now, we know that a for-loop is supposed to be faster (my guess is because of pre-allocation and optimisations there), yet here it is much slower. And why does it overflow?

UPDATE: So now trying another way with vectors:

fn3 <- function (N) 
{
    i <- 1:N
    y <- i*i
}
system.time(fn3(60000))
   user  system elapsed 
  0.008   0.000   0.009 
Warning message:
In i * i : NAs produced by integer overflow

So perhaps it's a funky memory issue? I am running on OS X with 4 GB of memory and all default settings in R. This happens in both the 32- and 64-bit versions (except that the times are faster).

Alex

Because 1 is numeric but not integer (i.e. it's a floating-point number), whereas 1:60000 is numeric and integer.

> print(class(1))
[1] "numeric"
> print(class(1:60000))
[1] "integer"

60000 squared is 3.6 billion, which is NOT representable in a signed 32-bit integer (the maximum is 2,147,483,647), hence you get NA and an overflow warning:

> as.integer(60000)*as.integer(60000)
[1] NA
Warning message:
In as.integer(60000) * as.integer(60000) : NAs produced by integer overflow
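
As a quick check of the arithmetic above, here is a minimal sketch that compares 60000 squared against R's largest integer, `.Machine$integer.max`, and locates the first value whose square no longer fits. The variable names are purely illustrative.

```r
# Largest value an R integer can hold (2^31 - 1)
int_max <- .Machine$integer.max

# 60000^2 is computed in double precision here and exceeds that limit
exceeds <- 60000^2 > int_max

# Largest whole number whose square still fits, and the first one that does not
last_ok <- floor(sqrt(int_max))
first_bad <- last_ok + 1

# Integer multiplication on either side of the boundary:
# the first product fits, the second overflows to NA (warning suppressed here)
ok_square <- as.integer(last_ok) * as.integer(last_ok)
bad_square <- suppressWarnings(as.integer(first_bad) * as.integer(first_bad))
```

So the loop in the question should start producing NA well before it reaches 60000, which is consistent with the "50 or more warnings" message.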

3.6 billion is easily representable in floating point, however:

> as.single(60000)*as.single(60000)
[1] 3.6e+09

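To fix your for-loop code, convert the sequence to a floating-point representation:

fn1 <- function (N)
{
    # as.single() returns doubles in R (it only sets a "Csingle" attribute),
    # so i*i is computed in floating point and does not overflow
    for(i in as.single(1:N)) {
        y <- i*i
    }
}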

The variable in the for loop takes its values from an integer sequence, and so eventually you do this:

> y=as.integer(60000)*as.integer(60000)
Warning message:
In as.integer(60000) * as.integer(60000) : NAs produced by integer overflow

whereas in the while loop you are creating a floating point number.
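
The type difference can be inspected directly with `typeof()`. The sketch below uses throwaway variables (`i`, `j`, `k`) that are not part of the original code.

```r
# A loop variable drawn from 1:N is an integer
for (i in 1:3) {}
for_type <- typeof(i)

# The literal 1 is a double, so a counter started with it stays a double
j <- 1
j <- j + 1
while_type <- typeof(j)

# Literals written with an L suffix are integers, so this counter stays an integer
k <- 1L
k <- k + 1L
suffix_type <- typeof(k)
```

With the `L` suffix the while-loop counter would be an integer too, so `i*i` would overflow there in the same way.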

It's also the reason these two things are different, even though they print the same:

> seq(0,2,1)
[1] 0 1 2
> seq(0,2)
[1] 0 1 2

Don't believe me?

> identical(seq(0,2),seq(0,2,1))
[1] FALSE

because:

> is.integer(seq(0,2))
[1] TRUE
> is.integer(seq(0,2,1))
[1] FALSE
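
If you only care about the values and not the storage type, a comparison along these lines may be more appropriate. This is a sketch with illustrative variable names: `all.equal()` compares numerically, while `identical()` also requires the types to match.

```r
a <- seq(0, 2)      # integer vector
b <- seq(0, 2, 1)   # double vector

# Strict comparison fails because the types differ
same_strict <- identical(a, b)

# Numeric comparison treats the two as equal
same_values <- isTRUE(all.equal(a, b))

# After converting the integer vector to double, even identical() agrees
same_after_cast <- identical(as.numeric(a), b)
```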

And about timing:

fn1 <- function (N) {
    for(i in as.numeric(1:N)) { y <- i*i }
}
fn2 <- function (N) {
    i=1
    while (i <= N) {
        y <- i*i
        i <- i + 1
    }
}

system.time(fn1(60000))
# user  system elapsed 
# 0.06    0.00    0.07 
system.time(fn2(60000))
# user  system elapsed 
# 0.12    0.00    0.13

So the for-loop really is faster than the while-loop once it no longer overflows. You cannot ignore warnings during timing: the original fn1 spent its time generating them.
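
To get a feel for how many warnings the integer version from the question raises, one option is to count them with `withCallingHandlers()`. This is only a sketch: `fn1_int` and `n_warnings` are names made up for the illustration, and it counts warnings rather than measuring their cost.

```r
# Integer version of the loop, as in the question
fn1_int <- function(N) {
    for (i in 1:N) {
        y <- i * i
    }
}

# Count every warning raised during the call, then muffle it
n_warnings <- 0
withCallingHandlers(
    fn1_int(60000),
    warning = function(w) {
        n_warnings <<- n_warnings + 1
        invokeRestart("muffleWarning")
    }
)
```

Every iteration from 46341 up to 60000 overflows, so `n_warnings` should end up at 13660.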
