
help with rle command

I'm having some trouble with an rle command that is designed to find the point at which participants reach 8 contiguous ones in a row.

For example, if:

x <- c(0,1,0,1,1,1,1,1,1,1,1,1)

I want to return a value of 11, the position of the eighth consecutive one.

Thanks to DWin, I've been using this piece of code:

which( rle(x)$values==1 & rle(x)$lengths >= 8)
sum(rle(x)$lengths[ 1:(min(which(rle(x)$lengths >= 8))-1) ]) + 8

I've been using this code successfully to process my data. However, I noticed that it made a mistake when processing one of my data files.

For example, if

 x <- c(1,1,1,1,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

the code returns 19, which is the point at which eight contiguous zeros are reached. I'm not sure what is going wrong or how to fix it.

Thanks in advance for your help.

Will


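The second line only tests the run lengths, so it also matches long runs of zeros. You need to paste the first line of code in its entirety into the second, so that the value test and the length test are applied together: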

sum(rle(x)$lengths[ 1:(min(which( rle(x)$values==1 & rle(x)$lengths >= 8))-1) ]) + 8
[1] 39
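
To see where the 19 came from, it may help to look at the rle output directly. The following is only a sketch: x is rebuilt from the run lengths of the second example in the question, and the names r, wrong and right are purely illustrative.

```r
# The second x from the question, rebuilt from its runs
x <- rep(c(1, 0, 1, 0, 1, 0, 1, 0, 1),
         times = c(4, 2, 5, 11, 2, 2, 3, 2, 17))

r <- rle(x)
r$lengths   # 4 2 5 11 2 2 3 2 17
r$values    # 1 0 1 0 1 0 1 0 1

# Length test only: the first run of 8 or more is run 4, a run of zeros
first_long <- min(which(r$lengths >= 8))
wrong <- sum(r$lengths[1:(first_long - 1)]) + 8        # 19

# Length test and value test together: run 9, the run of seventeen ones
first_long_ones <- min(which(r$values == 1 & r$lengths >= 8))
right <- sum(r$lengths[1:(first_long_ones - 1)]) + 8   # 39
```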

However, here is another approach, using the function filter. This yields the same result in what I consider to be much more readable code:

which(filter(x, rep(1/8, 8), sides=1) == 1)[1]
[1] 39

The filter function when used in this way essentially computes a moving average over a block of 8 values in the vector. I then return the position of the first value where the moving average equals 1.
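
As a rough illustration of the moving-average idea, here is a small sketch using the first example vector from the question. The names ma, sums and first_hit are made up for this example, and stats::filter is spelled out on the assumption that another package (dplyr, for instance) might mask filter.

```r
x <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)

# Average of the current value and the 7 values before it (sides = 1)
ma <- stats::filter(x, rep(1/8, 8), sides = 1)

# The first 7 positions are NA because no full window exists yet;
# which() skips those NAs
first_hit <- which(ma == 1)[1]
first_hit   # 11

# Variant with weights of 1: compare the window sum to the window length
sums <- stats::filter(x, rep(1, 8), sides = 1)
which(sums == 8)[1]   # 11
```

The equality test on the average works here because 1/8 is exactly representable in binary floating point. For window lengths that are not powers of two, the sum-based variant is probably the safer choice.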


In the basic programming course I teach, I advise students to give proper names to subresults, and to inspect these subresults:

lengthOfrepeatsOfAnything<-rle(x)$lengths
#4  2  5 11  2  2  3  2 17
whichRepeatsAreOfOnes<-rle(x)$values==1
#TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE (i.e. runs 1 3 5 7 9 are runs of ones)
repeatsOfOnesLength<-lengthOfrepeatsOfAnything * whichRepeatsAreOfOnes #TRUE = 1, FALSE=0
#4  0  5  0  2  0  3  0 17
whichRepeatOfOneAreLongerThanEight<-which(repeatsOfOnesLength >= 8)
#9
result<-NA
if(length(whichRepeatOfOneAreLongerThanEight)>0){
    firstRepeatOfOneAreLongerThanEight<-whichRepeatOfOneAreLongerThanEight[1]
    #9
    if(firstRepeatOfOneAreLongerThanEight==1){
        result<-8
    }
    else{
        repeatsBeforeFirstEightOnes<-1:(firstRepeatOfOneAreLongerThanEight-1)
        #1 2 3 4 5 6 7 8
        lengthsOfRepeatsBeforeFirstEightOnes<-lengthOfrepeatsOfAnything[repeatsBeforeFirstEightOnes]
        #4  2  5 11  2  2  3  2
        result<-sum(lengthsOfRepeatsBeforeFirstEightOnes) + 8
    }
}

I know it doesn't look as dandy as a one-line solution, but it helps to make things clear and to pick up errors. Besides, what if you look back at this code in 4 months? Which one will be easier to understand again?


My advice would be to break the code up into simpler pieces. As suggested by @Nick, you want to write code which can be easily debugged and modular coding allows you to do that.

# find runs of 0s and 1s
run_01 = rle(x)

# find the first run of 1's with length >= 8
run_1 = with(run_01, which(values == 1 & lengths >= 8))[1]

# total length of all runs before run_1
# (seq_len gives an empty index when run_1 is the very first run)
start_pos = sum(run_01$lengths[seq_len(run_1 - 1)])

# add 8 to it
end_pos  = start_pos + 8
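
If this has to run over many data files, the steps could be wrapped into a small function. This is only a sketch: first_run_end is a hypothetical name, and returning NA when no run qualifies is an assumption about what the caller wants.

```r
# Position at which the first run of `value` reaches length n,
# or NA if no such run exists (hypothetical helper, not from the thread)
first_run_end <- function(x, value = 1, n = 8) {
  runs <- rle(x)
  hit <- which(runs$values == value & runs$lengths >= n)
  if (length(hit) == 0) return(NA_integer_)
  # seq_len(0) is empty, so a qualifying run at the very start gives offset 0
  offset <- sum(runs$lengths[seq_len(hit[1] - 1)])
  offset + n
}

first_run_end(c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1))   # 11
first_run_end(rep(1, 10))                              # 8
first_run_end(c(0, 1, 0))                              # NA
```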