Scanning an array in R
I use R and I have a long numeric vector. I would like to find all the maximal contiguous subranges in this vector where all values are lower than some threshold.
For example, if the given vector is
5 5 6 6 7 5 4 4 4 3 2 1 1 1 2 3 4 5 6 7 6 5 4 3 2 2 3 4 4
and my threshold is 4 (i.e., values <= 3), then the values that meet this condition are marked with x:
0 0 0 0 0 0 0 0 0 x x x x x x x 0 0 0 0 0 0 0 x x x x 0 0
I would also like to return something like (10,16), (24,27). How do I do that?
To get the ranges you can use rle.
First create the encoding, then derive the start and end index of each run:
x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
enc <- rle(x <= 3)
enc.endidx <- cumsum(enc$lengths) #ending indices
enc.startidx <- c(0, head(enc.endidx, -1)) + 1 # starting indices (head() also handles a single run)
data.frame(startidx=enc.startidx[enc$values], endidx=enc.endidx[enc$values])
That should give you
startidx endidx
1 10 16
2 24 27
The answer to your first question is pretty straightforward:
x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
y <- x<=3
y
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE
[13] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
[25] TRUE TRUE TRUE FALSE FALSE
as.numeric(y)
[1] 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0
Getting the indices as you want them is more difficult.
You can try which, as proposed by whatnick.
Another possibility is to use match. It returns the position of the first element that matches, so match(1,y) would return 10. match(0,y[10:length(y)]) would return 8, the position of the first FALSE counted from index 10, so the run ends at 10 + 8 - 2 = 16. If you put this into a while loop, you can get the indices as you like.
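The while loop suggested above could look roughly like the following sketch. The function name match_ranges and its variable names are made up for illustration and are not part of the original answer; it assumes y is a logical vector.

```r
# Hypothetical helper: collect runs of TRUE by calling match()
# repeatedly on the part of the vector that has not been scanned yet.
match_ranges <- function(y) {
  n <- length(y)
  starts <- integer(0)
  ends <- integer(0)
  pos <- 1L                            # first index not yet scanned
  while (pos <= n) {
    s <- match(TRUE, y[pos:n])         # offset of the next TRUE, or NA
    if (is.na(s)) break                # no further runs
    start <- pos + s - 1L              # convert offset to absolute index
    e <- match(FALSE, y[start:n])      # offset of the next FALSE, or NA
    end <- if (is.na(e)) n else start + e - 2L
    starts <- c(starts, start)
    ends <- c(ends, end)
    pos <- end + 1L                    # continue after this run
  }
  data.frame(start = starts, end = ends)
}

x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
match_ranges(x <= 3)
```

Each pass slices the vector again, so for very long vectors a vectorized approach is likely to be faster; the loop is mainly useful for seeing how the offsets map back to absolute indices.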
The function you need is which. The syntax is indices <- which(vector <= 3). This gives you a vector of the indices where the value meets the condition. To isolate transitions, take the difference of consecutive indices, diff(indices). Wherever the difference exceeds 1, you have a transition boundary.
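As a minimal sketch of that idea, the following combines which and diff to recover the start and end of each run. The function name which_ranges and the argument name limit are assumptions introduced here for illustration.

```r
# Hypothetical helper: find runs where x <= limit using which() and diff().
which_ranges <- function(x, limit) {
  idx <- which(x <= limit)             # indices meeting the condition
  if (length(idx) == 0) {
    # nothing qualifies: return an empty result instead of NA rows
    return(data.frame(start = integer(0), end = integer(0)))
  }
  gaps <- which(diff(idx) > 1)         # positions in idx where a run ends
  data.frame(start = idx[c(1, gaps + 1)],
             end   = idx[c(gaps, length(idx))])
}

x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
which_ranges(x, 3)
```

A jump larger than 1 between consecutive indices marks the end of one run and the start of the next, so the first index and the index after each gap are the starts, and the index before each gap and the last index are the ends.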
I needed to do this too and this is what I'm using:
ranges <- function(b){ # b must be a logical vector
b <- c(FALSE,b,FALSE)
d <- b[-1]-b[-length(b)]
return(data.frame(start=which(d==1),end=which(d==-1)-1))
}
In your example
x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
ranges(x<=3)
produces
start end
1 10 16
2 24 27