In R: Indexing vectors by boolean comparison of a value in range: index==c(min : max)
In R, let's say we have a vector
area = c(rep(c(26:30), 5), rep(c(500:504), 5), rep(c(550:554), 5), rep(c(76:80), 5))
and another vector yield = c(1:100).
Now, say I want to index like so:
> yield[area==27]
[1] 2 7 12 17 22
> yield[area==501]
[1] 27 32 37 42 47
No problem, right? But weird things start happening when I try to index it using c(A, B) (and even weirder when I try c(min:max)):
> yield[area==c(27,501)]
[1] 7 17 32 42
What I expect, of course, is every element that shows up in either of the two earlier results, not some odd mix of them. I can get that with the | (OR) operator:
> yield[area==27 | area==501]
[1] 2 7 12 17 22 27 32 37 42 47
But what if I'm working with a range? Say I want to index it by the range c(27:503). In my real example there are many more data points and ranges, so this makes more sense there. Please don't suggest I do it by hand, which would essentially mean:
yield[area==27 | area==28 | area==29 | ... | area==303 | ... | area==500 | area==501]
There must be a better way...
You want to use %in%. Also notice that c(27:503) and 27:503 yield the same object.
> yield[area %in% 27:503]
[1] 2 3 4 5 7 8 9 10 12 13 14 15 17
[14] 18 19 20 22 23 24 25 26 27 28 29 31 32
[27] 33 34 36 37 38 39 41 42 43 44 46 47 48
[40] 49 76 77 78 79 80 81 82 83 84 85 86 87
[53] 88 89 90 91 92 93 94 95 96 97 98 99 100
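As for why area==c(27,501) gave that odd result: == compares element by element and recycles the shorter vector, so odd positions of area are tested against 27 and even positions against 501. A minimal sketch using the vectors from the question (the name recycled is only a helper introduced here for illustration):

```r
area  <- c(rep(26:30, 5), rep(500:504, 5), rep(550:554, 5), rep(76:80, 5))
yield <- 1:100

# `==` is element-wise: c(27, 501) is recycled to the length of area,
# i.e. 27, 501, 27, 501, ... (no warning, because 100 is a multiple of 2).
recycled <- rep(c(27, 501), length.out = length(area))
identical(area == c(27, 501), area == recycled)   # TRUE

# `%in%` instead asks, for each element of area, whether it occurs
# anywhere in the right-hand set, which is the same as chaining `|`.
identical(yield[area %in% c(27, 501)],
          yield[area == 27 | area == 501])        # TRUE

# The c() wrapper around a range adds nothing.
identical(c(27:503), 27:503)                      # TRUE
```

If the two vectors had lengths that do not divide evenly, == would still recycle but emit a warning, which is usually the first hint that %in% was intended.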
Why not use subset?
subset(yield, area > 26 & area < 504) ## yield values (same as the indexes here, since yield is 1:100)
subset(area, area > 26 & area < 504) ## the matching area values themselves
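For whole-number data like this, the range comparison and %in% should select the same elements. A quick check, assuming the vectors from the question (in_range, by_subset, by_bracket and by_in are names made up for this sketch, and the inclusive bounds 27 and 503 are just another way of writing > 26 and < 504 for integers):

```r
area  <- c(rep(26:30, 5), rep(500:504, 5), rep(550:554, 5), rep(76:80, 5))
yield <- 1:100

# One logical mask, reused by every approach below.
in_range <- area >= 27 & area <= 503

by_subset  <- subset(yield, in_range)      # subset() on a plain vector
by_bracket <- yield[in_range]              # ordinary logical indexing
by_in      <- yield[area %in% 27:503]      # set membership

identical(by_subset, by_bracket)           # TRUE
identical(by_bracket, by_in)               # TRUE

# The two ideas differ once values are not whole numbers:
27.5 %in% 27:503                           # FALSE: 27.5 is not in the set
27.5 >= 27 & 27.5 <= 503                   # TRUE: 27.5 lies inside the interval
```

So the comparison form is probably the safer choice if area could ever hold fractional values, while %in% also handles sets that are not contiguous ranges.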