
Select values from vector using Date as index

Suppose I have a named vector, bar:

bar=c()
bar["1997-10-14"]=1
bar["2001-10-14"]=2
bar["2007-10-14"]=1

How can I select from bar all values for which the index is within a specific date range? So, if I look for all values between "1995-01-01" and "2000-06-01", I should get 1. And similarly for the period between "2001-09-01" and "2007-11-04", I should get 2 and 1.


This problem has been solved for good with the xts package which extends functionality from the zoo package.

R> library(xts)
Loading required package: zoo
R> bar <- xts(1:3, order.by=as.Date("2001-01-01")+365*0:2)
R> bar
           [,1]
2001-01-01    1
2002-01-01    2
2003-01-01    3
R> bar["2002::"]        ## open range with a start year
           [,1]
2002-01-01    2
2003-01-01    3
R> bar["::2002"]        ## or end year
           [,1]
2001-01-01    1
2002-01-01    2
R> bar["2002-01-01"]    ## or hits a particular date
           [,1]
2002-01-01    2
R> 

xts can do a lot more than this, but the basic point is: do not operate on strings masquerading as dates.

Use a Date type, or preferably even an extension package built to efficiently index on millions of dates.


You need to convert your dates from characters into a Date type with as.Date() (or a POSIX type if you have more information like the time of day). Then you can make comparisons with standard relational operators such as <= and >=.

You should consider using a time series package such as zoo for this.

Edit:

Just to respond to your comment, here's an example of using dates with your existing vector:

> as.Date(names(bar)) < as.Date("2001-10-14")
[1]  TRUE FALSE FALSE
> bar[as.Date(names(bar)) < as.Date("2001-10-14")]
1997-10-14 
         1
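The same comparison extends to the two-sided ranges from the question. Here is a minimal sketch, assuming every name in bar is a YYYY-MM-DD string. The helper name in_range is made up for illustration and is not part of base R, and the bounds are treated as inclusive, which is an assumption since the question does not say.

```r
# Rebuild the named vector from the question
bar <- c("1997-10-14" = 1, "2001-10-14" = 2, "2007-10-14" = 1)

# Convert the names to Date once, so comparisons are chronological
dates <- as.Date(names(bar))

# Hypothetical helper: TRUE where a date lies in [from, to] (inclusive)
in_range <- function(d, from, to) {
  d >= as.Date(from) & d <= as.Date(to)
}

first  <- bar[in_range(dates, "1995-01-01", "2000-06-01")]
second <- bar[in_range(dates, "2001-09-01", "2007-11-04")]

print(first)
print(second)
```

Because the logical mask is built from Date values, the names of bar are only parsed once and the result is still a named vector.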

Although you really should just use a time series package. Here's how you could do this with zoo (or xts, timeSeries, fts, etc.):

library(zoo)
ts <- zoo(c(1, 2, 1), as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")))
ts[index(ts) < as.Date("2001-10-14"),]

Since the index is now a Date type, you can make as many comparisons as you want. Read the zoo vignette for more information.


Using the fact that dates written as YYYY-MM-DD sort lexically in chronological order:

bar[names(bar) > "1995-01-01" & names(bar) < "2000-06-01"]
# 1997-10-14 
#          1 

bar[names(bar) > "2001-09-01" & names(bar) < "2007-11-04"]
# 2001-10-14 2007-10-14 
#          2          1 

The result is a named vector (like your original bar, which is a named vector, not a list).
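One caveat before relying on string comparison: it only agrees with chronological order when every name is zero-padded YYYY-MM-DD. A small sketch of what goes wrong otherwise, using a made-up unpadded date string that does not appear in the question:

```r
# Zero-padded ISO strings: string order agrees with date order
padded_ok <- ("2001-09-01" < "2001-10-14") ==
  (as.Date("2001-09-01") < as.Date("2001-10-14"))

# Unpadded month (hypothetical input): "9" sorts after "1",
# so the string comparison disagrees with the Date comparison
as_strings <- "2001-9-01" < "2001-10-14"
as_dates   <- as.Date("2001-9-01") < as.Date("2001-10-14")

print(c(padded_ok = padded_ok, as_strings = as_strings, as_dates = as_dates))
```

If the names might come in mixed formats, converting with as.Date() first avoids this class of error.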

As Dirk states in his answer, it's better to use Date for efficiency reasons. Without external packages, you could rearrange your data into two vectors (or a two-column data.frame), one for dates and one for values:

bar_dates <- as.Date(c("1997-10-14", "2001-10-14", "2007-10-14"))
bar_values <- c(1,2,1)

then use simple indexing:

bar_values[bar_dates > as.Date("1995-01-01") & bar_dates < as.Date("2000-06-01")]
# [1] 1

bar_values[bar_dates > as.Date("2001-09-01") & bar_dates < as.Date("2007-11-04")]
# [1] 2 1
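The two-column data.frame variant mentioned above could look roughly like this. It is only a sketch: the names bar_df, date, value and hits are chosen for illustration.

```r
# Two-column data.frame: one column of Date values, one of numbers
bar_df <- data.frame(
  date  = as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")),
  value = c(1, 2, 1)
)

# Keep the rows whose date falls strictly inside the range
hits <- subset(bar_df,
               date > as.Date("2001-09-01") & date < as.Date("2007-11-04"))

print(hits)
```

Keeping dates and values in one data.frame means a single filter returns both the matching dates and their values.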