Create a vector of all dates in a given year

Is there a simple R idiom for getting a sequence of all days in a given year? I can do the following, which works fine except for leap years:

dtt <- as.Date( paste( as.character(year), "-1-1", sep="") ) + seq( 0,364 )

I could, obviously, add a line to filter out any values in (year + 1) but I'm guessing there's a much shorter way to do this.


What about this:

R> length(seq( as.Date("2004-01-01"), as.Date("2004-12-31"), by="+1 day"))
[1] 366
R> length(seq( as.Date("2005-01-01"), as.Date("2005-12-31"), by="+1 day"))
[1] 365

This uses nuttin' but base R to compute correctly on dates and give you your vector. If you want higher-level operators, look e.g. at lubridate, or even my more rudimentary RcppBDT, which wraps parts of the Boost Date_Time library.


Using Dirk's guidance I've settled on this:

getDays <- function(year){
     seq(as.Date(paste(year, "-01-01", sep="")), as.Date(paste(year, "-12-31", sep="")), by="+1 day")
}
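
A quick sanity check of that function, as a sketch: it repeats the getDays definition so the snippet runs on its own, then compares a leap year with a non-leap year. The variable names days_2004 and days_2005 are just for illustration.

```r
# getDays as defined above, repeated so this snippet is self-contained
getDays <- function(year) {
  seq(as.Date(paste(year, "-01-01", sep = "")),
      as.Date(paste(year, "-12-31", sep = "")),
      by = "+1 day")
}

days_2004 <- getDays(2004)  # leap year
days_2005 <- getDays(2005)  # non-leap year

length(days_2004)  # 366
length(days_2005)  # 365

# First and last element should be Jan 1 and Dec 31 of the year
range(days_2004)

# Feb 29 should appear only in the leap year
any(format(days_2004, "%m-%d") == "02-29")
any(format(days_2005, "%m-%d") == "02-29")
```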


I'd be interested to know whether it would be faster to swap the order of the sequencing and the as.Date conversion, i.e. build an integer sequence first and convert it to Dates afterwards:

# My function getDays: sequence the integers, then convert to Date
getDays_1 <- function(year) {
  d1 <- as.Date(paste(year, '-01-01', sep = ''))
  d2 <- as.Date(paste(year, '-12-31', sep = ''))
  as.Date(d1:d2, origin = '1970-01-01')
}

# other getDays: sequence the Date objects directly
getDays_2 <- function(year) {
  seq(as.Date(paste(year, '-01-01', sep = '')),
      as.Date(paste(year, '-12-31', sep = '')),
      by = '+1 day')
}

test_getDays_1 <- function(n = 10000) {
  for (i in 1:n) {
    getDays_1(2000)
  }
}

test_getDays_2 <- function(n = 10000) {
  for (i in 1:n) {
    getDays_2(2000)
  }
}

system.time(test_getDays_1())
# user  system elapsed 
# 4.80    0.00    4.81 

system.time(test_getDays_2())
# user  system elapsed 
# 4.52    0.00    4.53 

I guess not... it appears that sequencing Date objects is slightly faster than converting a vector of integers to Dates.
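
Before reading much into timings like these, it is probably worth confirming that the two variants return the same dates. The following is only a sketch of such a check; it repeats both definitions so it runs standalone, and the names a and b are illustrative. The timings themselves come from one machine and are close enough that they may well differ elsewhere.

```r
# Variant 1: build an integer sequence, then convert it to Date
getDays_1 <- function(year) {
  d1 <- as.Date(paste(year, "-01-01", sep = ""))
  d2 <- as.Date(paste(year, "-12-31", sep = ""))
  as.Date(d1:d2, origin = "1970-01-01")
}

# Variant 2: sequence the Date objects directly
getDays_2 <- function(year) {
  seq(as.Date(paste(year, "-01-01", sep = "")),
      as.Date(paste(year, "-12-31", sep = "")),
      by = "+1 day")
}

a <- getDays_1(2000)
b <- getDays_2(2000)

# Same length and the same date in every position
length(a) == length(b)
all(a == b)
```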


I needed something similar: for a vector of dates, I wanted to know the number of days in each date's year. I came up with the following function, which returns a vector of the same length as the input.

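library(lubridate)  # provides year()

days_in_year <- function(dates) {
    years <- year(dates)
    # Count the days per year over the whole span of years in the input
    days <- table(year(seq(as.Date(paste0(min(years), '-01-01')),
                           as.Date(paste0(max(years), '-12-31')),
                           by = '+1 day')))
    # Look up each date's year in that table
    as.vector(days[as.character(years)])
}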

It works similarly to Dirk's solution, but it calls lubridate::year twice: once on the input dates and once on the generated sequence. table does the same job as length, only for every year in the range at once. It may use more memory than strictly necessary if the dates are not in consecutive years, because the sequence covers every year between the earliest and the latest one.
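
If the memory use matters, one possible alternative (a sketch, not part of the answer above) is to skip the sequence entirely and ask base R for the day-of-year number of December 31 in each date's year, using the "%j" format code. The function name days_in_year_base is made up for this example, and it assumes the input is a Date vector or something as.Date can convert.

```r
# Hypothetical base-R variant: no lubridate, no full date sequence
days_in_year_base <- function(dates) {
  # Four-digit year of every input date, as character
  years <- format(as.Date(dates), "%Y")
  # Dec 31 of that year; its day-of-year ("%j") is the year's length
  dec31 <- as.Date(paste0(years, "-12-31"), format = "%Y-%m-%d")
  as.integer(format(dec31, "%j"))
}

dates <- as.Date(c("2004-03-15", "2005-07-01", "2100-01-01"))
days_in_year_base(dates)
# [1] 366 365 365
```

This does one conversion per input date rather than one per day in the covered range, so widely separated years cost no extra work.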
