Create a vector of all dates in a given year
Is there a simple R idiom for getting a sequence of all days in a given year? I can do the following, which works fine except for leap years:
dtt <- as.Date( paste( as.character(year), "-1-1", sep="") ) + seq( 0,364 )
I could, obviously, add a line to filter out any values in (year + 1) but I'm guessing there's a much shorter way to do this.
What about this:
R> length(seq( as.Date("2004-01-01"), as.Date("2004-12-31"), by="+1 day"))
[1] 366
R> length(seq( as.Date("2005-01-01"), as.Date("2005-12-31"), by="+1 day"))
[1] 365
R>
This uses nuttin' but base R to compute correctly on dates to give you your vector. If you want higher-level operators, look e.g. at lubridate or even my more rudimentary RcppBDT, which wraps parts of the Boost Date_Time library.
Using Dirk's guidance I've settled on this:
getDays <- function(year) {
  seq(as.Date(paste(year, "-01-01", sep = "")),
      as.Date(paste(year, "-12-31", sep = "")),
      by = "+1 day")
}
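As a quick sanity check of the function above, here is a minimal sketch in base R that applies it to three consecutive years and counts the days in each. The variable names `years` and `day_counts` are only illustrative.

```r
# Same definition as above, repeated so the snippet is self-contained.
getDays <- function(year) {
  seq(as.Date(paste(year, "-01-01", sep = "")),
      as.Date(paste(year, "-12-31", sep = "")),
      by = "+1 day")
}

# Count the days returned for a few years; 2004 is a leap year.
years <- 2003:2005
day_counts <- vapply(years, function(y) length(getDays(y)), integer(1))
names(day_counts) <- years
print(day_counts)

# The last element should always be December 31 of the requested year.
last_day_2004 <- format(tail(getDays(2004), 1))
print(last_day_2004)
```

Because the end point is given explicitly as December 31, the leap day is included without any special-casing.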
I'd be interested to know whether it would be faster to swap the order of the sequencing and the as.Date cast:
# My function getDays: build an integer sequence, then cast it back to Date
getDays_1 <- function(year) {
  d1 <- as.Date(paste(year, '-01-01', sep = ''))
  d2 <- as.Date(paste(year, '-12-31', sep = ''))
  as.Date(d1:d2, origin = '1970-01-01')
}
# other getDays: sequence the Date objects directly
getDays_2 <- function(year) {
  seq(as.Date(paste(year, '-01-01', sep = '')),
      as.Date(paste(year, '-12-31', sep = '')),
      by = '+1 day')
}
# Call each version n times so the difference is measurable
test_getDays_1 <- function(n = 10000) {
  for (i in 1:n) {
    getDays_1(2000)
  }
}
test_getDays_2 <- function(n = 10000) {
  for (i in 1:n) {
    getDays_2(2000)
  }
}
system.time(test_getDays_1());
# user system elapsed
# 4.80 0.00 4.81
system.time(test_getDays_2());
# user system elapsed
# 4.52 0.00 4.53
I guess not: it appears that sequencing Date objects is slightly faster than converting a vector of integers to Dates.
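For completeness, a third variant can be sketched that stays in Date arithmetic throughout: it adds integer day offsets to January 1 instead of casting integers back to Date. The name `getDays_3` is hypothetical and it was not part of the timing above, so no claim is made about its speed. The sketch only checks that it produces the same dates.

```r
# Hypothetical third variant: start date plus a vector of day offsets.
getDays_3 <- function(year) {
  d1 <- as.Date(paste0(year, "-01-01"))
  d2 <- as.Date(paste0(year, "-12-31"))
  # d2 - d1 is a difftime in days; 0:n gives one offset per day of the year.
  d1 + 0:as.integer(d2 - d1)
}

# Compare against the seq()-based approach for a leap year.
reference <- seq(as.Date("2004-01-01"), as.Date("2004-12-31"), by = "+1 day")
same_dates <- all(getDays_3(2004) == reference)
print(length(getDays_3(2004)))
print(same_dates)
```

To compare it fairly, it would need to be wrapped in the same kind of loop and timed with system.time() on the same machine.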
I needed something similar; however, for a range of dates I wanted to know the number of days in each date's year. I came up with the following function, which returns a vector of the same length as the input dates.
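library(lubridate)  # provides year()

days_in_year <- function(dates) {
  years <- year(dates)
  # Count the days per year over the whole span from the first to the last year
  days <- table(year(seq(as.Date(paste0(min(years), '-01-01')),
                         as.Date(paste0(max(years), '-12-31')),
                         by = '+1 day')))
  as.vector(days[as.character(years)])
}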
It works similarly to Dirk's solution; however, it uses the lubridate::year function twice to extract the year part of the dates. Using table does the same as length, but for all unique years at once. It might use more memory than strictly necessary if the dates are not in consecutive years.
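If the memory use is a concern, one alternative is to skip generating the dates altogether and apply the Gregorian leap-year rule directly. The following is a minimal sketch in base R, not the function above; the name `days_in_year_base` is hypothetical, and it assumes the input is a Date vector or something as.Date() can convert.

```r
# Hypothetical base-R alternative: no date sequence is materialised.
days_in_year_base <- function(dates) {
  years <- as.integer(format(as.Date(dates), "%Y"))
  # Gregorian rule: divisible by 4, except centuries not divisible by 400.
  is_leap <- (years %% 4 == 0 & years %% 100 != 0) | (years %% 400 == 0)
  # integer + logical stays integer: 365 for common years, 366 for leap years.
  365L + is_leap
}

dates <- as.Date(c("2004-03-01", "2005-06-15", "1900-01-01", "2000-12-31"))
result <- days_in_year_base(dates)
print(result)
```

The result has one entry per input date, as with the table-based version, but the cost no longer depends on how far apart the years are.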