Computing missing months in a time series
I have a dataset in R that contains monthly values. However, some months are missing. For example:
"2001-09-01" "2001-10-01" "2001-12-01" "2002-02-01"
November 2001 and January 2002 are missing. How do I add those months to the time series with a value of 0?
Thanks!
Since this is a monthly series it might make sense to represent it as a series with "yearmon" class times. The first few lines set up the test data and the last two lines do the actual filling:
# set up input data as a zoo series
library(zoo)
d <- c("2001-09-01", "2001-10-01", "2001-12-01", "2002-02-01")
z <- zoo(1:4, as.yearmon(d))
# merge with zero width series
g <- seq(start(z), end(z), 1/12)
zz <- merge(z, zoo(, g), fill = 0)
If a "ts" series is desired then use as.ts(zz), or if a zoo series with times of "Date" class is wanted then try time(zz) <- as.Date(time(zz)).
Note that this is also discussed with several examples in FAQ 13 of the zoo FAQ, available via the R command vignette("zoo-faq") or on the net at:
http://cran.r-project.org/web/packages/zoo/vignettes/zoo-faq.pdf
Assuming that you have your data in a data.frame called dat1:
dat1 <- data.frame(
date = as.Date(c("2001-09-01", "2001-10-01", "2001-12-01", "2002-02-01")),
val = 1:4
)
You can then create a second data.frame that contains a single column with all the dates you need. Use seq.Date to create this sequence:
dat2 <- data.frame(
date = seq(as.Date("2001-09-01"), by="1 month", length.out=7)
)
Then it is a simple merge operation:
merge(dat1, dat2, all=TRUE)
        date val
1 2001-09-01   1
2 2001-10-01   2
3 2001-11-01  NA
4 2001-12-01   3
5 2002-01-01  NA
6 2002-02-01   4
7 2002-03-01  NA
The values for the missing months are NA, but you can then use subsetting to set them to 0 if you desire.
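As a minimal sketch of that last step, the NA entries can be overwritten with is.na() indexing. The name filled is only illustrative, and building the date sequence from min() and max() of the observed dates (so it stops at the last observed month) is an assumption, not part of the answer above:

```r
# Same example data as in the answer above
dat1 <- data.frame(
  date = as.Date(c("2001-09-01", "2001-10-01", "2001-12-01", "2002-02-01")),
  val = 1:4
)

# Every month from the first to the last observed date
# (assumption: the series should end at the last observed month)
dat2 <- data.frame(
  date = seq(min(dat1$date), max(dat1$date), by = "1 month")
)

# Outer join: months absent from dat1 get val = NA
filled <- merge(dat1, dat2, all = TRUE)

# Replace the NAs introduced by the merge with 0
filled$val[is.na(filled$val)] <- 0

print(filled)
```

Note that this replaces every NA in val, so if the original data could contain genuine NA values they would need to be handled before the merge.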