R lag over missing data
Is there a variant of lag somewhere that keeps NAs in position? I want to compute returns of price data where data could be missing.
Col 1 is the price data.
Col 2 is the lag of price.
Col 3 shows p - lag(p): the return from 99 to 104 is effectively missed, so the path length of the computed returns will differ from the true path length.
Col 4 shows the lag with the NA position preserved.
Col 5 shows the new difference: the return of 5 for 2009-11-07 is now available.
Cheers, Dave
library(xts)
x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100), as.Date("2009-11-01") + 0:9)
# fake the lag I want, with NA kept in position
x.pos.lag <- lag.xts(x)
x.pos.lag['2009-11-07'] <- 99
x.pos.lag['2009-11-06'] <- NA
cbind(x, lag.xts(x), x - lag.xts(x), x.pos.lag, x-x.pos.lag)
..1 ..2 ..3 ..4 ..5
2009-11-01 100 NA NA NA NA
2009-11-02 101 100 1 100 1
2009-11-03 97 101 -4 101 -4
2009-11-04 95 97 -2 97 -2
2009-11-05 99 95 4 95 4
2009-11-06 NA 99 NA NA NA
2009-11-07 104 NA NA 99 5
2009-11-08 103 104 -1 104 -1
2009-11-09 103 103 0 103 0
2009-11-10 100 103 -3 103 -3
As far as I know, there is no built-in function that does this in R, but you can record the positions of the original NAs and then shift the lagged values at those positions.
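library(xts)
x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100), as.Date("2009-11-01") + 0:9)
# Lag x while keeping each NA at its original position.
# Assumes the default lag of 1, isolated (non-consecutive) NAs, and no NA in the last row.
lag.xts.na <- function(x, ...) {
  na.idx <- which(is.na(x))          # rows where the original series is NA
  x2 <- lag.xts(x, ...)
  x2[na.idx + 1, ] <- x2[na.idx, ]   # move the value that lagged into the gap one row down
  x2[na.idx, ] <- NA                 # restore the NA at its original position
  return(x2)
}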
lag.xts.na(x)
[,1]
2009-11-01 NA
2009-11-02 100
2009-11-03 101
2009-11-04 97
2009-11-05 95
2009-11-06 NA
2009-11-07 99
2009-11-08 104
2009-11-09 103
2009-11-10 103
Incidentally, are you just trying to deal with weekends/holidays or something along that line? If so, you might consider dropping those positions from your series; that will dramatically simplify things for you. Alternatively, the timeSeries package in Rmetrics has a number of functions to deal with business days.
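If dropping the missing positions is acceptable, one way to get the NA-preserving lag is to drop the NA rows, lag what remains, and merge the result back onto the original dates. Below is a minimal sketch for a single-column series, assuming the xts package is installed; the helper name lag.skip.na is made up for illustration and is not part of xts.

```r
library(xts)

x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100),
         as.Date("2009-11-01") + 0:9)

# Hypothetical helper (not part of xts): lag over the observed values only,
# then re-expand to the original index so NAs stay where they were.
lag.skip.na <- function(x, k = 1) {
  obs <- na.omit(x)                 # drop rows with missing prices
  lagged <- lag.xts(obs, k = k)     # lag the gap-free series
  merge(lagged, xts(, index(x)))    # outer join back onto all original dates
}

x.pos.lag <- lag.skip.na(x)
returns <- x - x.pos.lag            # 2009-11-07 now gets 104 - 99
```

Because the lag is taken over observed values only, this should also cope with runs of consecutive NAs, which a fixed one-row shift does not.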