开发者

xts merge odd behaviour

I have 3 xts objects all of whose indicies are 'Date' objects:

    > a
                  a
    1995-01-03 1.76
    1995-01-04 1.69
    > b
                  b
    1995-01-03 1.67
    1995-01-04 1.63
    > c
                   c
    1995-01-03 1.795
    1995-01-04 1.690

To verify the indices are the same:

    > index(a) == index(b)
    [1] TRUE TRUE
    > index(a) == index(c)
    [1] TRUE TRUE

Now I'm seeing this odd behaviour:

    > merge.xts(a,b)
                  a    b
    1995-01-03   NA 1.67
    1995-01-03 1.76   NA
    1995-01-04   NA 1.63
    1995-01-04 1.69   NA

While the following merge works fine:

    > merge.xts(a,c)
                  a     c
    1995-01-03 1.76 1.795
    1995-01-04 1.69 1.690

I can't figure out what could possibly be going on here. Any idea?开发者_运维知识库

Update:

    > dput(a)
    structure(c(1.76, 1.69), .indexCLASS = "Date", .indexTZ = "", .CLASS = "xts", class = c("xts", 
    "zoo"), index = structure(c(789168240, 789254580), tzone = "", tclass = "Date"), .Dim = c(2L, 
    1L), .Dimnames = list(NULL, "a"))

    > dput(b)
    structure(c(1.67, 1.63), .indexCLASS = "Date", .indexTZ = "", .CLASS = "xts", class = c("xts", 
    "zoo"), index = c(789109200, 789195600), .Dim = c(2L, 1L), .Dimnames = list(
        NULL, "b"))

    > dput(c)
    structure(c(1.795, 1.69), .indexCLASS = "Date", .indexTZ = "", .CLASS = "xts", class = c("xts", 
    "zoo"), index = c(789109200, 789195600), .Dim = c(2L, 1L), .Dimnames = list(
        NULL, "c"))

Indeed, the problem is that the indices are not identical (as verified by .index(a) == .index(b) ). Converting to numeric then recreating an xts and recomputing the dates with asDate fixed the problem.

This objects were created from the to.daily method from xts.


This seems convoluted of course, but the reason is that Date is imprecise. Anything within a calendar day is the same "date".

Somewhere, as Josh alluded to, is the fact that your data are created in different ways/sources. I'll try and think of a better way to manage this variability - since it isn't a purely novel issue. Until then:

index(x) <- index(x) 

will do the trick. Why?

As Josh notes, index(x) [no <- ] takes an underlying POSIX time_t representation and converts it into a Date (days since the epoch). Replacing the original index via index<- converts the "Date" back into POSIX time (POSIXct in R, time_t in C)

 t1 <- Sys.time()
 t2 <- Sys.time()

 as.Date(t1) == as.Date(t2)
 #[1] TRUE

 t1 == t2
 #[1] FALSE


 x1 <- xts(1, t1)
 x2 <- xts(2, t2)


 indexClass(x1) <- "Date"
 indexClass(x2) <- "Date"
 cbind(x1,x2)
            ..1 ..2
 2011-10-06   1  NA
 2011-10-06  NA   2

 .index(cbind(x1,x2))
 [1] 1317925443 1317925447
 attr(,"tzone")
 [1] "America/Chicago"
 attr(,"tclass")
 [1] "Date"

 # ugly, ugly solution
 index(x1) <- index(x1)
 index(x2) <- index(x2)
 cbind(x1,x2)
            ..1 ..2
 2011-10-06   1   2
 .index(cbind(x1,x2))
 [1] 1317877200


You can't use index() to verify the indices are the same because it converts the index to whatever is specified by indexClass(). Use .index to get the raw numeric index, then you will likely find that:

all(.index(a) == .index(b))  # is FALSE

You should investigate your data source to see what might be causing this. For a quick fix, do this:

index(b) <- as.Date(index(b))
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜