Creating a delta column to plot time series differences in R

I have a set of motorsport laptime data (mld) of the form:

  car lap laptime
1  1   1 138.523
2  1   2 122.373
3  1   3 121.395
4  1   4 137.871

and I want to produce something of the form:

  lap  car.1    car.1.delta   
1  1   138       NA
2  2   122       -16  
3  3   121       -1  
4  4   137       16

I can use the R command diff(mld$laptime, lag=1) to produce the difference column, but how do I elegantly create the padded difference column in R?


Here are a couple of approaches:

1) zoo

If we represented this as a time series using zoo then the calculation would be particularly simple:

# test data with two cars

Lines <- "car lap laptime
1   1 138.523
1   2 122.373
1   3 121.395
1   4 137.871
2   1 138.523
2   2 122.373
2   3 121.395
2   4 137.871"
cat(Lines, "\n", file = "data.txt")

# read it into a zoo series, splitting it
# on car to give wide form (rather than long form)

library(zoo)
z <- read.zoo("data.txt", header = TRUE, split = 1, index = 2, FUN = as.numeric)

# now that it's in the right form, it's simple

zz <- cbind(z, diff(z))

The last statement gives:

> zz
      1.z     2.z 1.diff(z) 2.diff(z)
1 138.523 138.523        NA        NA
2 122.373 122.373   -16.150   -16.150
3 121.395 121.395    -0.978    -0.978
4 137.871 137.871    16.476    16.476

To plot zz, one column per panel, try this:

plot(zz, type = "o")

To only plot the differences we do not really need zz in the first place as this will do:

plot(diff(z), type = "o")

(Add the screens = 1 argument to the plot command to plot everything on the same panel.)

2) ave. Here is a second solution that uses just plain R (except for the plotting) and keeps the output in long form; however, it is a bit more complex:

# assume same input as above

DF <- read.table("data.txt", header = TRUE)
DF$diff <- ave(DF$laptime, DF$car, FUN = function(x) c(NA, diff(x)))

The result is:

> DF
  car lap laptime    diff
1   1   1 138.523      NA
2   1   2 122.373 -16.150
3   1   3 121.395  -0.978
4   1   4 137.871  16.476
5   2   1 138.523      NA
6   2   2 122.373 -16.150
7   2   3 121.395  -0.978
8   2   4 137.871  16.476

To plot just the differences, one per panel, try this:

library(lattice)
xyplot(diff ~ lap | car, DF, type = "o")

Update

Added info above on plotting since the title of the question mentions this.


I think this is enough:

mld$car.1.delta = c(NA, diff(mld$laptime, lag = 1))
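
The same padding idea can be wrapped in a small function so that it also works for lags greater than 1. This is only a sketch: pad_diff is a hypothetical helper name, not part of base R, and it assumes lag is smaller than the length of the vector.

```r
# Hypothetical helper: difference a vector and left-pad with NA so the
# result has the same length as the input (assumes lag < length(x)).
pad_diff <- function(x, lag = 1) {
  c(rep(NA, lag), diff(x, lag = lag))
}

# Lap times taken from the question
laptime <- c(138.523, 122.373, 121.395, 137.871)

d1 <- pad_diff(laptime)           # lap-to-lap change, one leading NA
d2 <- pad_diff(laptime, lag = 2)  # change over two laps, two leading NAs
```

Because diff(x, lag = k) returns k fewer elements than x, prepending exactly k NA values keeps the new column aligned with the original rows.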

In your example you have truncated the lap times but rounded car.1.delta, so the details depend on how you want that to work, but the code below gives what you posted.

Wrap everything in with() to simplify, and create a new data.frame based on modifications of the existing columns. Prepend an NA to the diff to pad it out.

with(mld,
    data.frame(
        lap = lap,
        car.1 = trunc(laptime),
        car.1.delta = c(NA, round(diff(laptime)))
    )
)

  lap car.1 car.1.delta
1   1   138          NA
2   2   122         -16
3   3   121          -1
4   4   137          16

I wonder whether you want to do this by car. If so, it will need a bit more handling, but since you literally asked for a column named car.1, I think this works as far as that goes.
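
If per-car deltas are wanted, one possible way to handle it in base R is to compute the padded difference within each car and then reshape to one row per lap. This is a sketch under assumptions: it reuses the two-car test data from the first answer, and the resulting column names (laptime.1, delta.1, and so on) come from reshape's defaults rather than from the question.

```r
# Two-car test data in the same long form as the question
mld <- data.frame(
  car     = rep(1:2, each = 4),
  lap     = rep(1:4, times = 2),
  laptime = rep(c(138.523, 122.373, 121.395, 137.871), times = 2)
)

# Make sure laps are in order within each car before differencing
mld <- mld[order(mld$car, mld$lap), ]

# Padded per-car delta: NA on each car's first lap
mld$delta <- ave(mld$laptime, mld$car, FUN = function(x) c(NA, diff(x)))

# Wide form: one row per lap, with laptime.<car> and delta.<car> columns
wide <- reshape(mld, idvar = "lap", timevar = "car", direction = "wide")
```

Sorting first matters because diff works on row order, not on the lap number itself.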
