Integrating sampled data in R
I have some measurement data sampled over time and want to integrate it. The test dataset contains ~100,000 samples (~100 s at 1000 Hz).
My first approach was the following (table contains the timestamp (0..100 s) and the value of each data point, both doubles):
# test dataset available (gzipped, 720k) here: http://tux4u.de/so.rtab.gz
table <- read.table("/tmp/so.rtab", header=TRUE)
time <- table$t
data <- table$val
start <- min(time)
stop <- max(time)
sampling_rate <- 1000
# number of subdivisions needed to cover the whole record at the sampling rate
divs <- (max(time) - min(time)) * sampling_rate
# linear interpolation of the samples, returning 0 outside the sampled range
data_fun <- approxfun(time, data, method="linear", yleft=0, yright=0)
result <- integrate(data_fun, start, stop, subdivisions=divs)
but somehow the integration runs forever (it behaves like an endless loop and fully loads one CPU core). So I looked at the values:
> start
[1] 0
> stop
[1] 98.99908
> divs
[1] 98999.08
The strange thing is that when I evaluate
> integrate(data_fun, 0, 98, subdivisions=100000)$value + integrate(data_fun, 98, 99)$value
[1] 2.640055
it works (computation time < 3 s), but the following evaluation (which should give the same result)
> integrate(data_fun, 0, 99, subdivisions=100000)$value
never terminates either. And even this one (which is in fact a sub-integral of the working one above) does NOT terminate:
> integrate(data_fun, 0, 89, subdivisions=100000)$value
It seems a bit random to me when it works and when it doesn't. Am I doing anything wrong or could I improve the process somehow?
Thanks!
(HINT: the sampling points are not necessarily equally spaced)
Ahem, you know that you can just sum it up? cumsum will do this fast:
cumsum(table$val)*diff(table$t)[1]
For unequal differences, you may use:
cumsum(table$val[-nrow(table)]*diff(table$t))
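For example, the total over the whole record is just the sum (equivalently, the last element of the cumulative sum above); on this dataset it should land close to the 2.64 obtained from the piecewise integrate() calls, assuming table$t is sorted:
sum(table$val[-nrow(table)] * diff(table$t))  # total integral as a left Riemann sum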
There is no need for more complex numerics since the data in this case is very densely sampled; nevertheless, there will always be better methods than going through an interpolator.
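If something a bit more accurate than a left Riemann sum is wanted while still avoiding the interpolator, a trapezoidal-rule sum also handles irregular spacing; a minimal sketch, assuming table$t is sorted in increasing order:
# average neighbouring values and weight each pair by its step width
dt  <- diff(table$t)
mid <- (head(table$val, -1) + tail(table$val, -1)) / 2
total_trapz <- sum(dt * mid)           # total integral
cum_trapz   <- c(0, cumsum(dt * mid))  # running integral, same length as table$t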