How to consolidate multiple data series in R
I'm doing some statistical processing of experimental data using R.
I have multiple files, one per date, all with identical structure. Each row of a file holds the measurements made at a given time on that date, so the general structure is like this:
time C1 C2 C3
19:00 200 10.0 30
19:01 220 10.0 45
...
What I need is to create a file with a summary of the values of one column from multiple files, so that I have, for example, the average and standard deviation of C2 at each time, across consecutive days:
time avg dev
19:00 205.0 30.0
19:01 220.0 10.0
...
There are a number of questions on Stack Overflow that can help you out. Try searching with "[r] multiple files" (omit the quotes). The [r] limits the search to questions tagged r.
Here's a question that might get at what you need,
and here's an example of the search.
Create Files, a vector of file names, assuming the file names are of the indicated form or otherwise. Then read these files in, lapplying read.table to each name and rbinding the results together, giving m, which contains all rows of all tables. Finally, aggregate the m data frame.
Files <- Sys.glob("test_*.txt")
m <- do.call(rbind, lapply(Files, read.table, header = TRUE))
aggregate(m[-1], m[1], function(x) c(mean = mean(x), sd = sd(x)))
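If only one column is needed, as in the question, the same approach can be narrowed to C2. This is a minimal sketch, assuming the files match test_*.txt as above. The two sample files written to a temporary directory are made up purely so the example runs on its own, and the names tmp_dir, result and summary.txt are illustrative only.

```r
# Write two small sample files (made-up values) so the sketch is self-contained
tmp_dir <- tempfile("series_")
dir.create(tmp_dir)
writeLines(c("time C1 C2 C3",
             "19:00 200 10.0 30",
             "19:01 220 10.0 45"), file.path(tmp_dir, "test_1.txt"))
writeLines(c("time C1 C2 C3",
             "19:00 210 12.0 32",
             "19:01 230 14.0 47"), file.path(tmp_dir, "test_2.txt"))

# Read every file and stack the rows into one data frame
Files <- Sys.glob(file.path(tmp_dir, "test_*.txt"))
m <- do.call(rbind, lapply(Files, read.table, header = TRUE))

# Mean and standard deviation of C2 for each time
avg <- aggregate(C2 ~ time, data = m, FUN = mean)
dev <- aggregate(C2 ~ time, data = m, FUN = sd)
result <- data.frame(time = avg$time, avg = avg$C2, dev = dev$C2)

# Save the summary as a space-separated text file
write.table(result, file.path(tmp_dir, "summary.txt"),
            row.names = FALSE, quote = FALSE)
```

Both aggregate calls group by the same time values in the same order, so their columns can be placed side by side directly.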
library(plyr)
# Combine all the data
data=rbind(data1,data2,data3)
# to get the mean
ddply(data,.(time),numcolwise(mean))
# to get the sd
ddply(data,.(time),numcolwise(sd))
# You can combine both statements above into a single call and put the output into a data frame
resulting_data=data.frame(ddply(data,.(time),numcolwise(mean)),ddply(data,.(time),numcolwise(sd))[,-1])
# depending on the number of columns you have, name the output accordingly. For your example
names(resulting_data) <- c('time', 'C1', ...)  # '...' stands for the remaining column names
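When only C2 is of interest, plyr's summarise can produce both statistics in one call and name the output columns directly. This is a sketch, assuming plyr is installed. data1 and data2 are small made-up data frames standing in for the ones read from the files, and c2_summary is just an illustrative name.

```r
library(plyr)

# Made-up stand-ins for the per-day data frames
data1 <- data.frame(time = c("19:00", "19:01"), C1 = c(200, 220),
                    C2 = c(10, 10), C3 = c(30, 45))
data2 <- data.frame(time = c("19:00", "19:01"), C1 = c(210, 230),
                    C2 = c(12, 14), C3 = c(32, 47))
data <- rbind(data1, data2)

# One row per time, with the mean and standard deviation of C2
c2_summary <- ddply(data, .(time), summarise, avg = mean(C2), dev = sd(C2))
```

The output columns are already named time, avg and dev, so no renaming step is needed afterwards.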