Is there a nice way to do an operation on all pairings of the columns of two data frames?
For example, given the data frames:
> df1
  a b
1 1 3
2 2 4
and
> df2
   x  y  z
1 10 12 14
2 11 13 15
and doing an addition operation on each pairing of columns from df1 and df2, I would like to produce:
> df3
  ax bx ay by az bz
1 11 13 13 15 15 17
2 13 15 15 17 17 19
I wrote the following code which does the job, but I'm wondering if there is a nicer way to do it.
df1 <- data.frame(a=1:2, b=3:4)
df2 <- data.frame(x=10:11, y=12:13, z=14:15)
byColumnAdditionAllPairs <- function(df1, df2) {
  doOp <- function(x, df1, df2, pairs) {
    i <- pairs[x, 1]  # ith column of df1
    j <- pairs[x, 2]  # jth column of df2
    # add paired columns
    tmp <- df1[i] + df2[j]
    # set new column name
    names(tmp)[1] <- paste(names(df1)[i], names(df2)[j], sep = "")
    # return column
    tmp
  }
  # generate column pairings
  pairs <- expand.grid(seq_along(df1), seq_along(df2))
  # for each column pair, doOp
  data.frame(sapply(seq_len(nrow(pairs)), doOp, df1, df2, pairs))
}
df3 <- byColumnAdditionAllPairs(df1, df2)
Thanks, Zach
Here's one way. It has elements of some of the other answers...
z <- outer(colnames(df1), colnames(df2), function(c1,c2) df1[,c1] + df2[,c2])
colnames(z) <- outer(colnames(df1), colnames(df2), paste, sep = '')
> z
  ax bx ay by az bz
1 11 13 13 15 15 17
2 13 15 15 17 17 19
Get the names of all pairs of columns using expand.grid.
col_pairs <- expand.grid(colnames(df1), colnames(df2))
Now apply your addition function to each pair of columns:
col_sums <- apply(col_pairs, 1L, function(x) df1[, x["Var1"]] + df2[, x["Var2"]])
Finally, fix up the column names:
col_names <- apply(col_pairs, 1L, function(x) paste(x, collapse = ""))
colnames(col_sums) <- col_names
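Note that col_sums is a matrix rather than a data frame. A minimal sketch of the steps above, run end to end and converted to a data frame like the df3 in the question; it assumes df1 and df2 as defined there, and stringsAsFactors = FALSE and paste0() are additions made here to keep the pair names as plain character strings:

```r
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(x = 10:11, y = 12:13, z = 14:15)

# all (df1 column, df2 column) name pairs; Var1 varies fastest
col_pairs <- expand.grid(colnames(df1), colnames(df2), stringsAsFactors = FALSE)

# one column of sums per pair; apply() binds them into a matrix
col_sums <- apply(col_pairs, 1L, function(x) df1[, x["Var1"]] + df2[, x["Var2"]])
colnames(col_sums) <- paste0(col_pairs$Var1, col_pairs$Var2)

# convert the matrix to a data frame
df3 <- as.data.frame(col_sums)
```

Because Var1 varies fastest in expand.grid, the columns come out in the order ax, bx, ay, by, az, bz.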
comb <- as.vector(outer(names(df1),names(df2),paste))
df3 <- data.frame(sapply(comb,function(x) df1[strsplit(x," ")[[1]][1]]+df2[strsplit(x," ")[[1]][2]]))
names(df3) <- gsub(" ","",comb)
Which gives:
> df3
  ax bx ay by az bz
1 11 13 13 15 15 17
2 13 15 15 17 17 19
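Pasting the names together and splitting them apart again would misfire if a column name ever contained a space. A minimal sketch that indexes the columns by position instead, and takes the operation as an argument; pairwise_cols and its op parameter are hypothetical names introduced here, not part of the answer above:

```r
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(x = 10:11, y = 12:13, z = 14:15)

# hypothetical helper: apply a binary function to every
# (df1 column, df2 column) pair, indexing columns by position
pairwise_cols <- function(df1, df2, op = `+`) {
  # i varies fastest, so the order is ax, bx, ay, by, az, bz
  pairs <- expand.grid(i = seq_along(df1), j = seq_along(df2))
  cols <- Map(function(i, j) op(df1[[i]], df2[[j]]), pairs$i, pairs$j)
  names(cols) <- paste0(names(df1)[pairs$i], names(df2)[pairs$j])
  as.data.frame(cols)
}

df3 <- pairwise_cols(df1, df2)        # column-wise sums
prods <- pairwise_cols(df1, df2, `*`) # same pairing, different operation
```

Any vectorized two-argument function can be passed as op, so the pairing logic does not have to be rewritten for each operation.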
A slightly different approach is to use outer():
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(x = 10:11, y = 12:13, z = 14:15)
m1 <- data.matrix(df1)
m2 <- data.matrix(df2)
t(sapply(1:2, function(x, m1, m2) outer(m1[x,], m2[x,], "+"), m1 = m1, m2 = m2))
which gives:
> t(sapply(1:2, function(x, m1, m2) outer(m1[x,], m2[x,], "+"), m1 = m1, m2 = m2))
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   11   13   13   15   15   17
[2,]   13   15   15   17   17   19
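The call above hard-codes the two rows and leaves the columns unnamed. A minimal sketch of the same row-wise outer() idea that loops over however many rows there are and adds the paired names; it assumes both data frames have the same number of rows, and the name res is introduced here for illustration:

```r
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(x = 10:11, y = 12:13, z = 14:15)
m1 <- data.matrix(df1)
m2 <- data.matrix(df2)

# one outer() sum per row; as.vector() flattens each result column by column
res <- t(sapply(seq_len(nrow(m1)),
                function(i) as.vector(outer(m1[i, ], m2[i, ], "+"))))

# build the names in the same column-major order: ax, bx, ay, by, az, bz
colnames(res) <- as.vector(outer(colnames(m1), colnames(m2), paste0))
```

The result is still a matrix; as.data.frame(res) converts it if a data frame is needed.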