R: loop to merge dataframes and cbind these
I have a question similar to "R: merge unequal dataframes and replace missing rows with 0".
Here is the data for this question:
df1 = data.frame(x=c('a', 'b', 'c', 'd', 'e'))
df2 = data.frame(x=c('a', 'b', 'c'),y = c(0,1,0))
df3 = data.frame(x=c('a', 'b', 'c', 'd'),y = c(1,1,1,0))
df4 = data.frame(x=c('b', 'a', 'e'),y = c(0,1,0))
zz <- merge(df1, df2, all = TRUE)
zz[is.na(zz)] <- 0
In this example I merged df1 with df2. Now I want to create a loop that merges df3, df4, and further dataframes with df1. The problem is that the results must then be combined with cbind into one final dataframe.
Can anyone help me?
Thanks!
EDIT: This is the loop I created. The variable goterms contains a list of 10 names. These are the names of the lists in interestedGO. In each iteration one element of interestedGO is selected, and the outcome of the calculation is stored in the variable result. This result must be merged with x. Because it is a loop, all 10 results must be combined with cbind to create a final dataframe.
for (i in 1:length(goterms)){
goilmn<-as.data.frame(interestedGO[i])
resultILMN<-match(goilmn[,1], rownames(xx2),nomatch=0)
resultILMN[resultILMN] <- 1
result<-cbind(goilmn,resultILMN)
colnames(result) <- c('x','result')
zz<-merge(x, result, all=TRUE)
resultloop<-zz[is.na(zz)]<-0
standard[i]<-cbind(resultloop)
}
goterms:
[1] "GO:0009611" "GO:0007596" "GO:0050817" "GO:0061082" "GO:0007599"
[6] "GO:0050776" "GO:0006910" "GO:0034383" "GO:0019932" "GO:0002720"
interestedGO:
$`GO:0009611`
[1] "ILMN_1651346" "ILMN_1651354" "ILMN_1651599" "ILMN_1651950" "ILMN_1652287"
[6] "ILMN_1652445" "ILMN_1652693" "ILMN_1652825" "ILMN_1653324" "ILMN_1653395"
$`GO:0007596`
[1] "ILMN_1651599" "ILMN_1652693" "ILMN_1652825" "ILMN_1653324" "ILMN_1655595"
[6] "ILMN_1656057" "ILMN_1659077" "ILMN_1659923" "ILMN_1659947" "ILMN_1662619"
[11] "ILMN_1664565" "ILMN_1665132" "ILMN_1665859" "ILMN_1666175" "ILMN_1668052"
[16] "ILMN_1670229" "ILMN_1670305" "ILMN_1670490" "ILMN_1670708" "ILMN_1671766"
$`GO:0050817`
[1] "ILMN_1651599" "ILMN_1652693" "ILMN_1652825" "ILMN_1653324" "ILMN_1655595"
[6] "ILMN_1656057" "ILMN_1659077" "ILMN_1659923" "ILMN_1659947" "ILMN_1662619"
[11] "ILMN_1664565" "ILMN_1665132" "ILMN_1665859" "ILMN_1666175" "ILMN_1668052"
[16] "ILMN_1670229" "ILMN_1670305" "ILMN_1670490" "ILMN_1670708" "ILMN_1671766"
[21] "ILMN_1671928" "ILMN_1675083" "ILMN_1678049" "ILMN_1678728" "ILMN_1680805"
$`GO:0061082`
[1] "ILMN_1661695" "ILMN_1665132" "ILMN_1716446" "ILMN_1737314" "ILMN_1772387"
[6] "ILMN_1784863" "ILMN_1796094" "ILMN_1800317" "ILMN_1800512" "ILMN_1807074"
x is a reference list of all ILMN codes. Here is the head of the x variable, from x[1:100,]:
[1] ILMN_1343291 ILMN_1343295 ILMN_1651228 ILMN_1651229 ILMN_1651238
[6] ILMN_1651254 ILMN_1651259 ILMN_1651260 ILMN_1651262 ILMN_1651278
[11] ILMN_1651282 ILMN_1651285 ILMN_1651286 ILMN_1651303 ILMN_1651310
[16] ILMN_1651315 ILMN_1651330 ILMN_1651336 ILMN_1651343 ILMN_1651346
[21] ILMN_1651347 ILMN_1651354 ILMN_1651358 ILMN_1651370 ILMN_1651373
[26] ILMN_1651385 ILMN_1651396 ILMN_1651415 ILMN_1651428 ILMN_1651430
[31] ILMN_1651433 ILMN_1651437 ILMN_1651438 ILMN_1651456 ILMN_1651457
I'm not sure I understand exactly what you want, but is it something like this?
> zz <- Reduce(function(a,b)merge(a,b,all=TRUE, by="x"), list(df1, df2, df3, df4))
> zz[is.na(zz)] <- 0
> zz
x y.x y.y y
1 a 0 1 1
2 b 1 1 0
3 c 0 1 0
4 d 0 0 0
5 e 0 0 0
You can avoid an explicit loop by using Reduce, but note that this does not necessarily improve performance.
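For comparison, the same merge can be written as a plain for loop, which is what the question asks for. This is only a minimal sketch using the data frames from the question; the list name dfs and the idea of renaming each y column after its data frame (to avoid the y.x / y.y suffixes) are illustrative choices, not something from the original post:

```r
df1 <- data.frame(x = c('a', 'b', 'c', 'd', 'e'))
df2 <- data.frame(x = c('a', 'b', 'c'), y = c(0, 1, 0))
df3 <- data.frame(x = c('a', 'b', 'c', 'd'), y = c(1, 1, 1, 0))
df4 <- data.frame(x = c('b', 'a', 'e'), y = c(0, 1, 0))

# Data frames to merge onto df1, in a named list (names are illustrative).
dfs <- list(df2 = df2, df3 = df3, df4 = df4)

zz <- df1
for (nm in names(dfs)) {
  # Rename the value column after its list name so merge() has no name clashes.
  d <- setNames(dfs[[nm]], c("x", nm))
  # Keep every row of both sides; unmatched entries become NA.
  zz <- merge(zz, d, all = TRUE, by = "x")
}

# Replace the NAs introduced by the outer merges with 0.
zz[is.na(zz)] <- 0
zz
```

The result has one row per value of x and one column per merged data frame (df2, df3, df4), so no separate cbind step is needed afterwards.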
If you want separate dataframes, then Map (just a wrapper around mapply) is useful:
> zz <- Map(function(b)merge(df1,b,all=TRUE, by="x"), list(df2, df3, df4))
> zz
[[1]]
x y
1 a 0
2 b 1
3 c 0
4 d NA
5 e NA
[[2]]
x y
1 a 1
2 b 1
3 c 1
4 d 0
5 e NA
[[3]]
x y
1 a 1
2 b 0
3 c NA
4 d NA
5 e 0
and then cbind them with do.call:
> zz <- do.call("cbind", zz)
> zz[is.na(zz)] <- 0
> zz
x y x y x y
1 a 0 a 1 a 1
2 b 1 b 1 b 0
3 c 0 c 1 c 0
4 d 0 d 0 d 0
5 e 0 e 0 e 0
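For the data in the edit, where interestedGO is a named list of ID vectors and x holds every reference ID, merging may not be needed at all: a membership test per GO term yields one 0/1 column directly. The following is a rough sketch under assumptions: x is treated as a plain character vector, the data are a tiny made-up subset of the IDs shown in the question, and the extra match against rownames(xx2) from the original loop is left out because xx2 is not shown.

```r
# Toy stand-ins for the objects in the question (assumed shapes, shortened data).
x <- c("ILMN_1651346", "ILMN_1651354", "ILMN_1651599",
       "ILMN_1652693", "ILMN_1661695")
interestedGO <- list(
  "GO:0009611" = c("ILMN_1651346", "ILMN_1651354", "ILMN_1651599"),
  "GO:0007596" = c("ILMN_1651599", "ILMN_1652693"),
  "GO:0061082" = c("ILMN_1661695")
)

# For each GO term, flag which reference IDs belong to it (1) or not (0).
# sapply() returns a matrix with one column per GO term.
membership <- sapply(interestedGO, function(ids) as.integer(x %in% ids))

# One row per reference ID, one column per GO term;
# check.names = FALSE keeps the "GO:..." column names unchanged.
standard <- data.frame(x = x, membership, check.names = FALSE)
standard
```

Because every column is computed against the same x, the rows line up by construction and there are no NAs to replace afterwards.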