Operate on pairs of rows of a data frame
I've got a data frame in R, and I'd like to perform a calculation on all pairs of rows. Is there a simpler way to do this than using a nested for loop?
To make this concrete, consider a data frame with ten rows, where I want to calculate the difference in scores between all 45 possible pairs.
> data.frame(ID=1:10,Score=4*10:1)
   ID Score
1   1    40
2   2    36
3   3    32
4   4    28
5   5    24
6   6    20
7   7    16
8   8    12
9   9     8
10 10     4
I know I could do this calculation with a nested for loop, but is there a better (more R-ish) way to do it?
To calculate the differences, perhaps you could use
outer(df$Score,df$Score,"-")
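Note that outer() returns the full 10 x 10 matrix, so every pair appears twice (with opposite signs) and the diagonal is zero. If only the 45 unique pairs are wanted, something along these lines should work. This is a minimal sketch that assumes ID coincides with the row number, as in the example data; the names d, idx and pairs are just illustrative.

```r
df <- data.frame(ID = 1:10, Score = 4 * 10:1)

# Full matrix of differences: d[i, j] = Score[i] - Score[j]
d <- outer(df$Score, df$Score, "-")

# Positions strictly below the diagonal (i > j): one entry per unordered pair
idx <- which(lower.tri(d), arr.ind = TRUE)

# Long-format table of the unique pairs and their differences
pairs <- data.frame(ID1  = df$ID[idx[, "row"]],
                    ID2  = df$ID[idx[, "col"]],
                    Diff = d[idx])
```

Use upper.tri() instead of lower.tri() to get the differences with the opposite sign.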
Here is another solution using combn:
df <- data.frame(ID=1:10,Score=4*10:1)
cm <- combn(df$ID,2)
delta <- df$Score[cm[1,]]-df$Score[cm[2,]]
Or, more directly:
df <- data.frame(ID=1:10,Score=4*10:1)
delta <- combn(df$ID,2,function(x) df$Score[x[1]]-df$Score[x[2]])
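One thing to watch with the combn approach: indexing df$Score by values of df$ID only works because ID happens to equal the row number here. A slightly more defensive sketch generates pairs of row indices instead and keeps the IDs next to each difference (the name res is illustrative, not from the answer above):

```r
df <- data.frame(ID = 1:10, Score = 4 * 10:1)

# All pairs of row indices, one pair per column (2 x 45 matrix)
cm <- combn(nrow(df), 2)

# One row per pair: the two IDs and the difference of their scores
res <- data.frame(ID1   = df$ID[cm[1, ]],
                  ID2   = df$ID[cm[2, ]],
                  Delta = df$Score[cm[1, ]] - df$Score[cm[2, ]])
head(res, 3)
```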
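# Build two n x n matrices: scores repeated down the columns and across the rows
n <- nrow(df)
colmx <- matrix(rep(df$Score, n), ncol=n, byrow=FALSE)
rowmx <- matrix(rep(df$Score, n), ncol=n, byrow=TRUE)
# delta[i, j] is Score[i] - Score[j]
delta <- colmx - rowmx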
dist() is your friend.
dist(df$Score)
You can also convert it to a full matrix:
as.matrix( dist(df$Score) )
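One caveat: dist() computes distances (Euclidean by default), so for a single numeric vector it gives the absolute differences and the sign is lost. A quick check against outer(), as a sketch with illustrative variable names:

```r
df <- data.frame(ID = 1:10, Score = 4 * 10:1)

dd <- dist(df$Score)                       # compact "dist" object
m <- as.matrix(dd)                         # full symmetric 10 x 10 matrix
signed <- outer(df$Score, df$Score, "-")   # signed differences for comparison

# The dist matrix matches the absolute value of the signed differences
same <- isTRUE(all.equal(m, abs(signed), check.attributes = FALSE))

# The compact object stores one value per unordered pair
n_pairs <- length(dd)
```

If the direction of the difference matters, outer() or combn() is the safer choice.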