Aggregate R sum

I'm writing my first program in R and, as a newbie, I'm having some trouble. I hope you can help me.

I've got a data frame like this:

> v1<-c(1,1,2,3,3,3,4)
> v2<-c(13,5,15,1,2,7,4)
> v3<-c(0,3,6,13,8,23,5)
> v4<-c(26,25,11,2,8,1,0)
> datos<-data.frame(v1,v2,v3,v4)
> names(datos)<-c("Position","a1","a2","a3")

> datos
  Position a1 a2 a3
1        1 13  0 26
2        1  5  3 25
3        2 15  6 11
4        3  1 13  2
5        3  2  8  8
6        3  7 23  1
7        4  4  5  0

What I need is to sum the data in a1, a2 and a3 (in my real case, a1 through a51), grouped by Position. I'm trying the aggregate() function, but I can only get it to work for means, not for sums, and I don't know why.

Thanks in advance


You need to tell aggregate() to use sum by passing it as the function argument (FUN); aggregate() applies whatever function you give it to each group. For example:

aggregate(datos[,c("a1","a2","a3")], by=list(datos$Position), "sum")
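
To scale this to the real case (a1 through a51) without typing every column name, the columns can be picked by pattern. A minimal sketch, assuming the columns to sum are exactly those named "a" followed by digits; the name Position inside by is only there to label the grouping column, which would otherwise be called Group.1:

```r
# Sample data from the question
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  a3 = c(26, 25, 11, 2, 8, 1, 0)
)

# Assumption: the columns to sum are named "a" followed by digits (a1 ... a51)
a_cols <- grep("^a[0-9]+$", names(datos), value = TRUE)

# Naming the list element labels the grouping column in the result
sums <- aggregate(datos[a_cols], by = list(Position = datos$Position), FUN = sum)
print(sums)
#   Position a1 a2 a3
# 1        1 18  3 51
# 2        2 15  6 11
# 3        3 10 44 11
# 4        4  4  5  0
```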


This is fairly straightforward with the plyr package.

library("plyr")
ddply(datos, .(Position), colwise(sum))

If you have additional non-numeric columns that shouldn't be summed, you can use

ddply(datos, .(Position), numcolwise(sum))
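
If plyr is not available, much the same effect can be had in base R by selecting the numeric columns first. This is a rough sketch, not plyr's own implementation; the label column is a hypothetical non-numeric column added only for illustration:

```r
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  label = c("x", "y", "x", "y", "x", "y", "x"),  # hypothetical text column
  stringsAsFactors = FALSE
)

# Keep only the numeric columns, then drop the grouping column itself
is_num <- vapply(datos, is.numeric, logical(1))
num_cols <- setdiff(names(datos)[is_num], "Position")

# Sum the remaining numeric columns within each Position
num_sums <- aggregate(datos[num_cols], by = list(Position = datos$Position), FUN = sum)
print(num_sums)
```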


ag_df <- aggregate(. ~ Position, data = datos, FUN = sum)

should give you a data frame containing the sums of the "a" values for each of the positions. The trick here is that the . in the formula stands for all the other ("non-grouping") columns of the data frame.

Note that you can get much the same result with:

sumdf <- rowsum(datos, datos$Position, na.rm = TRUE)

Except that this also sums the Position column itself!
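
One way around that is to leave the grouping column out of the data passed to rowsum(). A small sketch, assuming Position is the only column that should not be summed; note that the groups end up in the row names rather than in a column:

```r
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  a3 = c(26, 25, 11, 2, 8, 1, 0)
)

# Everything except the grouping column gets summed
value_cols <- setdiff(names(datos), "Position")

# The grouping vector is passed separately, so Position is not summed
group_sums <- rowsum(datos[value_cols], group = datos$Position)
print(group_sums)
#   a1 a2 a3
# 1 18  3 51
# 2 15  6 11
# 3 10 44 11
# 4  4  5  0
```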

If you DON'T want all non-group columns aggregated, you can use cbind as in:

sumdf1 <- aggregate(cbind(a1, a3) ~ Position, data = datos, FUN = sum)

That sums only the a1 and a3 columns.
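
For reference, here are the two formula forms side by side on the sample data. This is only a sketch; it assumes every column other than Position is numeric, since . would otherwise pull non-numeric columns into sum():

```r
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  a3 = c(26, 25, 11, 2, 8, 1, 0)
)

# "." expands to every column not already named in the formula
all_sums <- aggregate(. ~ Position, data = datos, FUN = sum)

# cbind() restricts the left-hand side to the listed columns
some_sums <- aggregate(cbind(a1, a3) ~ Position, data = datos, FUN = sum)

print(names(all_sums))   # columns: Position, a1, a2, a3
print(names(some_sums))  # columns: Position, a1, a3
```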
