
apply() and calculating proportion of first row for all dataframe rows

I have a dataframe as shown below listing the number of injuries by vehicle type:

trqldnum <- data.frame(motorveh=c(796,912,908,880,941,966,989,984),
                       motorcyc=c(257,295,326,313,403,389,474,496),
                       bicyc=c(109,127,125,137,172,146,173,178))
trqldnum

#  motorveh motorcyc bicyc
#1      796      257   109
#2      912      295   127
#3      908      326   125
#4      880      313   137
#5      941      403   172
#6      966      389   146
#7      989      474   173
#8      984      496   178

At the moment I am calculating a proportion of the first row for each vehicle type using:

trqldprop <- t(apply(trqldnum,1,function(x) {
                 x/c(trqldnum[1,1],trqldnum[1,2],trqldnum[1,3])
              }))
trqldprop

#  motorveh motorcyc    bicyc
#1 1.000000 1.000000 1.000000
#2 1.145729 1.147860 1.165138
#3 1.140704 1.268482 1.146789
#4 1.105528 1.217899 1.256881
#5 1.182161 1.568093 1.577982
#6 1.213568 1.513619 1.339450
#7 1.242462 1.844358 1.587156
#8 1.236181 1.929961 1.633028

This seems a bit ugly, and I would need to manually change the denominator in the function if the data changed shape. If I try to use just the following within the apply() call, I end up with the output as a list of lists.

function(x) x/c(trqldnum[1,])

I'd prefer to end up with the dataframe result as above but am just getting in a muddle trying to figure it out.


Convert the data-frame to a matrix and use matrix operations:

m <- as.matrix(trqldnum)

trqldprop <- as.data.frame( t(t(m)/m[1,]) )

> trqldprop
  motorveh motorcyc    bicyc
1 1.000000 1.000000 1.000000
2 1.145729 1.147860 1.165138
3 1.140704 1.268482 1.146789
4 1.105528 1.217899 1.256881
5 1.182161 1.568093 1.577982
6 1.213568 1.513619 1.339450
7 1.242462 1.844358 1.587156
8 1.236181 1.929961 1.633028

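Note that we need to transpose the matrix (see the t(m)) because when you divide a matrix by a vector, R recycles the vector down the columns rather than across the rows. Transposing first lines each column's divisor up with the right elements, and the outer t() restores the original orientation.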
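To make the recycling behaviour concrete, here is a minimal sketch on a small made-up matrix (the values and the names `wrong` and `right` are illustrative only, not from the question). It compares dividing directly by the first row with the double-transpose approach:

```r
# A tiny 2 x 3 matrix (illustrative values, filled column by column):
#      [,1] [,2] [,3]
# [1,]    2    6   10
# [2,]    4    8   12
m <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 2)
first_row <- m[1, ]   # c(2, 6, 10)

# Direct division: first_row is recycled down the columns of m,
# so most elements are paired with the wrong divisor.
wrong <- m / first_row

# Double transpose: in t(m) each column of m becomes a row, so the
# recycled divisors line up; the outer t() flips the result back.
right <- t(t(m) / first_row)

print(wrong)
print(right)
```

In `right` the first row is all ones, which is the check that each column was divided by its own first-row value; `wrong` does not have that property.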


I like the plyr functions for these tasks, as they allow you to specify the format of the output. You can easily turn this into a function that scales to more columns and to different base rows for the division.

FUN <- function(dat, baseRow = 1){
    require(plyr)   
    divisors <- dat[baseRow ,]
    adply(dat, 1, function(x) x / divisors)
}

FUN(trqldnum, 1)

  motorveh motorcyc    bicyc
1 1.000000 1.000000 1.000000
2 1.145729 1.147860 1.165138
3 1.140704 1.268482 1.146789
4 1.105528 1.217899 1.256881
5 1.182161 1.568093 1.577982
6 1.213568 1.513619 1.339450
7 1.242462 1.844358 1.587156
8 1.236181 1.929961 1.633028


How about

sweep(trqldnum,2,unlist(trqldnum[1,]),"/")

?

The unlist is required to convert the first row of the data frame into a vector that can be swept ...
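As a rough illustration of that point, the sketch below rebuilds the data from the question and shows that the first row is still a list-like one-row data frame until it is unlisted. The names `first_row` and `by_sweep` are just for this example:

```r
# Data from the question.
trqldnum <- data.frame(motorveh = c(796, 912, 908, 880, 941, 966, 989, 984),
                       motorcyc = c(257, 295, 326, 313, 403, 389, 474, 496),
                       bicyc    = c(109, 127, 125, 137, 172, 146, 173, 178))

# A single row of a data frame is itself a data frame (a list of columns).
print(is.list(trqldnum[1, ]))

# unlist() flattens it into a plain named numeric vector.
first_row <- unlist(trqldnum[1, ])
print(is.numeric(first_row))

# MARGIN = 2 tells sweep() to apply one divisor per column.
by_sweep <- sweep(trqldnum, 2, first_row, "/")
print(by_sweep)
```

Because the divisors are taken from the data itself, nothing needs to change if columns are added or removed.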


A variation on Prasad's solution that avoids the conversion to a matrix:

trqldnum/trqldnum[1,][rep(1,nrow(trqldnum)),]
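Split into steps, the one-liner looks roughly like this. The intermediate names `denom` and `prop` are introduced here only for illustration. The idea is to repeat the first row once per row of the data, so that numerator and denominator are data frames of the same shape:

```r
# Data from the question.
trqldnum <- data.frame(motorveh = c(796, 912, 908, 880, 941, 966, 989, 984),
                       motorcyc = c(257, 295, 326, 313, 403, 389, 474, 496),
                       bicyc    = c(109, 127, 125, 137, 172, 146, 173, 178))

# Index row 1 repeatedly: one copy of the first row per row of the data.
denom <- trqldnum[1, ][rep(1, nrow(trqldnum)), ]
print(dim(denom))   # same dimensions as trqldnum

# Data frames of equal shape are divided element by element.
prop <- trqldnum / denom
print(prop)
```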