How do I add up the values of certain columns when they appear in repeated rows with the same key values?

I am reading a CSV file of customer buys for each week of a movie's viewing window on video on demand. The file consists of several columns for a given movie, including title, price, week number, and the number of buys that occurred in that week. Sometimes a specific movie has the buys for a given week spread across multiple rows (for the same week).

I am trying to manipulate the data so that if there are multiple rows for the same week for a given movie, they are replaced with one row for that week with the buys added up.

I only started looking at R in the last few weeks and have failed miserably in this attempt. Please help!


I think I followed what you wanted. Let me know if this is what you had in mind.

First, make up some data:

movies <- data.frame(mov = c("Foo", rep(c("Foo", "Bar", "Baz"),3))
    , price = c(7, rep(c(7, 5, 2),3))
    , wknumb = c(1, rep(1:3, each = 3))
    , buys = sample(10:20, 10, TRUE))

   mov price wknumb buys
1  Foo     7      1   12
2  Foo     7      1   11
3  Bar     5      1   10
4  Baz     2      1   19
5  Foo     7      2   10
6  Bar     5      2   15
7  Baz     2      2   15
8  Foo     7      3   14
9  Bar     5      3   15
10 Baz     2      3   20

We have two entries for movie "Foo" and week 1 which we need to collapse into a single entry. We can use the plyr package for this:

library(plyr)
ddply(movies, c("mov", "wknumb"), summarize, totbuys = sum(buys))

Which returns a data.frame with 9 rows, one for each movie and week:

  mov wknumb totbuys
1 Bar      1      10
2 Bar      2      15
3 Bar      3      15
4 Baz      1      19
5 Baz      2      15
6 Baz      3      20
7 Foo      1      23
8 Foo      2      10
9 Foo      3      14


A variation using aggregate from base R (and Chase's dataset).

with(movies, aggregate(buys, list(mov = mov, wknumb = wknumb), sum))
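The same idea can also be written with the formula interface of aggregate, which names the result column buys instead of x. This is only a sketch: the buys values are hard-coded from the printed table above so the result is reproducible, the name totals is made up for illustration, and keeping price in the output assumes each movie has a single price across all of its rows.

```r
# Deterministic copy of the example data (buys copied from the printed table)
movies <- data.frame(
  mov    = c("Foo", rep(c("Foo", "Bar", "Baz"), 3)),
  price  = c(7, rep(c(7, 5, 2), 3)),
  wknumb = c(1, rep(1:3, each = 3)),
  buys   = c(12, 11, 10, 19, 10, 15, 15, 14, 15, 20)
)

# Sum buys within each movie/price/week combination.
# Assumption: price is constant per movie, so adding it to the grouping
# keeps the column without creating extra groups.
totals <- aggregate(buys ~ mov + price + wknumb, data = movies, FUN = sum)

# Order by movie and week to make the result easier to read
totals <- totals[order(totals$mov, totals$wknumb), ]
print(totals)
```

If price can differ between rows of the same movie and week, it should be left out of the grouping or summarized separately; otherwise those rows will not be collapsed.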


Many tasks like this can be handled with sqldf if you are familiar with SQL.

library(sqldf)
sqldf("select mov, wknumb, sum(buys) as sumBuys from movies group by wknumb, mov")