开发者

R: calculating column sums & row sums as an aggregation from a dataframe

I am trying to obtain column sums & row sums by aggregating with previous row sums & column sums.

For eg.

My initial data frame is:

      Flag1   Flag2  Flag3   Type1 Type2 Type3
1     Level1    A    FIRST     2     0    0
2     Level1    A    SECOND    1     9    0
3     Level1    A    THIRD     3     7    0
4     Level1    A    FOURTH    9    18    0
5     Level1    A    FIFTH     1    22    0
6     Level1    A    SIXTH     1    13    0
7     Level1    B    FIRST     0     0    0
8     Level1    B    SECOND    3     9    0
9     Level1    B    THIRD     5    85    0
10    Level1    B    FOURTH    4    96    0
11    Level1    B    FIFTH     3    40    0
12    Level1    B    SIXTH     0    17    0
22    Level2    A    FIRST     2     0    0
23    Level2    A    SECOND    1     9    0
24    Level2    A    THIRD     3     7    0
25    Level2    A    FOURTH    9    18    0
26    Level2    A    FIFTH     1    22    0
27    Level2    A    SIXTH     1    13    0
28    Level2    B    FIRST     0     0    0
29    Level2    B    SECOND    3     9    0
30    Level2    B    THIRD     5    85    0
31    Level2    B    FOURTH    4    96    0
32    Level2    B    FIFTH     3    40    0
33    Level2    B    SIXTH     0    17    0
34    Level3    A    FIRST     2     0    0
35    Level3    A    SECOND    1     9    0
36    Level3    A    THIRD     3     7    0
37    Level3    A    FOURTH    9    18    0
38    Level3    A    FIFTH     1    22    0
39    Level3    A    SIXTH     1    13    0
40    Level3    B    FIRST     0     0    0
41    Level3    B    SECOND    3     9    0
42    Level3    B    THIRD     开发者_StackOverflow社区5    85    0
43    Level3    B    FOURTH    4    96    0
44    Level3    B    FIFTH     3    40    0
45    Level3    B    SIXTH     0    17    0

My desired result is:

    Flag1   Flag2   Flag3   Type1   Type2   Type3   Sum
1   Level1  A   FIRST   2   0   0   2
2   Level1  A   SECOND  1   9   0   12
3   Level1  A   THIRD   3   7   0   22
4   Level1  A   FOURTH  9   18  0   49
5   Level1  A   FIFTH   1   22  0   72
6   Level1  A   SIXTH   1   13  0   86
7   Level1  B   FIRST   0   0   0   0
8   Level1  B   SECOND  3   9   0   12
9   Level1  B   THIRD   5   85  0   102
10  Level1  B   FOURTH  4   96  0   202
11  Level1  B   FIFTH   3   40  0   245
12  Level1  B   SIXTH   0   17  0   262
13  Level1  (all)   FIRST   2   0   0   2
14  Level1  (all)   SECOND  4   18  0   24
15  Level1  (all)   THIRD   8   92  0   124
16  Level1  (all)   FOURTH  13  114 0   251
17  Level1  (all)   FIFTH   4   62  0   317
18  Level1  (all)   SIXTH   1   30  0   348
19  Level1  A   (all)   17  68  0   85
20  Level1  B   (all)   15  247 0   262
21  Level1  (all)   (all)   32  315 0   347
22  Level2  A   FIRST   2   0   0   2
23  Level2  A   SECOND  1   9   0   12
24  Level2  A   THIRD   3   7   0   22
25  Level2  A   FOURTH  9   18  0   49
26  Level2  A   FIFTH   1   22  0   72
27  Level2  A   SIXTH   1   13  0   86
28  Level2  B   FIRST   0   0   0   0
29  Level2  B   SECOND  3   9   0   12
30  Level2  B   THIRD   5   85  0   102
31  Level2  B   FOURTH  4   96  0   202
32  Level2  B   FIFTH   3   40  0   245
33  Level2  B   SIXTH   0   17  0   262
34  Level2  (all)   FIRST   2   0   0   2
35  Level2  (all)   SECOND  4   18  0   24
36  Level2  (all)   THIRD   8   92  0   124
37  Level2  (all)   FOURTH  13  114 0   251
38  Level2  (all)   FIFTH   4   62  0   317
39  Level2  (all)   SIXTH   1   30  0   348
40  Level2  A   (all)   17  68  0   85
41  Level2  B   (all)   15  247 0   262
42  Level2  (all)   (all)   32  315 0   347
43  Level3  A   FIRST   2   0   0   2
44  Level3  A   SECOND  1   9   0   12
45  Level3  A   THIRD   3   7   0   22
46  Level3  A   FOURTH  9   18  0   49
47  Level3  A   FIFTH   1   22  0   72
48  Level3  A   SIXTH   1   13  0   86
49  Level3  B   FIRST   0   0   0   0
50  Level3  B   SECOND  3   9   0   12
51  Level3  B   THIRD   5   85  0   102
52  Level3  B   FOURTH  4   96  0   202
53  Level3  B   FIFTH   3   40  0   245
54  Level3  B   SIXTH   0   17  0   262
55  Level3  (all)   FIRST   2   0   0   2
56  Level3  (all)   SECOND  4   18  0   24
57  Level3  (all)   THIRD   8   92  0   124
58  Level3  (all)   FOURTH  13  114 0   251
59  Level3  (all)   FIFTH   4   62  0   317
60  Level3  (all)   SIXTH   1   30  0   348
61  Level3  A   (all)   17  68  0   85
62  Level3  B   (all)   15  247 0   262
63  Level3  (all)   (all)   32  315 0   347
64  (all)   A   FIRST   6   0   0   6
65  (all)   A   SECOND  9   27  0   42
66  (all)   A   THIRD   18  48  0   108
67  (all)   A   FOURTH  45  102 0   255
68  (all)   A   FIFTH   48  168 0   471
69  (all)   A   SIXTH   51  207 0   729
70  (all)   B   FIRST   0   0   0   0
71  (all)   B   SECOND  9   27  0   36
72  (all)   B   THIRD   24  282 0   342
73  (all)   B   FOURTH  36  570 0   948
74  (all)   B   FIFTH   45  690 0   1683
75  (all)   B   SIXTH   45  741 0   2469
76  (all)   A   (all)   51  552 0   1611
77  (all)   B   (all)   45  2310    0   5478
78  (all)   (all)   (all)   96  2862    0   7089

Thanks


It's not clear what you want here, since you mention "aggregating" and "previous" enitities, but if you want the rowsums followed by the colsums for a dataframe named dfrm, then this works:

dfrm$totals <- rowSums(dfrm[, 4:6])
dfrmT <- rbind(dfrm, data.frame(Flag1="all", Flag2="all", Flag3="all", 
                                  t( colSums(dfrm[, 4:7]) )
                ) )
dfrmT

EDIT: So you did want the row sums and then within levels of Flag1 and Flag2 you wanted cumsums: (And if you want the multiple subtotals, that was answered in your earlier question.) R: calculating margins or row & col sums for a data frame

dfrmT$CS <- ave(   dfrmT$totals, list(dfrmT$Flag1, dfrmT$Flag2), FUN=cumsum)
dfrmT

     Flag1 Flag2  Flag3 Type1 Type2 Type3 totals  CS
1   Level1     A  FIRST     2     0     0      2   2
2   Level1     A SECOND     1     9     0     10  12
3   Level1     A  THIRD     3     7     0     10  22
4   Level1     A FOURTH     9    18     0     27  49
5   Level1     A  FIFTH     1    22     0     23  72
6   Level1     A  SIXTH     1    13     0     14  86
7   Level1     B  FIRST     0     0     0      0   0
8   Level1     B SECOND     3     9     0     12  12
9   Level1     B  THIRD     5    85     0     90 102
10  Level1     B FOURTH     4    96     0    100 202
11  Level1     B  FIFTH     3    40     0     43 245
12  Level1     B  SIXTH     0    17     0     17 262
22  Level2     A  FIRST     2     0     0      2   2
23  Level2     A SECOND     1     9     0     10  12
24  Level2     A  THIRD     3     7     0     10  22
25  Level2     A FOURTH     9    18     0     27  49
26  Level2     A  FIFTH     1    22     0     23  72
27  Level2     A  SIXTH     1    13     0     14  86
28  Level2     B  FIRST     0     0     0      0   0
29  Level2     B SECOND     3     9     0     12  12
30  Level2     B  THIRD     5    85     0     90 102
31  Level2     B FOURTH     4    96     0    100 202
32  Level2     B  FIFTH     3    40     0     43 245
33  Level2     B  SIXTH     0    17     0     17 262
34  Level3     A  FIRST     2     0     0      2   2
35  Level3     A SECOND     1     9     0     10  12
36  Level3     A  THIRD     3     7     0     10  22
37  Level3     A FOURTH     9    18     0     27  49
38  Level3     A  FIFTH     1    22     0     23  72
39  Level3     A  SIXTH     1    13     0     14  86
40  Level3     B  FIRST     0     0     0      0   0
41  Level3     B SECOND     3     9     0     12  12
42  Level3     B  THIRD     5    85     0     90 102
341    all   all    all    89   795     0    884 884

If on the other hand you want cumsums within each Type with initialization at new Flag1 and Flag2 divisions then:

ave( dfrm[ , grep("Type", names(dfrm))], list(dfrm$Flag1, dfrm$Flag2), FUN=cumsum)
   Type1 Type2 Type3
1      2     0     0
2      3     9     0
3      6    16     0
4     15    34     0
5     16    56     0
6     17    69     0
7      0     0     0
8      3     9     0
9      8    94     0
10    12   190     0
11    15   230     0
12    15   247     0
22     2     0     0
23     3     9     0
24     6    16     0
25    15    34     0
26    16    56     0
27    17    69     0
28     0     0     0
29     3     9     0
30     8    94     0
31    12   190     0
32    15   230     0
33    15   247     0
34     2     0     0
35     3     9     0
36     6    16     0
37    15    34     0
38    16    56     0
39    17    69     0
40     0     0     0
41     3     9     0
42     8    94     0

That is suitable for cbind()-ing to dfrm but needs work on the names():

dfrmCS <- cbind(dfrm, ave( dfrm[ , grep("Type", names(dfrm))], 
                               list(dfrm$Flag1, dfrm$Flag2), FUN=cumsum) )
names(dfrmCS)[8:10] <- paste(names(dfrmCS)[8:10], "CS", sep="_")
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜