Assign pass/fail value based on mean in large dataset

2023-03-22 09:11

This might be a simple question, but I was hoping someone could point me in the right direction. I have a sample dataset:

dfrm <- list(L = c("A","B","P","C","D","E","P","F"), J=c(2,2,1,2,2,2,1,2), K=c(4,3,10,16,21,3,17,2))
 dfrm <-as.data.frame(dfrm)
 dfrm
  L J  K
1 A 2  4
2 B 2  3
3 P 1 10
4 C 2 16
5 D 2 21
6 E 2  3
7 P 1 17
8 F 2  2

Column J specifies the type of the variable stored in K. I want to take the mean of the K values that have a 1 next to them in column J. In this example those are 10 and 17:

vals <- c(10, 17)  # avoid the name T, which R uses as shorthand for TRUE
mean(vals)
[1] 13.5

Next I want to be able to assign a pass/fail rank, where pass = 1, fail = 0 to identify whether the number in column K is larger than the mean.

The final data set should look like:

cdfrm <- list(L = c("A","B","P","C","D","E","P","F"), J=c(2,2,1,2,2,2,1,2), K=c(4,3,10,16,21,3,17,2),C = c(0,0,0,1,1,0,1,0))
cdfrm <-as.data.frame(cdfrm)
 cdfrm
  L J  K C
1 A 2  4 0
2 B 2  3 0
3 P 1 10 0
4 C 2 16 1
5 D 2 21 1
6 E 2  3 0
7 P 1 17 1
8 F 2  2 0

This seems so basic. Sorry, I just don't know what I am overthinking.

There are two steps in the solution. The first is to calculate the mean for the value you are interested in. In other words, take the mean of a subset of values in your data.frame. R has a handy function to calculate subsets, called subset. Here it is in action:

# mean() needs a numeric vector, so extract column K from the subset
meanK <- mean(subset(dfrm, subset=J==1, select=K)$K)
meanK
[1] 13.5

Next, you want to compare column K in your data frame with the mean value we have just calculated. This is a straightforward vector comparison:

dfrm$Pass <- dfrm$K>meanK
dfrm
  L J  K  Pass
1 A 2  4 FALSE
2 B 2  3 FALSE
3 P 1 10 FALSE
4 C 2 16  TRUE
5 D 2 21  TRUE
6 E 2  3 FALSE
7 P 1 17  TRUE
8 F 2  2 FALSE
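
The question asks for 1/0 rather than TRUE/FALSE. The following is a minimal sketch, assuming the sample data from the question, that computes the mean with logical indexing instead of subset and converts the comparison with as.integer. The column name C is taken from the desired output in the question.

```r
# Sample data from the question
dfrm <- data.frame(
  L = c("A", "B", "P", "C", "D", "E", "P", "F"),
  J = c(2, 2, 1, 2, 2, 2, 1, 2),
  K = c(4, 3, 10, 16, 21, 3, 17, 2)
)

# Mean of K over the rows where J is 1
meanK <- mean(dfrm$K[dfrm$J == 1])

# Logical comparison converted to 1 (pass) / 0 (fail)
dfrm$C <- as.integer(dfrm$K > meanK)
dfrm
```

as.integer maps TRUE to 1 and FALSE to 0, so the result has the same shape as the cdfrm data frame shown in the question.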

Here's how to do it in one line:

transform(dfrm, C = K > sapply(split(dfrm$K, dfrm$J), mean)[J])

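split groups the values of K according to the values of J, and sapply(..., mean) calculates the group-wise means. Note that indexing with [J] compares each K with the mean of its own J group, not with the J == 1 mean alone, and it relies on the J values being 1 and 2 so that they line up with the positions of the group means. On this sample the result happens to match the desired output.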
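
If the goal is to compare every K against the mean of the J == 1 rows only, as the question describes, a one-line variant might look like the sketch below. It assumes the sample data from the question. The result name cdfrm and the column name C follow the question's desired output.

```r
# Sample data from the question
dfrm <- data.frame(
  L = c("A", "B", "P", "C", "D", "E", "P", "F"),
  J = c(2, 2, 1, 2, 2, 2, 1, 2),
  K = c(4, 3, 10, 16, 21, 3, 17, 2)
)

# Inside transform(), K and J refer to the columns of dfrm.
# Compare every K against the mean of K over the J == 1 rows only,
# then convert TRUE/FALSE to 1/0.
cdfrm <- transform(dfrm, C = as.integer(K > mean(K[J == 1])))
cdfrm
```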
