I have loaded a dataset, D, into R and I would like to produce a frequency table of every variable in D against D$binary_outcome. How do I do that?

I would like code that is fairly generic: D may have any number of variables, and the code should handle all of them.

In effect I want to be able to do something like

d = read.csv("c:/d.csv")
d.freq.varA = table(d$varA,d$binary_outcome)
d.freq.varB = table(d$varB,d$binary_outcome)
...
d.freq.varZZZ = table(d$varZZZ,d$binary_outcome)

for all variables A to ZZZ in d.


I think this should get you somewhere. It might look better in a loop.

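# Cross-tabulate every column whose name contains "var" against the outcome
# and store each table in the global environment as d.freq.<name>.
invisible(lapply(grep("var", names(d), value = TRUE),
       function(name) {
         assign(paste0("d.freq.", name),
                table(d[[name]], d$binary_outcome),
                envir = .GlobalEnv)
       }))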
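
If the variable names do not share a common pattern such as 'var', one alternative is to take every column except the outcome and keep the tables in a named list instead of assigning them into the global environment. The following is only a sketch: the toy data frame and the names predictors and d.freq are made up for illustration and are not from the question.

```r
# Toy data standing in for d; column values are invented for illustration.
d <- data.frame(
  varA = c("x", "y", "x", "y", "x"),
  varB = c("p", "p", "q", "q", "q"),
  binary_outcome = c(0, 1, 1, 0, 1)
)

# Every column except the outcome itself.
predictors <- setdiff(names(d), "binary_outcome")

# One two-way table per predictor, collected in a list named after the columns.
d.freq <- lapply(d[predictors], function(x) table(x, d$binary_outcome))

# Holds the same counts as table(d$varA, d$binary_outcome).
d.freq$varA
```

A list keeps the workspace tidy and lets you loop over the results later, for example with lapply(d.freq, prop.table).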


Does every variable have the same levels? If so, you can reshape::melt() the data first and then create one multidimensional table.

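library(reshape)  # provides melt()
# Long format: one row per original cell, keyed by binary_outcome
d.m <- melt(d, id = "binary_outcome")
# Three-way table: outcome x level x variable
freq.all.vars <- with(d.m, table(binary_outcome, value, variable))

# Two-way slice for a single variable
freq.var.a <- freq.all.vars[,,"varA"]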
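
If the reshape package is not available, the same long format can probably be built by hand in base R. This is a rough sketch under the same assumption that all variables share their levels; the toy data and the names vars and long are hypothetical.

```r
# Toy data standing in for d; both predictors share the levels "x" and "y".
d <- data.frame(
  varA = c("x", "y", "x", "y", "x"),
  varB = c("y", "y", "x", "x", "x"),
  binary_outcome = c(0, 1, 1, 0, 1)
)

vars <- setdiff(names(d), "binary_outcome")

# Stack the predictor columns on top of each other, column by column,
# repeating the outcome once per predictor so the rows stay aligned.
long <- data.frame(
  binary_outcome = rep(d$binary_outcome, times = length(vars)),
  value          = unlist(lapply(d[vars], as.character), use.names = FALSE),
  variable       = rep(vars, each = nrow(d))
)

# Three-way table: outcome x level x variable name.
freq.all.vars <- with(long, table(binary_outcome, value, variable))

# Two-way slice for a single variable.
freq.var.a <- freq.all.vars[, , "varA"]
```

Converting the columns with as.character() before stacking avoids factor codes being concatenated instead of labels.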