I have loaded a dataset, D, into R and I would like to tabulate the frequency of every variable in D against D$binary_outcome. How do I do that?
I would like the code to be fairly generic: D may have any number of variables, and the code should handle all of them.
In effect, I want to be able to do something like this:
d = read.csv("c:/d.csv")
d.freq.varA = table(d$varA,d$binary_outcome)
d.freq.varB = table(d$varB,d$binary_outcome)
...
d.freq.varZZZ = table(d$varZZZ,d$binary_outcome)
for all variables A to ZZZ in d.
I think this should get you somewhere. It might look better in a loop.
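# Tabulate each column whose name contains "var" against the outcome
# and store each table in the global environment under the column's name.
lapply(names(d)[grep('var', names(d))],
       function(name) {
         assign(name, table(d[, name], d$binary_outcome),
                envir = .GlobalEnv)
       }
)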
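A variant of the same idea, as a rough sketch: it tabulates every column except the outcome, so it does not rely on the column names containing 'var', and it returns a named list rather than writing into the global environment. The toy data frame and the names `predictors` and `freq_by_var` are made up for illustration and are not from the question.

```r
# Toy data standing in for read.csv("c:/d.csv"); the values are invented.
d <- data.frame(
  varA = c("x", "y", "x", "y", "x"),
  varB = c("lo", "lo", "hi", "hi", "hi"),
  binary_outcome = c(0, 1, 1, 0, 1)
)

# Every column except the outcome itself.
predictors <- setdiff(names(d), "binary_outcome")

# One two-way table per predictor, collected in a list named by column.
freq_by_var <- lapply(d[predictors], function(column) {
  table(column, d$binary_outcome)
})

freq_by_var$varA   # same counts as table(d$varA, d$binary_outcome)
```

Keeping the tables in one list means they can be looped over or printed together, and the number of columns in d does not matter.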
Does every variable have the same levels? If so, you can reshape::melt()
the data first and then create one multidimensional table.
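library(reshape)  # provides melt()
# Long format: one row per (observation, variable) pair.
d.m <- melt(d, id = "binary_outcome")
freq.all.vars <- with(d.m, table(binary_outcome, value, variable))
# Slice out the two-way table for a single variable.
freq.var.a <- freq.all.vars[,,"varA"]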
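To make the long-format idea concrete without the reshape dependency, here is a base-R sketch that builds the same kind of long table by hand and then calls table() on it. The toy data and the names `vars` and `d.long` are invented for illustration; it assumes, as above, that the predictors share the same levels.

```r
# Toy data; both predictors share the levels "no"/"yes" (invented values).
d <- data.frame(
  varA = c("yes", "no", "yes", "no"),
  varB = c("no", "no", "yes", "yes"),
  binary_outcome = c(1, 0, 1, 1)
)
vars <- setdiff(names(d), "binary_outcome")

# Long format: one row per (observation, variable) pair.
d.long <- data.frame(
  binary_outcome = rep(d$binary_outcome, times = length(vars)),
  variable = rep(vars, each = nrow(d)),
  value = unlist(lapply(d[vars], as.character), use.names = FALSE)
)

# Three-way table: outcome by value by variable.
freq.all.vars <- with(d.long, table(binary_outcome, value, variable))
freq.all.vars[, , "varA"]   # the two-way slice for varA
```

If the variables do not share levels, the three-way table still builds, but each slice will contain empty rows or columns for levels that belong to other variables.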