R: Iteration by variable
I have the following dataset1:
Height | Group
1,556 | A
2,111 | B
1,556 | A
2,341 | B
1,256 | A
2,411 | B
I would like to run a Shapiro-Wilk normality test on Height for each level of the variable Group:
myvar <- c("Height")
res<- vector("list", length(myvars))
a <- factor(dataset1$Group)
myfactor <- levels(a)
i=1
for (myfactor in dataset1) {
res[[i]] <- shapiro.test(dataset1$Size)
i=i+1
}
res returns n groups of tests, but all with the same p-value and W. Can anyone help me figure out what's wrong?
It is easier to write new code than to find all the errors in yours.
lapply(split(dataset1$Height,dataset1$Group),shapiro.test)
$` A`
Shapiro-Wilk normality test
data: X[[1L]]
W = 0.75, p-value = 3.031e-08
$` B`
Shapiro-Wilk normality test
data: X[[2L]]
W = 0.9134, p-value = 0.4295
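To see why the one-liner works, it can help to look at what split() hands to shapiro.test. This is only a sketch: it rebuilds the six rows from the question by hand as a data frame named dataset1 and assumes the commas in the table are decimal separators. The names by_group, tests and pvals are made up for illustration.

```r
# Toy data rebuilt from the question (assumption: "1,556" means 1.556)
dataset1 <- data.frame(
  Height = c(1.556, 2.111, 1.556, 2.341, 1.256, 2.411),
  Group  = c("A", "B", "A", "B", "A", "B")
)

# split() returns a named list with one numeric vector per group
by_group <- split(dataset1$Height, dataset1$Group)
str(by_group)

# lapply() runs the test once per vector, so each group gets its own result
tests <- lapply(by_group, shapiro.test)

# sapply() pulls the p-value out of each test object
pvals <- sapply(tests, function(tst) tst$p.value)
pvals
```

Because the list returned by split() is named after the group levels, the results stay labelled without any manual counter.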
Your code is hosed in all sorts of ways. Here are a few:
- You create myfactor outside of the loop, but then you make it the iterator. dataset1 is your data (data.frame?). I'm not even sure what myfactor will be inside a loop created by for (myfactor in dataset1).
- You don't subset the data sent to shapiro.test.
- myvars isn't defined and dataset1$Size should probably be dataset1$Height.
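On the first point, a quick experiment suggests what the loop variable becomes. The sketch below assumes dataset1 is a data frame shaped like the table in the question (rebuilt here by hand, with the commas read as decimal separators); n_iter is a made-up counter.

```r
# Toy data rebuilt from the question (assumption: "1,556" means 1.556)
dataset1 <- data.frame(
  Height = c(1.556, 2.111, 1.556, 2.341, 1.256, 2.411),
  Group  = c("A", "B", "A", "B", "A", "B")
)

# A data frame is a list of columns, so the loop variable takes each
# column in turn, not each level of Group
n_iter <- 0
for (myfactor in dataset1) {
  n_iter <- n_iter + 1
  print(class(myfactor))
}
n_iter  # one iteration per column of dataset1
```

If that is what happened in the original code, the loop ran once per column and tested the same unsubsetted vector each time, which would explain the identical W and p-value in every element of res.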
Try this instead.
res <- list()
for (mf in levels(dataset1$Group)) {
res[[mf]] <- shapiro.test(dataset1$Height[dataset1$Group == mf])
}
Thanks for the reply.
For future reference:
If you wish to compute (for selected variables in a dataset) a normality test by factor:
variaveis <- colnames(dataset1)[1:2]
# alternative: variaveis <- c("height", "weight")
res <- vector("list", length(variaveis))
for (i in seq_along(variaveis)) {
  # compute the Shapiro-Wilk test by factor for the selected variables
  res[[i]] <- lapply(split(dataset1[, variaveis[i]], dataset1$sex), shapiro.test)
}
res
PS: sex here plays the role of Group in the previous example.
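If only the p-values are needed, the same idea can be collapsed into a small table. This is a sketch, not the code used above: it swaps in R's built-in iris data so that it runs as-is, with Species standing in for sex and two of the iris measurement columns standing in for the selected variables. The names vars and pvals are made up for illustration.

```r
# Built-in iris data used as a stand-in (assumption: Species plays the role of sex)
vars <- c("Sepal.Length", "Sepal.Width")

# Outer sapply() loops over variables, inner sapply() over groups;
# the result is a matrix with one row per group and one column per variable
pvals <- sapply(vars, function(v) {
  sapply(split(iris[[v]], iris$Species), function(x) shapiro.test(x)$p.value)
})
pvals
```

Since the vector passed to the outer sapply() is a character vector, the columns are named after the variables automatically, and the rows take the group names produced by split().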
Thanks again.
I hope this helps others shorten their code.
M.