R: Iteration by variable

I have the following dataset1:

Height | Group
1,556  |  A
2,111  |  B
1,556  |  A
2,341  |  B
1,256  |  A
2,411  |  B

I would like to run the Shapiro-Wilk normality test on Height separately for each level of Group:

myvar <- c("Height")

res<- vector("list", length(myvars))

a <- factor(dataset1$Group)
myfactor <- levels(a)

i=1
for (myfactor in dataset1) {
    res[[i]] <- shapiro.test(dataset1$Size)
    i=i+1
}

res ends up holding n test results, but they all have the same p-value and W. Can anyone help me figure out what's wrong?


It is easier to write new code than to find all the errors in yours.

lapply(split(dataset1$Height,dataset1$Group),shapiro.test)

$`  A`

        Shapiro-Wilk normality test

data:  X[[1L]] 
W = 0.75, p-value = 3.031e-08


$`  B`

       Shapiro-Wilk normality test

data:  X[[2L]] 
W = 0.9134, p-value = 0.4295
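
If you only need the numbers rather than the printed reports, here is a minimal sketch of pulling them out of that list. It assumes the commas in the question's table are decimal marks (so 1,556 means 1.556); the names tests, p_values and w_stats are just illustrative.

```r
# Assumption: "1,556" in the question's table is the decimal number 1.556.
dataset1 <- data.frame(
  Height = c(1.556, 2.111, 1.556, 2.341, 1.256, 2.411),
  Group  = c("A", "B", "A", "B", "A", "B")
)

# One shapiro.test() result per group, in a list named by group.
tests <- lapply(split(dataset1$Height, dataset1$Group), shapiro.test)

# Each element is an "htest" object, so its parts can be read with $.
p_values <- sapply(tests, function(h) h$p.value)
# unname() drops the inner "W" label so the result is named by group only.
w_stats  <- sapply(tests, function(h) unname(h$statistic))

print(p_values)
print(w_stats)
```

Both results are numeric vectors named by group, so p_values["A"] gives the p-value for group A.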


Your code is hosed in all sorts of ways. Here are a few:

  1. You create myfactor outside of the loop, but then you make it the iterator.
  2. dataset1 is your data (data.frame?). I'm not even sure what myfactor will be inside a loop created by for (myfactor in dataset1).
  3. You don't subset the data sent to shapiro.test.
  4. myvars isn't defined and dataset1$Size should probably be dataset1$Height.

Try this instead.

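res <- list()
# factor() makes this work whether Group is a factor or a character column
for (mf in levels(factor(dataset1$Group))) {
    # test only the heights that belong to the current group
    res[[mf]] <- shapiro.test(dataset1$Height[dataset1$Group == mf])
}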
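
For what it's worth, the identical results in the question can be reproduced with a small sketch. It assumes the table's commas are decimal marks and uses Height where the question had Size. A data frame is a list of columns, so for (myfactor in dataset1) runs once per column, and every pass tests the same full vector.

```r
# Assumption: "1,556" in the question's table is the decimal number 1.556.
dataset1 <- data.frame(
  Height = c(1.556, 2.111, 1.556, 2.341, 1.256, 2.411),
  Group  = c("A", "B", "A", "B", "A", "B")
)

res <- list()
i <- 1
for (myfactor in dataset1) {
  # myfactor is a whole column here (first Height, then Group); it is never used
  res[[i]] <- shapiro.test(dataset1$Height)  # same unsubsetted vector every pass
  i <- i + 1
}

# One result per column, and the results are identical to each other.
print(length(res) == ncol(dataset1))
print(identical(res[[1]]$p.value, res[[2]]$p.value))
```

So the number of results matched the number of columns, not the number of groups, and nothing in the loop ever restricted the data to one group.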


Thanks for the reply.
For future reference, if you want to run a normality test by factor for selected variables in a dataset:

variaveis <- colnames(dataset1)[1:2]
# alternative: variaveis <- c("height", "weight")
res <- vector("list", length(variaveis))

for (i in seq_along(variaveis)) {
    # run the Shapiro-Wilk test by factor level for each selected variable
    res[[i]] <- lapply(split(dataset1[, variaveis[i]], dataset1$sex), shapiro.test)
}
res
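
A possible variation on the loop above, as a sketch only: naming the outer list by variable and collecting the p-values into a matrix makes the output easier to scan. The data below are made up for illustration, and p_values is a hypothetical name; the column names follow the code above.

```r
# Made-up illustrative data; only the column names come from the post.
dataset1 <- data.frame(
  height = c(1.62, 1.70, 1.58, 1.66, 1.81, 1.75, 1.79, 1.88),
  weight = c(55.2, 61.0, 52.4, 58.9, 80.1, 72.5, 77.3, 90.6),
  sex    = c("F", "F", "F", "F", "M", "M", "M", "M")
)
variaveis <- c("height", "weight")

# simplify = FALSE keeps a list; sapply() names it after the variables.
res <- sapply(variaveis, function(v) {
  lapply(split(dataset1[[v]], dataset1$sex), shapiro.test)
}, simplify = FALSE)

# p-values as a matrix: one row per group, one column per variable.
p_values <- sapply(res, function(by_group) {
  sapply(by_group, function(h) h$p.value)
})
print(p_values)
```

With this layout a single test is reachable as res$height$F, and p_values["F", "height"] gives its p-value.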

PS: sex here plays the role of Group in the previous example.
Thanks again.
I hope this helps cut down on the amount of code. M.
