Ignore NA's in sapply function

I am using R and have searched for an answer. I have seen similar questions, but their solutions have not worked for my specific problem.

In my data set I am using NAs as placeholders, because I am going to return to them once part of my analysis is done. I would therefore like to be able to do all my calculations as if the NAs weren't there.

Here's my issue, with an example data table:

ROCA = c(1,3,6,2,1,NA,2,NA,1,NA,4,NA)
ROCA <- data.frame (ROCA=ROCA)       # converting it just because that is the format of my original data

#Now my function
exceedes <- function (L=NULL, R=NULL, na.rm = T)
 {
    if (is.null(L) | is.null(R)) {
        print ("mycols: invalid L,R.")
        return (NULL)               
    }
    test <-(mean(L, na.rm=TRUE)-R*sd(L,na.rm=TRUE))
  test1 <- sapply(L,function(x) if((x)> test){1} else {0})
  return (test1)
}
L=ROCA[,1]
R=.5
ROCA$newcolumn <- exceedes(L,R)
names(ROCA)[names(ROCA)=="newcolumn"]="Exceedes1"

I am getting the error:

Error in if ((x) > test) { : missing value where TRUE/FALSE needed 

The problem seems to be in the sapply call. Any ideas on how to ignore those NAs? I would try na.omit if I could reinsert all the NAs exactly where they were before, but I am not sure how to do that.


There's no need for sapply and your anonymous function because > is already vectorized.

It also seems odd to specify default argument values that are invalid. My guess is that you're using them as a kludge instead of the missing function. It's also better practice to throw an error than to return NULL, because otherwise the caller still has to check whether the function returned NULL.

exceedes <- function (L, R, na.rm=TRUE)
{
  if(missing(L) || missing(R)) {
    stop("L and R must be provided")
  }
  test <- mean(L, na.rm=na.rm) - R*sd(L, na.rm=na.rm)  # honor the na.rm argument
  as.numeric(L > test)
}

ROCA <- data.frame(ROCA=c(1,3,6,2,1,NA,2,NA,1,NA,4,NA))
ROCA$Exceedes1 <- exceedes(ROCA[,1],0.5)
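
To illustrate why the vectorized comparison sidesteps the error, here is a minimal sketch using the question's data. The names x, threshold, single and flags are only for illustration. A comparison against NA yields NA, which if() rejects, whereas the vectorized > simply carries the NA through in place.

```r
# Same values as the ROCA column in the question
x <- c(1, 3, 6, 2, 1, NA, 2, NA, 1, NA, 4, NA)

# Threshold computed on the non-missing values only
threshold <- mean(x, na.rm = TRUE) - 0.5 * sd(x, na.rm = TRUE)

# A single comparison against NA gives NA, not TRUE or FALSE;
# passing it to if() raises "missing value where TRUE/FALSE needed"
single <- NA > threshold

# The vectorized comparison keeps each NA in its original position
flags <- as.numeric(x > threshold)
print(flags)
```

The result has the same length as the input, so it can be assigned straight back as a data frame column.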


This statement fails because if() needs a single TRUE or FALSE, and the comparison returns NA whenever x is NA:

test1 <- sapply(L,function(x) if((x)> test){1} else {0})

Try:

test1 <- ifelse(is.na(L), NA, ifelse(L > test, 1, 0))
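
As a small sketch of how ifelse treats missing values (the names x, threshold, nested and simple are illustrative, not from the question): ifelse already returns NA wherever its test is NA, so the outer is.na() check is arguably a readability choice rather than a necessity.

```r
# Same values as the ROCA column in the question
x <- c(1, 3, 6, 2, 1, NA, 2, NA, 1, NA, 4, NA)
threshold <- mean(x, na.rm = TRUE) - 0.5 * sd(x, na.rm = TRUE)

# Explicit version: handle NA first, then compare
nested <- ifelse(is.na(x), NA, ifelse(x > threshold, 1, 0))

# Shorter version: an NA in the test propagates to the result
simple <- ifelse(x > threshold, 1, 0)

# Both versions should agree element by element
print(all.equal(nested, simple))
```

The nested form makes the intent explicit to a reader; the shorter form relies on NA propagation.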


Do you want NAs in the result? That is, do you want the rows to line up?

If so, just returning L > test should work. Adding the column can be simplified too (I suspect "Exceedes1" is stored in a variable somewhere).

exceedes <- function (L=NULL, R=NULL, na.rm = T)
 {
    if (is.null(L) | is.null(R)) {
        print ("mycols: invalid L,R.")
        return (NULL)               
    }
    test <-(mean(L, na.rm=TRUE)-R*sd(L,na.rm=TRUE))

    L > test
}
L=ROCA[,1]
R=.5
ROCA[["Exceedes1"]] <- exceedes(L,R)
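
A rough sketch of working with the logical column this produces, assuming the question's data. The names threshold, n_exceeding and Exceedes1_num are hypothetical and only for illustration. The NAs stay in their original rows, and they can be skipped when counting.

```r
# Rebuild the example data frame from the question
ROCA <- data.frame(ROCA = c(1, 3, 6, 2, 1, NA, 2, NA, 1, NA, 4, NA))
threshold <- mean(ROCA$ROCA, na.rm = TRUE) - 0.5 * sd(ROCA$ROCA, na.rm = TRUE)

# Logical column: TRUE, FALSE, or NA where the input was NA
ROCA[["Exceedes1"]] <- ROCA$ROCA > threshold

# Count the rows above the threshold, ignoring the placeholders
n_exceeding <- sum(ROCA$Exceedes1, na.rm = TRUE)

# If 1/0 is preferred over TRUE/FALSE, convert; the NAs are preserved
ROCA[["Exceedes1_num"]] <- as.integer(ROCA$Exceedes1)
print(ROCA)
```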