Two data formatting questions for R

I have two questions about R, both of which I believe are fairly simple.

I would like to create an IF statement that assigns NA to certain rows in a column. I have tried the following command:

a[a[,21]==0,5:10] <-NA

The error says:

Error in `[<-.data.frame`(`*tmp*`, a[, 21] == 0, 5:20, value = NA) : missing values are not allowed in subscripted assignments of data frames

Essentially, that code is supposed to find every row with a 0 in column 21 and replace that row's values in columns 5 to 10 with NA. There are already NAs in column 21, but I am not sure whether that matters.

For the second question, I am not sure how to craft the function at all. I need to manipulate data that contains positive and negative controls. I don't want the control values to be a part of the manipulation, but I want them to remain in the columns because I have to use them later. Is there any way to temporarily ignore these values so they aren't included in the manipulation?

Here is some sample data:

L = c(2,1,4,3,1,4,2,4,5,1) 
R = c(2,4,5,1,"Neg",2,"",1,2,1) 
T = c(2,1,4,2,"CTRL",2,"PCTRL",2,1,4) 
test <- data.frame(L=L,R=R,T=T)

I would like to temporarily ignore these rows based on the strings "Neg"/"CTRL" and ""/"PCTRL" rather than on their position in the data frame, if possible. Notice that for the negative control, "Neg" and "CTRL" are in separate columns of the same row, and likewise for the positive control, where a blank and "PCTRL" are in separate columns of the same row. Is there any way to do this given these odd conditions?

Hope this was written clearly enough, and I thank anyone in advance for taking the time to help me!


Try this for subsetting your dataframe to those rows where R is not "Neg":

subset(test, R!="Neg")

For the NA problem, you probably already have NAs in your data frame, right? See if this works:

a[a[,21] %in% 0, 5:10] <- NA


Try instead:

a[ which(a[,21]==0), 5:10] <-NA

Explanation: the == comparison returns NA wherever column 21 is NA, and the [<- method for data frames does not accept NA in a subscript. The which function returns the integer positions of the TRUE elements only, so the NAs are thrown away. As an aside, the [ function (without the '<-') returns a row filled with NAs for every NA in a logical index. This is considered a 'feature', but I find it to be an 'annoyance', so I typically use which for selection as well as for selective assignment.
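The behaviour described above can be seen on a small toy data frame. This is only a sketch: the data frame and the column names x, y, and flag are made up for illustration, with flag standing in for column 21 and x, y standing in for columns 5 to 10.

```r
# Hypothetical stand-in for `a`: `flag` plays the role of column 21,
# `x` and `y` play the role of columns 5:10.
a <- data.frame(x = 1:4, y = 5:8, flag = c(0, NA, 2, 0))

# The comparison propagates the NA in `flag` into the logical index.
idx <- a[, "flag"] == 0

# which() keeps only the positions where idx is TRUE, dropping NA and FALSE.
rows <- which(idx)

# An integer index without NAs is accepted by the data frame assignment.
a[rows, c("x", "y")] <- NA
print(a)
```

Only the rows whose flag is exactly 0 are blanked; the row whose flag is NA is left untouched.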


For the first problem: if a[,21] is NA, do you also want to assign NA to columns 5 to 10? In that case,

a[replace(a[,21],is.na(a[,21]),0)==0,5:10] <- NA

Otherwise, replace the NAs with something nonzero instead of 0 ("1" is used here, but any nonzero value works):

a[replace(a[,21],is.na(a[,21]),1)==0,5:10] <- NA
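To make the difference between the two variants concrete, here is a small sketch on a made-up vector (flag is a hypothetical stand-in for a[,21]). It only builds the two logical indexes so they can be compared side by side.

```r
# Hypothetical stand-in for a[, 21]
flag <- c(0, NA, 2, 0)

# NAs are turned into 0 first, so NA rows are selected along with the zeros.
zero_or_na <- replace(flag, is.na(flag), 0) == 0

# NAs are turned into a nonzero value first, so only true zeros are selected.
zero_only <- replace(flag, is.na(flag), 1) == 0

print(zero_or_na)
print(zero_only)
```

Neither index contains NA, so either can be used directly as the row subscript in the assignment. On this vector the second variant selects the same rows as flag %in% 0.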

As for the second problem,

subset(test, !(R %in% c("Neg","") | T %in% c("CTRL","PCTRL")))

This covers the case where the filtering conditions in R and T do not always coincide. If they always coincide, you can apply the test to just one of R or T. Also keep in mind that T is a built-in alias for TRUE in S, S-PLUS, and R; you can assign another value to T and things will still work, but I believe it is generally discouraged (the same goes for c, which people also like to assign to).
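Since subset returns a copy without the control rows, one possible way to keep the controls in place is to store the condition as a logical mask and apply the manipulation only where the mask is FALSE. The sketch below assumes the sample data from the question; the name is_control and the multiplication by 10 are hypothetical placeholders for whatever manipulation is actually needed.

```r
test <- data.frame(
  L = c(2, 1, 4, 3, 1, 4, 2, 4, 5, 1),
  R = c(2, 4, 5, 1, "Neg", 2, "", 1, 2, 1),
  T = c(2, 1, 4, 2, "CTRL", 2, "PCTRL", 2, 1, 4),
  stringsAsFactors = FALSE
)

# TRUE for rows identified as controls by their labels, not by position.
is_control <- test$R %in% c("Neg", "") | test$T %in% c("CTRL", "PCTRL")

# Placeholder manipulation, applied to the sample rows only.
test$L[!is_control] <- test$L[!is_control] * 10

# R is a character column because of the labels, so convert the
# sample rows before doing arithmetic on them.
sample_R <- as.numeric(test$R[!is_control])

print(test)
```

The control rows stay in the data frame with their original values, and the same mask can be reused later to pull them out with test[is_control, ].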
