R dataframe filtering
I have a dataframe df as follows:
A B C
NA 1 2
2 NA 3
4 5 6
7 8 9
What I want to do is remove all the rows that contain an NA.
if I use
apply(df,1,function(row) all(!is.na(row)))
I get a logical vector with TRUE for each row that does not contain an NA and FALSE for each row that does. But how do I get the row names, so that I can create something like
df2<-df[-c(list of rows that contains NA),]
which will give me a new dataframe without the rows that contain NA.
Thanks in advance.
Assuming you have a dataframe that looks like this:
A B C
1 NA 1 2
2 2 NA 3
3 4 5 6
4 7 8 9
Then try:
df1[apply(df1,1,function(x) !any(is.na(x))), ]
A B C
3 4 5 6
4 7 8 9
It doesn't use rownames but rather a logical vector. I guess Joshua and I read your question differently, but we used the same method.
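If you really do want the row positions, as the question asks, which() will turn the logical vector into indices. Here is a minimal sketch, assuming a data frame built to mirror the one in the question; the names has_na, na_rows, by_logical and by_position are only for illustration:

```r
# Illustrative data frame mirroring the one in the question.
df1 <- data.frame(A = c(NA, 2, 4, 7),
                  B = c(1, NA, 5, 8),
                  C = c(2, 3, 6, 9))

# TRUE for every row that contains at least one NA.
has_na <- apply(df1, 1, function(x) any(is.na(x)))

# Logical indexing: keep the rows where has_na is FALSE.
by_logical <- df1[!has_na, ]

# Row positions of the NA rows, if you want them explicitly.
na_rows <- which(has_na)

# Negative indexing gives the same rows here, but only because
# na_rows is non-empty: df1[-integer(0), ] selects zero rows,
# not all of them.
by_position <- df1[-na_rows, ]
```

The logical form is probably the safer habit, since it still behaves when no row contains an NA.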
Joshua's suggestion is more compact:
> na.omit(df1)
A B C
3 4 5 6
4 7 8 9
And it reminds me that I should have used:
> df1[complete.cases(df1), ]
A B C
3 4 5 6
4 7 8 9
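The two are not quite interchangeable, though. As far as I know, na.omit() records the rows it dropped in an "na.action" attribute, while complete.cases() just returns a logical vector, which makes it easy to check only some of the columns. A small sketch, again assuming a data frame like the one in the question (the variable names are made up):

```r
# Illustrative data frame mirroring the one in the question.
df1 <- data.frame(A = c(NA, 2, 4, 7),
                  B = c(1, NA, 5, 8),
                  C = c(2, 3, 6, 9))

# Both drop every row that contains an NA.
via_omit  <- na.omit(df1)
via_cases <- df1[complete.cases(df1), ]

# na.omit() keeps a record of the rows it removed.
dropped <- attr(via_omit, "na.action")

# complete.cases() can be limited to a subset of columns,
# e.g. ignore NAs in column A and check only B and C.
bc_only <- df1[complete.cases(df1[, c("B", "C")]), ]
```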
You can use the logical vector from your apply call to index your data.frame.
> Data[!apply(Data,1,function(row) all(!is.na(row))),]
A B C
1 NA 1 2
2 2 NA 3
> # or like this:
> Data[apply(Data,1,function(row) any(is.na(row))),]
A B C
1 NA 1 2
2 2 NA 3
is.na on a data.frame returns a matrix, which is a better candidate for apply:
df <- read.table(textConnection(" A B C
NA 1 2
2 NA 3
4 5 6
7 8 9
"), header = TRUE)
## a logical matrix
is.na(df)
## logical vector flagging rows that contain at least one NA
apply(is.na(df), 1, any)
## one liner: keep only the rows without any NA
df[!apply(is.na(df), 1, any), ]
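Since is.na(df) is already a logical matrix, the per-row check can also be done without apply at all. A minimal sketch using rowSums, assuming the goal is the one from the question (drop every row with at least one NA); na_counts and df_clean are just illustrative names:

```r
# Same data as above, read with an explicit header.
df <- read.table(text = "A B C
NA 1 2
2 NA 3
4 5 6
7 8 9", header = TRUE)

# Number of NAs in each row, computed on the logical matrix.
na_counts <- rowSums(is.na(df))

# Keep only the rows with zero NAs.
df_clean <- df[na_counts == 0, ]
```

Changing the condition to na_counts < ncol(df) would instead drop only the rows that are entirely NA.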