Counting non NAs in a data frame; getting answer as a vector
Say I have the following R data.frame ZZZ:
( ZZZ <- structure(list(n = c(1, 2, NA), m = c(6, NA, NA), o = c(7, 8,
8)), .Names = c("n", "m", "o"), row.names = c(NA, -3L), class = "data.frame") )
## not run
n m o
1 1 6 7
2 2 NA 8
3 NA NA 8
I want to know, in the form of a vector, how many non-NAs I've got. I want the answer available to me as:
2, 1, 3
When I use the command length(ZZZ), I get 3, which of course is the number of vectors in the data.frame, a valuable enough piece of information.
I have other functions that operate on this data.frame and give me answers in the form of vectors, but, dang-it, length doesn't operate like that.
colSums(!is.na(x))
Vectorisation ftw.
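Applied to the data frame from the question, this might look like the sketch below. The variable name counts is only illustrative, and unname() is used on the assumption that the bare values 2, 1, 3 are wanted without the column names attached.

```r
ZZZ <- data.frame(n = c(1, 2, NA), m = c(6, NA, NA), o = c(7, 8, 8))

# is.na() returns a logical matrix; negating it marks the non-missing cells,
# and colSums() adds up the TRUE values (counted as 1) in each column.
counts <- colSums(!is.na(ZZZ))   # named vector: n = 2, m = 1, o = 3

# Drop the column names if only the bare values are needed.
unname(counts)
```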
Try this:
# define "demo" dataset
ZZZ <- data.frame(n=c(1,2,NA),m=c(6,NA,NA),o=c(7,8,8))
# apply the counting function to each column
apply(ZZZ, 2, function(x) length(which(!is.na(x))))
Having run:
> apply(ZZZ, 2, function(x) length(which(!is.na(x))))
n m o
2 1 3
The result above is already a vector, just a named one. If you really insist on dropping the names, you might use as.vector, e.g. by defining this function:
nonNAs <- function(x) {
  as.vector(apply(x, 2, function(x) length(which(!is.na(x)))))
}
You could simply run nonNAs(ZZZ):
> nonNAs(ZZZ)
[1] 2 1 3
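As a possible alternative, sapply() walks over the columns of the data frame directly, whereas apply() first converts the data frame to a matrix. A minimal sketch, where nonNAs2 is a hypothetical name chosen for this example:

```r
ZZZ <- data.frame(n = c(1, 2, NA), m = c(6, NA, NA), o = c(7, 8, 8))

# Hypothetical helper: count the non-NA entries of each column.
# sum() on a logical vector counts its TRUE values.
nonNAs2 <- function(df) {
  unname(sapply(df, function(col) sum(!is.na(col))))
}

nonNAs2(ZZZ)
```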
To get the total number of missing values, use sum(is.na(x)); for column-wise counts, use colSums(is.na(x)), where x is the variable that contains the dataset.
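With the data frame from the question standing in for x, those two calls could be tried like this (a small sketch, not the only way to do it):

```r
ZZZ <- data.frame(n = c(1, 2, NA), m = c(6, NA, NA), o = c(7, 8, 8))

sum(is.na(ZZZ))       # total number of missing values in the data frame
colSums(is.na(ZZZ))   # number of missing values in each column
```

Note that these count the NAs themselves; negate with !is.na() to count the non-missing values instead.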
If you only want the sum total of non-NAs overall, then sum() with !is.na() will do it:
ZZZ <- data.frame(n = c(1, 2, NA), m = c(6, NA, NA), o = c(7, 8, 8))
sum(!is.na(ZZZ))