Handling NA values in apply and unique

2022-12-20 10:43 问答作者：

I have a 114 row by 16 column data frame where the rows are individuals, and the columns are either their names or NA. For example, the first 3 rows looks like this:

            name name.1      name.2 name.3       name.4 name.5       name.6 name.7       name.8 name.9       name.10 nam开发者_如何学Ce.11       name.12 name.13        name.14 name.15
1           <NA>   <NA>        <NA>   <NA>         <NA>   <NA>         <NA>   <NA>         <NA>   <NA>      Aanestad    <NA>      Aanestad    <NA>       Aanestad    <NA>
2           <NA>   <NA>        <NA>   <NA>         <NA>   <NA>         <NA>   <NA>     Ackerman   <NA>      Ackerman    <NA>      Ackerman    <NA>       Ackerman    <NA>
3           <NA>   <NA>        <NA>   <NA>         <NA>   <NA>      Alarcon   <NA>      Alarcon   <NA>       Alarcon    <NA>       Alarcon    <NA>           <NA>    <NA>

I want to generate a list (if multiple unique names per row) or vector (if only one unique name per row) of all the unique names, with length 114.

When I try apply(x,1,unique) I get a 2xNcol array where sometimes the first row cell is NA and sometimes the second row cell is NA.

    [,1]       [,2]       [,3]      [,4]     [,5]      [,6]      [,7]    [,8]   [,9]    
[1,] NA         NA         NA        NA       "Alquist" NA        "Ayala" NA     NA      
[2,] "Aanestad" "Ackerman" "Alarcon" "Alpert" NA        "Ashburn" NA      "Baca" "Battin"

When what I'd like is just:

Aanestad
Ackerman
Alarcon
...

I can't seem to figure out how to apply unique() while ignoring NA. na.rm, na.omit etc don't seem to work. I feel like I'm missing something real simple ...

Thanks!

unique does not appear to have an na.rm argument, but you can remove the missing values yourself before calling it:

A <- matrix(c(NA,"A","A",
             "B", NA, NA,
              NA, NA, "C"), nr=3, byrow=TRUE)
apply(A, 1, function(x)unique(x[!is.na(x)]))

gives

[1] "A" "B" "C"

You were very, very close in your initial solution. But as Aniko remarked, you have to remove NA values before you can use unique.

An example where we first create a similar data.frame and then use apply() as you did -- but with an additional anonymous function that is used to combine na.omit() and unique():

R> DF <- t(data.frame(foo=sample(c(NA, "Foo"), 5, TRUE), 
                      bar=sample(c(NA, "Bar"), 5, TRUE)))
R> DF
    [,1]  [,2] [,3]  [,4]  [,5] 
foo "Foo" NA   "Foo" "Foo" "Foo"
bar NA    NA   NA    "Bar" "Bar"
R> apply(DF, 1, function(x) unique(na.omit(x)))
  foo   bar 
"Foo" "Bar"

继续阅读：apply r unique

Handling NA values in apply and unique

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？