How to take the union of elements in a nested list in R

2023-02-28 06:27

I have a nested list, say lst (all the elements are of class int). I don't know the length of lst in advance; however, I do know that each element of lst is a list of length, say, k:

length(lst[[i]]) # this equals k and is known in advance, 
                 # this is true for i = 1 ... length(lst)

How do I take the union of the 1st elements, the 2nd elements, ..., the kth elements across all the elements of lst?

Specifically, if the length of lst is n, I want (not R code):

# I know that union can only be taken for 2 elements, 
# the following is for illustration purposes
listUnion1 <- union(lst[[1, 1]], lst[[2, 1]], ..., lst[[n, 1]])
listUnion2 <- union(lst[[1, 2]], lst[[2, 2]], ..., lst[[n, 2]])
.
.
.
listUnionk <- union(lst[[1, k]], lst[[2, k]], ..., lst[[n, k]])

Any help or pointers are greatly appreciated.

Here is a dataset that can be used, with n = 3 and k = 2:

list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")), 
    structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")), 
    structure(list(a = 12, b = 12), .Names = c("a", "b")))

Here is a general solution, similar in spirit to that of @Ramnath, but avoiding the use of union(), which is a binary function. The trick is to note that union() is essentially implemented as:

unique(c(as.vector(x), as.vector(y)))

and the bit inside unique() can be achieved by unlisting the nth component of each list.
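
As a quick check of that observation, here is a minimal sketch. The vectors x, y and z are made up for illustration and are not part of the question's data. It shows that unique(c(...)) agrees with union() for two vectors and extends directly to three:

```r
# Illustrative vectors (assumptions, not from the question)
x <- 1:5
y <- 4:8
z <- c(8L, 9L)

# For two vectors, union() and unique(c(...)) give the same result
two_way <- union(x, y)
two_way_manual <- unique(c(x, y))
identical(two_way, two_way_manual)

# unique(c(...)) accepts any number of vectors, unlike the binary union()
three_way <- unique(c(x, y, z))
three_way
```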

The full solution then is:

unionFun <- function(n, obj) {
    unique(unlist(lapply(obj, `[[`, n)))
}
lapply(seq_along(lst[[1]]), FUN = unionFun, obj = lst)

which gives:

[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12

[[2]]
 [1]  6  7  8  9 10 11  1  2  3  4  5 12

on the data you showed.

A couple of useful features of this are:

We use `[[` to subset obj in unionFun. This is similar to function(x) x$a in @Ramnath's answer, except that `[[` saves us from writing an anonymous function. The equivalent to @Ramnath's answer is: lapply(lst, `[[`, 1)
To generalise the above, we replace the 1 with n in unionFun(), and allow our list to be passed in as argument obj.

Now that we have a function that will provide the union of the nth elements of a given list, we can lapply() over the indices 1, ..., k, applying unionFun() to each position in turn. This relies on the fact that length(lst[[1]]) is the same as length(lst[[i]]) for all i.

If it helps to have the names of the nth elements in the returned object, we can do:

> unions <- lapply(seq_along(lst[[1]]), FUN = unionFun, obj = lst)
> names(unions) <- names(lst[[1]])
> unions
$a
 [1]  1  2  3  4  5  6  7  8  9 10 11 12

$b
 [1]  6  7  8  9 10 11  1  2  3  4  5 12
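
If you want the equal-length assumption checked rather than taken on trust, the steps above could be wrapped up roughly as follows. This is only a sketch: union_by_position is a hypothetical helper name, not something from base R or from the answer above.

```r
# Hypothetical helper (name is an assumption): union by position, with a check
union_by_position <- function(obj) {
  lens <- lengths(obj)
  # every sublist must have the same length k
  stopifnot(length(obj) > 0, all(lens == lens[1]))
  out <- lapply(seq_along(obj[[1]]), function(n) {
    unique(unlist(lapply(obj, `[[`, n)))
  })
  # carry over the names of the first sublist, if it has any
  names(out) <- names(obj[[1]])
  out
}

lst <- list(list(a = 1:5, b = 6:11),
            list(a = 6:11, b = 1:5),
            list(a = 12, b = 12))
res <- union_by_position(lst)
res
```

Calling it on a list whose sublists differ in length stops with an error instead of silently indexing past the end of a shorter sublist.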

Here is one solution

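# generate dummy data
x1 = sample(letters[1:5], 20, replace = TRUE)
x2 = sample(letters[1:5], 20, replace = TRUE)
df = data.frame(x1, x2, stringsAsFactors = FALSE)

# find unique elements in each column
# (lapply always returns a list; apply(df, 2, unique) simplifies to a
# matrix whenever the columns have the same number of unique values)
union_df = lapply(df, unique)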

Let me know if this works

EDIT: Here is a solution for lists using the data you provided

mylist = list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")), 
              structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")), 
              structure(list(a = 12, b = 12), .Names = c("a", "b")))
list_a = lapply(mylist, function(x) x$a)
list_b = lapply(mylist, function(x) x$b)

union_a = Reduce(union, list_a)
union_b = Reduce(union, list_b)

If you have more than 2 elements in your list, we could generalize this code.
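
One way that generalization might look is sketched below. It assumes every sublist carries the same names as the first one; element_names and unions are just illustrative variable names.

```r
mylist <- list(list(a = 1:5, b = 6:11),
               list(a = 6:11, b = 1:5),
               list(a = 12, b = 12))

# Assumption: all sublists share the names of the first sublist
element_names <- names(mylist[[1]])

# For each name, pull that element out of every sublist and fold union() over them
unions <- lapply(element_names, function(nm) {
  Reduce(union, lapply(mylist, `[[`, nm))
})
names(unions) <- element_names
unions
```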

Here's another way: use do.call/rbind to line up the lists by name into a matrix of mode list (one row per sublist, one column per name), then apply unique/do.call to each column of that matrix. (I modified your data slightly so the 'a' and 'b' unions are of different lengths, to make sure it works correctly.)

lst <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")), 
    structure(list(a = 6:10, b = 1:5), .Names = c("a", "b")), 
    structure(list(a = 12, b = 12), .Names = c("a", "b")))

> apply(do.call(rbind, lst),2, function( x ) unique( do.call( c, x)))
$a
 [1]  1  2  3  4  5  6  7  8  9 10 12

$b
 [1]  6  7  8  9 10 11  1  2  3  4  5 12
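
To see what the rbind step actually builds, here is a small sketch on the same modified data. The final step uses sapply(..., simplify = FALSE) over the column names instead of apply(), which is a swapped-in variant rather than the call above: apply() returns a matrix instead of a list whenever all the unions happen to have the same length.

```r
lst <- list(list(a = 1:5, b = 6:11),
            list(a = 6:10, b = 1:5),
            list(a = 12, b = 12))

m <- do.call(rbind, lst)  # 3 x 2 matrix of mode "list"
dim(m)
colnames(m)

# Iterate over the column names so the result is always a named list
unions <- sapply(colnames(m), function(nm) unique(unlist(m[, nm])),
                 simplify = FALSE)
unions
```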

Your data

df <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")), 
           structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")), 
           structure(list(a = 12, b = 12), .Names = c("a", "b")))

This gives you the unique values within each nested list:

library(plyr)
df.l <- llply(df, function(x) unique(unlist(x)))  # flatten first, then drop duplicate values

R> df.l
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11

[[2]]
 [1]  6  7  8  9 10 11  1  2  3  4  5

[[3]]
[1] 12

EDIT

Thanks to Ramnath I changed the code a bit, and I hope this answer now fits the needs of your question. For illustration I have kept the previous answer as well. The slightly changed data now has an additional element, c, in the third list.

df <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")), 
           structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")), 
           structure(list(a = 12, b = 12, c = 10:14), .Names = c("a", "b", "c")))


f.x <- function(x.list) {
  x.names <- names(x.list)
  i <- combn(x.names, 2)
  l <- apply(i, 2, function(y) x.list[y])
  llply(l, unlist)
}

Now you can apply the function to your data.

all.l <- llply(df, f.x)
llply(all.l, function(x) llply(x, unique))

R> [[1]]
[[1]][[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11


[[2]]
[[2]][[1]]
 [1]  6  7  8  9 10 11  1  2  3  4  5


[[3]]
[[3]][[1]]
[1] 12

[[3]][[2]]
[1] 12 10 11 13 14

[[3]][[3]]
[1] 12 10 11 13 14

However, the nested structure is not very user-friendly. That could be changed a bit...

According to the documentation, unlist() is recursive, so regardless of the nesting level of the lists supplied you can get all the elements by passing them to unlist(). You can get the union within each sublist as follows.

lst <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")), 
structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")), 
structure(list(a = 12, b = 12), .Names = c("a", "b")))

lapply(lst, function(sublst) unique(unlist(sublst)))

[[1]]
[1]  1  2  3  4  5  6  7  8  9 10 11

[[2]]
[1]  6  7  8  9 10 11  1  2  3  4  5

[[3]]
[1] 12
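
Note that the call above combines the elements within each sublist. If what you want is the union across sublists by position, as in the question, one possible base R sketch uses do.call() with Map(). This is an alternative idiom, not part of the answer above.

```r
lst <- list(list(a = 1:5, b = 6:11),
            list(a = 6:11, b = 1:5),
            list(a = 12, b = 12))

# Map() walks the sublists in parallel: the first call receives every "a",
# the second every "b"; unique(c(...)) then forms the union of each group.
unions <- do.call(Map, c(list(f = function(...) unique(c(...))), lst))
unions
```

The result takes its names from the first sublist.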
