How to access elements in a complex list?

I have a nice list, which looks like this:

tmp = NULL
t = NULL
tmp$resultitem$count = "1057230"
tmp$resultitem$status = "Ok"
tmp$resultitem$menu = "PubMed"
tmp$resultitem$dbname = "pubmed"
t$resultitem$count = "305215"
t$resultitem$status = "Ok"
t$resultitem$menu = "PMC"
t$resultitem$dbname = "pmc"
tmp = c(tmp, t)
t = NULL
t$resultitem$count = "1"
t$resultitem$status = "Ok"
t$resultitem$menu = "Journals"
t$resultitem$dbname = "journals"
tmp = c(tmp, t)

Which produces:

> str(tmp)
List of 3
 $ resultitem:List of 4
  ..$ count : chr "1057230"
  ..$ status: chr "Ok"
  ..$ menu  : chr "PubMed"
  ..$ dbname: chr "pubmed"
 $ resultitem:List of 4
  ..$ count : chr "305215"
  ..$ status: chr "Ok"
  ..$ menu  : chr "PMC"
  ..$ dbname: chr "pmc"
 $ resultitem:List of 4
  ..$ count : chr "1"
  ..$ status: chr "Ok"
  ..$ menu  : chr "Journals"
  ..$ dbname: chr "journals"

Now I want to search through the elements of each resultitem. For example, I want to know the dbname of every database whose count is less than 10. In this case it is very easy, as this list only has 3 elements, but the real list is longer.

This could simply be done with a for loop. But is there a way to do it with some other R function (like rapply)? My problem with the apply functions is that they only look at one element at a time.

If I use grep to get all the dbname elements, I cannot get the count that belongs to each one.

rapply(tmp, function(x) paste("Content: ", x))[grep("dbname", names(rapply(tmp, c)))]

Does someone have a better idea than a for loop?


R generally wants to handle these things as data.frames, so I think your best bet is to turn your list into one (or even make a data.frame instead of a list to begin with, unless you need it to be in list form).

x <- do.call(rbind,tmp)
dat <- data.frame(x)
dat$count <- as.numeric(dat$count)

> dat
    count status     menu   dbname
1 1057230     Ok   PubMed   pubmed
2  305215     Ok      PMC      pmc
3       1     Ok Journals journals

and then to get your answer(s) you can use normal data.frame subsetting operations:

> dat$dbname[dat$count<10]
$resultitem
[1] "journals"
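
Note that do.call(rbind, tmp) on a list of lists produces list columns, which is why the result above is printed with a $resultitem label. As a minimal sketch of one alternative (not the only way to do it), each item can be converted to a one-row data.frame first, so the columns come out as plain vectors. The name dat2 is just an illustrative choice, and the list is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example list from the question so the snippet is self-contained
tmp <- list(
  resultitem = list(count = "1057230", status = "Ok", menu = "PubMed",   dbname = "pubmed"),
  resultitem = list(count = "305215",  status = "Ok", menu = "PMC",      dbname = "pmc"),
  resultitem = list(count = "1",       status = "Ok", menu = "Journals", dbname = "journals")
)

# Turn every item into a one-row data.frame, then stack the rows.
# unname() drops the repeated "resultitem" names before they reach rbind.
rows <- lapply(unname(tmp), as.data.frame, stringsAsFactors = FALSE)
dat2 <- do.call(rbind, rows)

# count was stored as text, so convert it before comparing
dat2$count <- as.numeric(dat2$count)

# Plain character vector of database names with fewer than 10 hits
small <- dat2$dbname[dat2$count < 10]
print(small)
```

With atomic columns the subset is an ordinary character vector, which is usually easier to pass on to other functions.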


If you're absolutely insistent that you must do this in a list, the following will work for the present case.

# count is stored as a string, so convert it before comparing
x <- tmp[sapply(tmp, function(x) as.numeric(x$count) < 10)]
str(x)
(the list items you wanted)

More generally, if you would like to actually use ragged lists in this way, you could use the same code but check for the presence of the item first:

testForCount <- function(x) {if ('count' %in% names(x)) as.numeric(x$count) < 10 else FALSE}
tmp[sapply(tmp, testForCount)]

This will work for your cases where the lists are not the same length as well as the present case. (I still think you should be using data frames for both speed and sensible representation of the data).
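
If only the dbname values are needed rather than the whole matching items, the list-based approach can be sketched with Filter and vapply. This is one possible formulation, not the answerer's own code; the helper name has_small_count and the fourth, count-less entry are made up here to illustrate the ragged case.

```r
# Example list from the question, plus one hypothetical entry without a count
tmp <- list(
  resultitem = list(count = "1057230", status = "Ok", menu = "PubMed",   dbname = "pubmed"),
  resultitem = list(count = "305215",  status = "Ok", menu = "PMC",      dbname = "pmc"),
  resultitem = list(count = "1",       status = "Ok", menu = "Journals", dbname = "journals"),
  resultitem = list(status = "Error", dbname = "broken")  # hypothetical ragged item
)

# TRUE only when the item has a count and that count is below 10.
# [[ ]] is used instead of $ to avoid partial name matching.
has_small_count <- function(x) {
  !is.null(x[["count"]]) && as.numeric(x[["count"]]) < 10
}

# Keep the matching items, then pull one dbname string out of each
hits <- Filter(has_small_count, tmp)
small_dbs <- vapply(hits, function(x) x[["dbname"]], character(1), USE.NAMES = FALSE)
print(small_dbs)
```

vapply is used rather than sapply so that the result is guaranteed to be a character vector, even when nothing matches.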


It looks like your list comes from an XML structure. It is easier to navigate to what you want with XPath, using the NodeSet structure and the getNodeSet function in the XML package.
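
As a rough sketch of the XPath route, assuming the XML package is installed: the original XML was not shown in the question, so the document below is a hypothetical stand-in whose element names simply mirror the list names. Here xpathSApply is used as a shortcut for calling getNodeSet and then applying xmlValue to each node.

```r
library(XML)

# Hypothetical XML; the real document's element names and nesting may differ
xml_text <- "<result>
  <resultitem><count>1057230</count><status>Ok</status><menu>PubMed</menu><dbname>pubmed</dbname></resultitem>
  <resultitem><count>305215</count><status>Ok</status><menu>PMC</menu><dbname>pmc</dbname></resultitem>
  <resultitem><count>1</count><status>Ok</status><menu>Journals</menu><dbname>journals</dbname></resultitem>
</result>"

doc <- xmlParse(xml_text, asText = TRUE)

# Select the dbname child of every resultitem whose count child is below 10.
# Inside the predicate, count is the child element, converted with number().
small_dbs <- xpathSApply(doc, "//resultitem[number(count) < 10]/dbname", xmlValue)
print(small_dbs)
```

The filtering happens inside the XPath expression, so no intermediate list is built at all.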
