R table with country names in cells

My aim is to create a table that summarizes the countries featured in my sample. This table should have only two rows: a first row with one column per region, and a second row with the names of the countries located in each region.

To give you an example, this is what my data.frame XYZ looks like:

   wvs5red2.s003names wvs5red2.regiondummies
21        "Hong Kong"           Asian Tigers
45      "South Korea"           Asian Tigers
49           "Taiwan"           Asian Tigers
66            "China"    East Asia & Pacific
80        "Indonesia"    East Asia & Pacific
86         "Malaysia"    East Asia & Pacific

My aim is to obtain a table that looks similar to this:

region       Asian Tigers                      East Asia & Pacific
countries    Hong Kong, South Korea, Taiwan    China, Indonesia, etc.

Do you have any idea how to obtain such a table? I have spent hours searching for something similar.


The simplest way is tapply:

XYZ <- structure(list(
    names = structure(c(2L, 5L, 6L, 1L, 3L, 4L), .Label = c("China", "Hong Kong", "Indonesia", "Malaysia", "South Korea", "Taiwan"), class = "factor"),
    region = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Asian Tigers", "East Asia & Pacific"), class = "factor")),
    .Names = c("names", "region"), row.names = c(NA, -6L), class = "data.frame")

tapply(XYZ$names, XYZ$region, paste, collapse=", ")
#                     Asian Tigers              East Asia & Pacific 
# "Hong Kong, South Korea, Taiwan"     "China, Indonesia, Malaysia" 
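
To get from that named result to the one-row layout asked for in the question, one option is to wrap it in a matrix. This is only a sketch: the data frame is rebuilt here with plain character columns so the block runs on its own, and the names res and tab are my own choices, not something from the question.

```r
# Same data as above, rebuilt with plain character columns
XYZ <- data.frame(
    names  = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
    region = rep(c("Asian Tigers", "East Asia & Pacific"), each = 3)
)

# One comma-separated string per region
res <- tapply(XYZ$names, XYZ$region, paste, collapse = ", ")

# Regions become the column headers; the country strings fill a single row
tab <- matrix(res, nrow = 1, dimnames = list("countries", names(res)))
tab
```

A matrix is used rather than a data.frame so the region names are kept as column headers without any renaming.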


Recreate the data:

dat <- data.frame(
    country = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
    region = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3))
)
dat

      country              region
1   Hong Kong        Asian Tigers
2 South Korea        Asian Tigers
3      Taiwan        Asian Tigers
4       China East Asia & Pacific
5   Indonesia East Asia & Pacific
6    Malaysia East Asia & Pacific

Use ddply in package plyr combined with paste to summarise the data:

library(plyr)
ddply(dat, .(region), function(x)paste(x$country, collapse= ","))

               region                           V1
1        Asian Tigers Hong Kong,South Korea,Taiwan
2 East Asia & Pacific     China,Indonesia,Malaysia
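
If you would rather not depend on plyr, base R's aggregate should give a comparable summary. This is a sketch of an alternative, not the approach used above; the result name out is hypothetical, and the collapsed strings end up in a column called country rather than V1.

```r
# Same example data as above
dat <- data.frame(
    country = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
    region  = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3))
)

# Group by region and paste the country names of each group into one string;
# the extra argument collapse is passed on to paste
out <- aggregate(country ~ region, data = dat, FUN = paste, collapse = ", ")
out
```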


First create data:

> country<-c("Hong Kong","Taiwan","China","Indonesia")
> region<-rep(c("Asian Tigers","East Asia & Pacific"),each=2)
> df<-data.frame(country=country,region=region)

Then split the data by the region column and collect the countries in each group. tapply would also work, but I will use dlply from the plyr package, since it returns a list that keeps the region names.

> library(plyr)
> ll<-dlply(df,~region,function(d)paste(d$country,collapse=","))
> ll
$`Asian Tigers`
[1] "Hong Kong,Taiwan"

$`East Asia & Pacific`
[1] "China,Indonesia"

attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
               region
1        Asian Tigers
2 East Asia & Pacific

Now convert the list to a data.frame using do.call. To keep the region names unchanged as column names (spaces and "&" included), we need to pass the argument check.names=FALSE:

> ll$check.names <- FALSE
> do.call("data.frame",ll)
      Asian Tigers East Asia & Pacific
1 Hong Kong,Taiwan     China,Indonesia
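
To see what check.names=FALSE changes, here is a small sketch that does not need plyr. The list is typed in by hand to mirror the dlply result, and default_names and kept_names are hypothetical variable names used only for this illustration.

```r
# Hand-written stand-in for the list produced by dlply above
ll <- list(`Asian Tigers` = "Hong Kong,Taiwan",
           `East Asia & Pacific` = "China,Indonesia")

# By default data.frame turns the names into syntactically valid ones
default_names <- names(do.call("data.frame", ll))

# With check.names = FALSE the original names are kept as they are
kept_names <- names(do.call("data.frame", c(ll, check.names = FALSE)))

default_names
kept_names
```

Without the argument the spaces and the "&" are replaced by dots, which is why it is set before calling do.call.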