R table with country names in cells
My aim is to create a table that summarizes the countries featured in my sample. This table should have only two rows: a first row with a separate column for each region, and a second row with the names of the countries located in the respective region.
To give you an example, this is what my data.frame XYZ looks like:
    wvs5red2.s003names   wvs5red2.regiondummies
21  "Hong Kong"          Asian Tigers
45  "South Korea"        Asian Tigers
49  "Taiwan"             Asian Tigers
66  "China"              East Asia & Pacific
80  "Indonesia"          East Asia & Pacific
86  "Malaysia"           East Asia & Pacific
My aim is to obtain a table that looks similar to this:
region      Asian Tigers                     East Asia & Pacific
countries   Hong Kong, South Korea, Taiwan   China, Indonesia, etc.
Do you have any idea how to obtain such a table? I have spent hours searching for something similar.
The simplest way is tapply:
XYZ <- structure(list(
names = structure(c(2L, 5L, 6L, 1L, 3L, 4L), .Label = c("China", "Hong Kong", "Indonesia", "Malaysia", "South Korea", "Taiwan"), class = "factor"),
region = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Asian Tigers", "East Asia & Pacific"), class = "factor")),
.Names = c("names", "region"), row.names = c(NA, -6L), class = "data.frame")
tapply(XYZ$names, XYZ$region, paste, collapse=", ")
#                     Asian Tigers              East Asia & Pacific
# "Hong Kong, South Korea, Taiwan"     "China, Indonesia, Malaysia"
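The tapply result above is a named vector rather than the two-row table asked for in the question. As a minimal base R sketch (the object names countries and tab are illustrative and not part of the answer, and the data are rebuilt here with character columns), the region labels and the collapsed country strings can be stacked into a two-row character matrix:

```r
# Rebuild a small version of the example data (character columns assumed).
XYZ <- data.frame(
  names  = c("Hong Kong", "South Korea", "Taiwan",
             "China", "Indonesia", "Malaysia"),
  region = rep(c("Asian Tigers", "East Asia & Pacific"), each = 3),
  stringsAsFactors = FALSE
)

# One comma-separated string of countries per region.
countries <- tapply(XYZ$names, XYZ$region, paste, collapse = ", ")

# Stack the region labels and the country strings into a two-row matrix.
tab <- rbind(region = names(countries), countries = as.vector(countries))
print(tab, quote = FALSE)
```

A character matrix is used here because both rows hold plain text; if a data.frame is needed instead, the regions have to become column names rather than a row.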
Recreate the data:
dat <- data.frame(
country = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
region = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3))
)
dat
      country              region
1   Hong Kong        Asian Tigers
2 South Korea        Asian Tigers
3      Taiwan        Asian Tigers
4       China East Asia & Pacific
5   Indonesia East Asia & Pacific
6    Malaysia East Asia & Pacific
Use ddply from the plyr package, combined with paste, to summarise the data:
library(plyr)
ddply(dat, .(region), function(x)paste(x$country, collapse= ","))
               region                           V1
1        Asian Tigers Hong Kong,South Korea,Taiwan
2 East Asia & Pacific     China,Indonesia,Malaysia
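If installing plyr is not an option, roughly the same summary can probably be obtained with base R's aggregate. This is only a sketch of an alternative, not the method used in the answer; the result name agg is made up for illustration, and the data are rebuilt with character columns so the block runs on its own:

```r
# Same example data as in the answer, with character columns assumed.
dat <- data.frame(
  country = c("Hong Kong", "South Korea", "Taiwan",
              "China", "Indonesia", "Malaysia"),
  region  = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3)),
  stringsAsFactors = FALSE
)

# Group country by region; the extra argument collapse is passed on to paste.
agg <- aggregate(country ~ region, data = dat, FUN = paste, collapse = ", ")
print(agg)
```

Unlike the ddply call, the summarised column keeps the name country instead of V1.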
First, create the data:
> country<-c("Hong Kong","Taiwan","China","Indonesia")
> region<-rep(c("Asian Tigers","East Asia & Pacific"),each=2)
> df<-data.frame(country=country,region=region)
Then run through the region column and gather all the countries. We could use tapply, but I will use dlply from the plyr package, since it retains the list names.
> ll<-dlply(df,~region,function(d)paste(d$country,collapse=","))
> ll
$`Asian Tigers`
[1] "Hong Kong,Taiwan"
$`East Asia & Pacific`
[1] "China,Indonesia"
attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
region
1 Asian Tigers
2 East Asia & Pacific
Now convert the list to a data.frame using do.call. Since we want the column names to stay exactly as they are, we need to pass the argument check.names=FALSE:
> ll$check.names <- FALSE
> do.call("data.frame",ll)
      Asian Tigers East Asia & Pacific
1 Hong Kong,Taiwan     China,Indonesia
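The same idea can be sketched without plyr and without adding check.names to the list itself. The version below is an assumption-laden variant, not the answer's code: it builds the named list with base R's split and lapply, and passes the extra arguments to data.frame by appending them inside the do.call argument list (the name wide is illustrative):

```r
# Same small example as above, with character columns assumed.
df <- data.frame(
  country = c("Hong Kong", "Taiwan", "China", "Indonesia"),
  region  = rep(c("Asian Tigers", "East Asia & Pacific"), each = 2),
  stringsAsFactors = FALSE
)

# Named list: one collapsed string of countries per region.
ll <- lapply(split(df$country, df$region), paste, collapse = ",")

# Append the data.frame options to the argument list instead of modifying ll.
wide <- do.call(
  data.frame,
  c(ll, list(check.names = FALSE, stringsAsFactors = FALSE))
)
print(wide)
```

Keeping check.names out of ll means the list still contains only the regions, which helps if it is reused later.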