Is there an equivalent of the Unix "comm" command in R?
I have one master file with a list of unique IDs and want to align three files containing subsets of those IDs alongside it, ending up with: Column 1 (id1, id2, id3, id4, etc.), Column 2 (space, id2, space, space), Column 3 (id1, id2, space, space), Column 4 (id1, space, id3, space), etc. I have the unique list in R, and the "comm" command in Unix seems to do this. Is there an equivalent in R?
The structure of your data is not very clear, but suppose you start with the following vectors:
R> master <- paste("id",1:10,sep="")
R> sub1 <- paste("id",c(2,3,5),sep="")
R> sub2 <- paste("id",c(1,4,8,9),sep="")
R> master
[1] "id1" "id2" "id3" "id4" "id5" "id6" "id7" "id8" "id9" "id10"
R> sub1
[1] "id2" "id3" "id5"
R> sub2
[1] "id1" "id4" "id8" "id9"
You can create a data frame from your master list of ids and use these ids as row names:
R> df <- data.frame(master=master, row.names=master)
R> df
     master
id1     id1
id2     id2
id3     id3
id4     id4
id5     id5
id6     id6
id7     id7
id8     id8
id9     id9
id10   id10
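Then you can add a new column for each subset by indexing the rows by name. Only the rows whose names appear in the subset are assigned; all other rows are filled with NA: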
R> df[sub1, "sub1"] <- sub1
R> df[sub2, "sub2"] <- sub2
This gives the following result:
R> df
     master sub1 sub2
id1     id1 <NA>  id1
id2     id2  id2 <NA>
id3     id3  id3 <NA>
id4     id4 <NA>  id4
id5     id5  id5 <NA>
id6     id6 <NA> <NA>
id7     id7 <NA> <NA>
id8     id8 <NA>  id8
id9     id9 <NA>  id9
id10   id10 <NA> <NA>
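If you have several subsets and would rather get blanks than NA, as in the layout described in the question, the same alignment can be sketched with `%in%` and `ifelse`. This is only one possible variant, not the method used above; the names `subsets`, `cols` and `aligned` are made up for illustration, and the vectors are the same example data as before.

```r
# Same example vectors as above
master <- paste("id", 1:10, sep = "")
subsets <- list(
  sub1 = paste("id", c(2, 3, 5), sep = ""),
  sub2 = paste("id", c(1, 4, 8, 9), sep = "")
)

# For each subset, keep the id where it occurs in master, otherwise ""
cols <- lapply(subsets, function(s) ifelse(master %in% s, master, ""))

# Bind the master column and the aligned subset columns together
aligned <- data.frame(master = master, cols, stringsAsFactors = FALSE)
aligned
```

Because the subsets are held in a named list, a third or fourth subset only needs one more list entry. Note that ids present in a subset but missing from master are silently dropped here.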