Subset a data frame based on another vector
Here is my question: the first data frame is produced inside a function, and I want to use it to subset a bigger data frame (dataframe 2).
# dataframe1
loc <- c(paste('Loc', 1:9, sep = ''))
qit <- c(13, 27, 16, 14, 15, 21, 12, 11, 8)
mydf <- data.frame(loc, qit)
loc qit
1 Loc1 13
2 Loc2 27
3 Loc3 16
4 Loc4 14
5 Loc5 15
6 Loc6 21
7 Loc7 12
8 Loc8 11
9 Loc9 8
#dataframe 2
loc <- c(paste('Loc', 1:9, sep = ''))
vloc <- c(rep(loc, each=2))
allele <- c(
13, 12, 27, 20, 16, 18,
14, 17, 15, 22, 21, 26,
12, 14, 11, 18, 8, 24
)
afreq <- c( 0.308, 0.4, 0.041, 0.5, 0.125, 0.5,
0.139, 0.2, 0.219, 0.2,0.176, 0.33,
0.358, 0.4, 0.274, 0.5, 0.173, 0.15)
loctab <- data.frame(vloc, allele, afreq)
vloc allele afreq
1 Loc1 13 0.308
2 Loc1 12 0.400
3 Loc2 27 0.041
4 Loc2 20 0.500
5 Loc3 16 0.125
6 Loc3 18 0.500
7 Loc4 14 0.139
8 Loc4 17 0.200
9 Loc5 15 0.219
10 Loc5 22 0.200
11 Loc6 21 0.176
12 Loc6 26 0.330
13 Loc7 12 0.358
14 Loc7 14 0.400
15 Loc8 11 0.274
16 Loc8 18 0.500
17 Loc9 8 0.173
18 Loc9 24 0.150
I want to make a new data frame like mydf, with an additional afreq variable taken from dataframe 2. I tried to subset it:
loctab[loctab$allele %in% mydf$qit, ]
vloc allele afreq
1 Loc1 13 0.308
2 Loc1 12 0.400
3 Loc2 27 0.041
5 Loc3 16 0.125
7 Loc4 14 0.139
9 Loc5 15 0.219
11 Loc6 21 0.176
13 Loc7 12 0.358
14 Loc7 14 0.400
15 Loc8 11 0.274
17 Loc9 8 0.173
I did not get what I want. The subset does not take the vloc or loc variable into account: any row whose allele matches any value in qit is kept, whatever its locus (see the extra rows 2 and 14 above). Is there any way to subset with reference to loc or vloc?
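Maybe the merge() function is what you're looking for. Merge on both the locus and the allele value, so that a qit value can only match within its own locus:
mydf2 <- merge(mydf, loctab, by.x = c("loc", "qit"), by.y = c("vloc", "allele"))
You end up with 3 columns (loc, qit, afreq), because the key columns take their names from mydf. Merging on qit and allele alone would give 4 columns and would still match alleles 12 and 14 across different loci.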
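A locus-aware lookup can also be sketched with match() on a combined key instead of merge(). This is only a minimal sketch: it assumes each (locus, allele) pair occurs at most once in loctab, and the names key_mydf, key_loctab and result are illustrative, not from the question.

```r
# Data from the question
loc  <- paste0("Loc", 1:9)
qit  <- c(13, 27, 16, 14, 15, 21, 12, 11, 8)
mydf <- data.frame(loc, qit)

loctab <- data.frame(
  vloc   = rep(loc, each = 2),
  allele = c(13, 12, 27, 20, 16, 18, 14, 17, 15, 22, 21, 26,
             12, 14, 11, 18, 8, 24),
  afreq  = c(0.308, 0.4, 0.041, 0.5, 0.125, 0.5, 0.139, 0.2, 0.219, 0.2,
             0.176, 0.33, 0.358, 0.4, 0.274, 0.5, 0.173, 0.15)
)

# Build one key per row from locus and allele (illustrative names)
key_mydf   <- paste(mydf$loc, mydf$qit, sep = "_")
key_loctab <- paste(loctab$vloc, loctab$allele, sep = "_")

# match() gives, for each row of mydf, the position of the first
# matching key in loctab (NA if there is none)
result <- mydf
result$afreq <- loctab$afreq[match(key_mydf, key_loctab)]
result
```

This keeps mydf in its original row order, and a row of mydf with no matching (locus, allele) pair gets NA in afreq rather than being dropped, which is what merge() does by default.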