Manipulating a data frame with contents from a different data frame
I have a large data.frame
with geographic places names (mydata
). The plac开发者_高级运维es appear in the data.frame
in more than one cell.
In an other file with 3 columns I have all these places (first column), the latitude of these places (second the places appear in the data.frame
in more than one cell). In an other file with 3 columns I have all these places (first column), the latitude of these places (second column) and the longitude in the third column.
I want to create two other matrices (Lat
and Long
) with dimension equal to
the dimension of the data.frame
with the geographic places such that:
Lat[i,j] = Latitude of the place in mydata[i,j]
long[i,j]= longitude of the place in mydata[i,j]
So I am looking for a procedure that go through mydata
, pick the name of the place in each cell, look for the Latitude and Longitude in the second file and fill these values in the matrices Lat
and Long
.
mydata <- data.frame(cbind(c("xyz","ab","yabc",NA)),
c("xyz","xyz","yabc","ab")),
c("ab","ab",NA,"yabc")))
Coor <- data.frame(cbind(c("ab","xyz","yabc"),
c(31.34,42.15,36.98),
c(12.87,13.67,18.56)))
I want a method to get
Lat <- data.frame(cbind(c(42.15,31.34,36.98,NA),
c(42.15,42.15,36.98,31.34),
c(31.34,31.34,NA,36.98)))
Something like this:
Lat<-do.call(cbind, lapply(mydata, function(curcol){ Coor[match(curcol, Coor[,1]), 2] }))
That OK?
Here's a simple procedure the produces what you want. There may be a way to do this without an explicit for
loop using *apply
and indices, but this is readable.
#Define columns explicitly, avoiding cbind and the resulting coercion to characters
mydata <- data.frame(X1=c("xyz","ab","yabc",NA),X2= c("xyz","xyz","yabc","ab"), X3=c("ab","ab",NA,"yabc"))
Coor <- data.frame(X1=c("ab","xyz","yabc"),X2=c(31.34,42.15,36.98),X3=c(12.87,13.67,18.56))
Lat <- data.frame(cbind( c(42.15,31.34,36.98,NA),c(42.15,42.15,36.98,31.34),c(31.34,31.34,NA,36.98)))
#Create the new lat/long matrices to hold the result
Lat1 <- matrix(NA,nrow=nrow(mydata),ncol=ncol(mydata))
Long1 <- matrix(NA,nrow=nrow(mydata),ncol=ncol(mydata))
for (i in 1:ncol(mydata)){
Lat1[,i] <- Coor[match(mydata[,i],Coor$X1),2]
Long1[,i] <- Coor[match(mydata[,i],Coor$X1),3]
}
Comparing your desired output:
Lat
X1 X2 X3
1 42.15 42.15 31.34
2 31.34 42.15 31.34
3 36.98 36.98 NA
4 NA 31.34 36.98
Lat1
[,1] [,2] [,3]
[1,] 42.15 42.15 31.34
[2,] 31.34 42.15 31.34
[3,] 36.98 36.98 NA
[4,] NA 31.34 36.98
And this is what this solution produces in Long1
:
Long1
[,1] [,2] [,3]
[1,] 13.67 13.67 12.87
[2,] 12.87 13.67 12.87
[3,] 18.56 18.56 NA
[4,] NA 12.87 18.56
精彩评论