开发者

Manipulating a data frame with contents from a different data frame

I have a large data.frame with geographic places names (mydata). The plac开发者_高级运维es appear in the data.frame in more than one cell.

In an other file with 3 columns I have all these places (first column), the latitude of these places (second the places appear in the data.frame in more than one cell). In an other file with 3 columns I have all these places (first column), the latitude of these places (second column) and the longitude in the third column.

I want to create two other matrices (Lat and Long) with dimension equal to the dimension of the data.frame with the geographic places such that:

Lat[i,j] = Latitude of the place in mydata[i,j]
long[i,j]= longitude of the place in mydata[i,j]

So I am looking for a procedure that go through mydata, pick the name of the place in each cell, look for the Latitude and Longitude in the second file and fill these values in the matrices Lat and Long.

mydata <- data.frame(cbind(c("xyz","ab","yabc",NA)), 
                           c("xyz","xyz","yabc","ab")),
                           c("ab","ab",NA,"yabc")))
Coor   <- data.frame(cbind(c("ab","xyz","yabc"), 
                           c(31.34,42.15,36.98), 
                           c(12.87,13.67,18.56)))

I want a method to get

Lat <- data.frame(cbind(c(42.15,31.34,36.98,NA),
                        c(42.15,42.15,36.98,31.34),
                        c(31.34,31.34,NA,36.98)))


Something like this:

Lat<-do.call(cbind, lapply(mydata, function(curcol){ Coor[match(curcol, Coor[,1]), 2]  }))

That OK?


Here's a simple procedure the produces what you want. There may be a way to do this without an explicit for loop using *apply and indices, but this is readable.

#Define columns explicitly, avoiding cbind and the resulting coercion to characters
mydata <- data.frame(X1=c("xyz","ab","yabc",NA),X2= c("xyz","xyz","yabc","ab"), X3=c("ab","ab",NA,"yabc"))
Coor   <- data.frame(X1=c("ab","xyz","yabc"),X2=c(31.34,42.15,36.98),X3=c(12.87,13.67,18.56))

Lat <- data.frame(cbind( c(42.15,31.34,36.98,NA),c(42.15,42.15,36.98,31.34),c(31.34,31.34,NA,36.98)))

#Create the new lat/long matrices to hold the result
Lat1 <- matrix(NA,nrow=nrow(mydata),ncol=ncol(mydata))
Long1 <- matrix(NA,nrow=nrow(mydata),ncol=ncol(mydata))

for (i in 1:ncol(mydata)){
    Lat1[,i] <- Coor[match(mydata[,i],Coor$X1),2]
    Long1[,i] <- Coor[match(mydata[,i],Coor$X1),3]
}

Comparing your desired output:

Lat
     X1    X2    X3
1 42.15 42.15 31.34
2 31.34 42.15 31.34
3 36.98 36.98    NA
4    NA 31.34 36.98

Lat1
      [,1]  [,2]  [,3]
[1,] 42.15 42.15 31.34
[2,] 31.34 42.15 31.34
[3,] 36.98 36.98    NA
[4,]    NA 31.34 36.98

And this is what this solution produces in Long1:

Long1
      [,1]  [,2]  [,3]
[1,] 13.67 13.67 12.87
[2,] 12.87 13.67 12.87
[3,] 18.56 18.56    NA
[4,]    NA 12.87 18.56
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜