Social networks and genetic algorithms in R
I am trying to implement a network core-periphery measure from an article (link: Borgatti & Everett 2000) in R. The basic approach applied by the authors is to:
1. Arrange the rows and columns of the network matrix so that actors that are well connected to each other occupy the top left corner.
2. Create an ideal pattern matrix based on the row/column arrangement in step 1.
3. Assess the correlation between the two matrices.
According to the authors, the trick in step 1 is to find the row/column arrangement of the matrix that correlates most highly with its induced pattern matrix, and they recommend using a genetic algorithm to find that arrangement. I am stuck at the first steps of the algorithm:
1. How do I create, in R, random row/column matrix arrangements that preserve the order of the column/row entries?
2. Once I have assessed the fit between the matrix arrangements and the pattern matrices, how do I "breed" new matrix arrangements based on the "fittest" matrices?
Thanks.
Given a matrix of a particular size, you can generate random row/column arrangements as follows:
#create fake data
mydata.block1 <- matrix (rep(1, times=100), ncol=10)
mydata.block2 <- matrix (rep(0, times=900), ncol=90)
mydata.block3 <- matrix (rep(0, times=900), ncol=10)
mydata.block4 <- matrix (rep(1, times=8100), ncol=90)
mydata <- rbind(cbind (mydata.block1, mydata.block2), cbind (mydata.block3, mydata.block4))
#Mix mydata
mix.order <- sample(1:dim (mydata)[1])
mydata <- mydata[mix.order,mix.order]
#create 100 random orderings
##preallocate matrix
rand.samp <- matrix (rep(NA, times=10000), ncol=100)
##create orderings
for (i in 1:100){
rand.samp[i,] <- sample(1:dim (mydata)[1])
}
##Eliminate duplicate orderings (unlikely to occur)
rand.samp <- unique (rand.samp)
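#Reorder and measure fitness
##preallocate fitness measure (unique() may have dropped rows, so use nrow)
fit.meas <- rep (NA, times=nrow(rand.samp))
for (i in seq_len(nrow(rand.samp))){
mydata.reordered <- mydata[rand.samp[i,],rand.samp[i,]]
##myfitnessfunc is a placeholder for your own fitness function
fit.meas[i] <- myfitnessfunc(mydata.reordered)
}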
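myfitnessfunc is not defined in the code above. Here is a minimal sketch of what it could look like, assuming an ideal pattern in which a tie is expected whenever at least one of the two actors is in the core, and assuming the core size is known in advance. Both the pattern and the core.size argument are my assumptions, not something taken from the article, so adjust them to the pattern you actually want.

```r
# Hypothetical fitness function: correlation between a (reordered) adjacency
# matrix and an assumed ideal core-periphery pattern.
# core.size is an assumed parameter: the first core.size rows/columns are
# treated as the core.
myfitnessfunc <- function(m, core.size = 10) {
  n <- nrow(m)
  is.core <- seq_len(n) <= core.size
  # Assumed pattern: 1 if either endpoint is in the core, 0 otherwise
  pattern <- outer(is.core, is.core, "|") * 1
  # Leave the diagonal (self-ties) out of the correlation
  off.diag <- row(m) != col(m)
  cor(m[off.diag], pattern[off.diag])
}
```

Note that cor() returns NA (with a warning) if either the matrix or the pattern is constant off the diagonal, for example when core.size equals the number of actors.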
After you have measured fitness, you will need some way to determine which areas are contributing to the fitness and fix those while altering other areas ("breed"). Perhaps dist() will be of some use. Maybe a heatmap or clustering, hclust(), would also be of use? Can you provide more detail on how you would determine localized fitness?
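One way to keep part of an ordering fixed while altering the rest is a crossover operator for permutations. The sketch below is a simplified order crossover plus a swap mutation, which are common genetic-algorithm operators and not something prescribed by the article; the function names are made up for illustration. The child keeps a random slice of the first parent in place and fills the other positions with the remaining actors in the order they appear in the second parent, so the result is always a valid ordering.

```r
# Simplified order crossover for two permutations p1 and p2 of the same set.
order_crossover <- function(p1, p2) {
  n <- length(p1)
  cut <- sort(sample(n, 2))          # two distinct cut points
  keep <- cut[1]:cut[2]              # positions inherited from p1
  child <- rep(NA_integer_, n)
  child[keep] <- p1[keep]
  # Fill the remaining positions with the missing values, in p2's order
  child[-keep] <- p2[!(p2 %in% p1[keep])]
  child
}

# Swap mutation: exchange two randomly chosen positions.
swap_mutation <- function(p) {
  idx <- sample(length(p), 2)
  p[idx] <- p[rev(idx)]
  p
}

# Example: breed one child from two random orderings of 10 actors
p1 <- sample(10)
p2 <- sample(10)
child <- swap_mutation(order_crossover(p1, p2))
```

With the objects from the code above, the parents would be two rows of rand.samp chosen in favour of high fit.meas values, and the child would be used as mydata[child, child].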
OneWhoIsUnnamed's response matches how I interpreted your question #1.
Here's a fitness based recombination method for two adjacency matrices, #2:
Say you have two matrices, A and B, that have fitness scores Fa and Fb of 2.3 and 1.1, respectively. Breed the matrices by constructing a new matrix, M, where M_{ij} = A_{ij} with probability Fa/(Fa+Fb) or M_{ij} = B_{ij} with probability 1-Fa/(Fa+Fb). This is just one of many possible ways of breeding matrices. In the code below, C is a symmetric random mask that selects which entries are taken from A, and M is the mated result of A and B based on their fitness.
# let's define a function to create random adjacency matrices
random_adjacent <- function(dimension)
{
ret <- matrix(runif(dimension^2)>0.5,dimension,dimension)
retl <- ret * lower.tri(ret)
return( retl + t(retl) )
}
# set fitness
Fa <- 2.3
Fb <- 1.1
# initialize matrices
A <- random_adjacent(4)
B <- random_adjacent(4)
# compute symmetric fitness probability matrix
C <- matrix(runif(16)<Fa/(Fa+Fb),4,4)
Cl <- C * lower.tri(C) # take the lower triangular portion
C <- (Cl + t(Cl)) > 0 # reflect the lower triangle into the upper; keep C logical so it works as a mask
# compute mated result
M <- matrix(0,4,4)
M[C] <- A[C]
M[!C] <- B[!C]
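The same steps can be wrapped in a function that works for any matrix size. This is only a sketch: mate_matrices is a made-up name, and it assumes A and B are symmetric square matrices of the same size and that both fitness values are positive (a correlation-based fitness can be negative, in which case it would need to be rescaled first).

```r
# Hypothetical helper: mate two symmetric adjacency matrices, taking each tie
# from A with probability Fa / (Fa + Fb) and from B otherwise.
mate_matrices <- function(A, B, Fa, Fb) {
  stopifnot(identical(dim(A), dim(B)), nrow(A) == ncol(A), Fa > 0, Fb > 0)
  n <- nrow(A)
  # Logical mask: TRUE where the entry should come from A
  from.A <- matrix(runif(n^2) < Fa / (Fa + Fb), n, n)
  # Mirror the lower triangle into the upper so the mask is symmetric
  from.A[upper.tri(from.A)] <- t(from.A)[upper.tri(from.A)]
  M <- B
  M[from.A] <- A[from.A]
  M
}

# Example with two small symmetric 0/1 matrices
A <- matrix(1, 5, 5); diag(A) <- 0
B <- matrix(0, 5, 5)
M <- mate_matrices(A, B, Fa = 2.3, Fb = 1.1)
```

Keep in mind that this mixes ties from two different matrices, so M is not necessarily a reordering of the original network.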