Improving performance for graph connectedness computation

I am writing a program that generates a random graph and checks whether it is connected. The code is below. Some explanation: I place a number of points at random locations in the plane and then connect them, but NOT based on proximity alone. A node is more likely to be connected to nodes that are closer, and whether a link exists is determined by the distance together with a random variable (h_sq in the code). I generate all links (they are symmetric, i.e., if i can talk to j, then j can talk to i) and then run a BFS to check whether the graph is connected.

The code seems to work properly. My problem is that once the number of nodes exceeds ~2000 it becomes terribly slow, and I need to run this function many times for simulation purposes. I even tried other graph libraries, but the performance is the same. Does anybody know how I could speed this up?

Thanks,

int Graph::gen_links() {
    if( save == true ) { // in case I want to store the structure of the graph
        links.clear();
        links.resize(xy.size());
    }

    double h_sq, d;
    vector< vector<luint> > neighbors(xy.size());

    // generate links
    double tmp = snr_lin / gamma_0_lin;
    // xy is a std vector of pairs containing the nodes' locations
    for(luint i = 0; i < xy.size(); i++) {
        for(luint j = i+1; j < xy.size(); j++) {
            // generate |h|^2
            d = distance(i, j);
            if( d < d_crit ) // for sim purposes
                d = 1.0;
            h_sq = pow(mrand.randNorm(0, 1), 2.0) + pow(mrand.randNorm(0, 1), 2.0);
            if( h_sq * tmp  >= pow(d, alpha) ) {
                // there exists a link between i and j
                neighbors[i].push_back(j);
                neighbors[j].push_back(i);
                // options
                if( save == true )
                    links.push_back( make_pair(i, j) );
            }
        }
        if( neighbors[i].empty() && save == false  ) {
            // Node i has no neighbors, so the graph is not connected. Since
            // save == false I don't need to store the structure, hence I exit.
            connected = 0; 
            return 1;  
        }
    }

    // here I do BFS to check whether the graph is connected or not, using neighbors
    // BFS code...
    return 1;
}

UPDATE: The main problem seems to be the push_back calls inside the inner for loop; that is the part that takes most of the time in this case. Should I use reserve() to improve efficiency?


Are you sure the slowness is caused by the graph generation and not by your search algorithm?
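For comparison, a BFS over adjacency lists visits each node and each edge once, so it should be cheap next to the O(n^2) generation loop. Here is a minimal sketch, assuming an adjacency-list layout like `neighbors` in the question; the function name `is_connected` is made up for illustration, and I use std::size_t in place of the question's luint.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical helper: returns true if every node is reachable from node 0.
// neighbors[i] lists the nodes adjacent to i, as in the question's code.
bool is_connected(const std::vector<std::vector<std::size_t>>& neighbors) {
    const std::size_t n = neighbors.size();
    if (n == 0) return true;  // convention chosen here: empty graph is connected

    std::vector<char> visited(n, 0);
    std::queue<std::size_t> frontier;
    visited[0] = 1;
    frontier.push(0);
    std::size_t reached = 1;  // number of nodes discovered so far

    while (!frontier.empty()) {
        const std::size_t u = frontier.front();
        frontier.pop();
        for (std::size_t v : neighbors[u]) {
            if (!visited[v]) {
                visited[v] = 1;
                ++reached;
                frontier.push(v);
            }
        }
    }
    return reached == n;
}
```

If timing this separately shows it is negligible, the generation loop is the place to optimize.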

The graph generation is O(n^2), and there is not much you can do about that. However, you can trade memory for time if the point locations stay fixed for at least some of the experiments.

First, the distances of all node pairs, and pow(d, alpha), can be precomputed and kept in memory so that you don't need to compute them again and again. The extra memory cost for 10000 nodes will be about 800 MB for double and 400 MB for float.
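A rough sketch of that precomputation, assuming xy is a std::vector of (x, y) pairs as the question describes. The struct `PairThresholds` and the function `precompute_thresholds` are names I made up. Storing only the pairs with i < j needs n*(n-1)/2 entries, roughly half of the full matrix.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container: pow(d, alpha) for every pair i < j, stored in a
// flat upper-triangular array with n*(n-1)/2 entries.
struct PairThresholds {
    std::size_t n;
    std::vector<double> values;

    // Position of pair (i, j), i < j, in the flat array.
    std::size_t index(std::size_t i, std::size_t j) const {
        return i * n - i * (i + 1) / 2 + (j - i - 1);
    }
    double at(std::size_t i, std::size_t j) const { return values[index(i, j)]; }
};

// Computes the thresholds once for a fixed set of locations. The d_crit
// clamp mirrors the question: distances below d_crit are replaced by 1.0.
PairThresholds precompute_thresholds(
        const std::vector<std::pair<double, double>>& xy,
        double alpha, double d_crit) {
    const std::size_t n = xy.size();
    PairThresholds t{n, {}};
    t.values.reserve(n < 2 ? 0 : n * (n - 1) / 2);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            double d = std::hypot(xy[i].first - xy[j].first,
                                  xy[i].second - xy[j].second);
            if (d < d_crit) d = 1.0;
            t.values.push_back(std::pow(d, alpha));
        }
    }
    return t;
}
```

The inner loop of gen_links would then compare h_sq * tmp against t.at(i, j) instead of calling distance() and pow() on every run.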

In addition, the sum of squares of normal variables follows a chi-square distribution, if I remember correctly. You could probably use a precomputed lookup table if the accuracy allows it.
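For this particular case a table may not even be needed: the sum of squares of two independent standard normals is chi-square with two degrees of freedom, which is the same as an exponential distribution with mean 2. Below is a sketch of what that allows, assuming the two randNorm(0, 1) draws are independent and that tmp is positive. The function names are made up, and a standard `<random>` engine stands in for the question's mrand, whose interface isn't shown.

```cpp
#include <cmath>
#include <random>

// Draws |h|^2 = X^2 + Y^2 (X, Y independent standard normals) with a single
// exponential variate instead of two normal variates and two pow() calls.
template <class Rng>
double sample_h_sq(Rng& rng) {
    std::exponential_distribution<double> dist(0.5);  // rate 1/2, mean 2
    return dist(rng);
}

// Equivalent link test without sampling h_sq explicitly, using
// P(h_sq * tmp >= threshold) = exp(-threshold / (2 * tmp)) for tmp > 0.
template <class Rng>
bool link_exists(Rng& rng, double tmp, double threshold) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    return unif(rng) < std::exp(-threshold / (2.0 * tmp));
}
```

Here threshold plays the role of pow(d, alpha) in the question. The results will not match the original code draw for draw, only in distribution.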

Finally, if the probability that two nodes are connected becomes very small once the distance exceeds some value, then you don't need the full O(n^2) loop: you can evaluate only those node pairs whose distance is below some limit.
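One way to find only the nearby pairs is to bucket the points into a uniform grid and compare each point with its own and adjacent cells. Here is a sketch, assuming a cutoff r_max > 0 beyond which you are willing to treat the link probability as zero (that is an approximation and changes the model slightly). The function name `candidate_pairs` is made up.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns all pairs (i, j), i < j, whose distance is at
// most r_max, using a uniform grid so that only nearby cells are compared.
std::vector<std::pair<std::size_t, std::size_t>>
candidate_pairs(const std::vector<std::pair<double, double>>& xy, double r_max) {
    std::vector<std::pair<std::size_t, std::size_t>> out;
    if (xy.empty()) return out;

    // Bounding box of the points.
    double min_x = xy[0].first, max_x = xy[0].first;
    double min_y = xy[0].second, max_y = xy[0].second;
    for (const auto& p : xy) {
        min_x = std::min(min_x, p.first);
        max_x = std::max(max_x, p.first);
        min_y = std::min(min_y, p.second);
        max_y = std::max(max_y, p.second);
    }

    // Cells of side r_max: a pair within r_max is in the same or adjacent cells.
    const std::size_t nx = static_cast<std::size_t>((max_x - min_x) / r_max) + 1;
    const std::size_t ny = static_cast<std::size_t>((max_y - min_y) / r_max) + 1;
    auto cell_x = [&](double x) { return static_cast<std::size_t>((x - min_x) / r_max); };
    auto cell_y = [&](double y) { return static_cast<std::size_t>((y - min_y) / r_max); };

    std::vector<std::vector<std::size_t>> cells(nx * ny);
    for (std::size_t i = 0; i < xy.size(); ++i)
        cells[cell_y(xy[i].second) * nx + cell_x(xy[i].first)].push_back(i);

    for (std::size_t i = 0; i < xy.size(); ++i) {
        const std::size_t cx = cell_x(xy[i].first), cy = cell_y(xy[i].second);
        for (std::size_t gy = (cy > 0 ? cy - 1 : 0); gy <= std::min(cy + 1, ny - 1); ++gy) {
            for (std::size_t gx = (cx > 0 ? cx - 1 : 0); gx <= std::min(cx + 1, nx - 1); ++gx) {
                for (std::size_t j : cells[gy * nx + gx]) {
                    if (j <= i) continue;  // report each pair once
                    const double d = std::hypot(xy[i].first - xy[j].first,
                                                xy[i].second - xy[j].second);
                    if (d <= r_max) out.emplace_back(i, j);
                }
            }
        }
    }
    return out;
}
```

The random link test then runs only over the returned pairs. Note that the grid has nx * ny cells, so a very small r_max relative to the area wastes memory on empty cells.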


As a first step, you should try using reserve() for both the inner and outer vectors.
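Something along these lines, as a sketch. The helper name `make_neighbors` and the parameter `expected_degree` are made up; the latter is an estimate of the average number of neighbors per node that you would have to measure from earlier runs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the adjacency container with capacity
// reserved up front, so push_back rarely has to reallocate.
std::vector<std::vector<std::size_t>>
make_neighbors(std::size_t n_nodes, std::size_t expected_degree) {
    // The outer vector is sized once, like neighbors(xy.size()) in the question.
    std::vector<std::vector<std::size_t>> neighbors(n_nodes);
    for (auto& list : neighbors)
        list.reserve(expected_degree);  // one allocation per node, done up front
    return neighbors;
}
```

The same applies to links when save is true: a single reserve() call before the loops, sized to the expected number of edges.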

If this does not bring performance up to your expectations, I believe it is because of memory allocations that are still happening.

There is a handy class I've used in similar situations, llvm::SmallVector (search for it on Google). It provides a vector with a few pre-allocated items, so you can reduce the number of allocations by one per vector. It can still grow when it runs out of pre-allocated space.

So: 1) Examine how many items your vectors hold on average during runs (I'm talking about both the inner and outer vectors). 2) Use llvm::SmallVector with a pre-allocation of that size (since the vector is allocated on the stack, you might need to increase the stack size, or reduce the pre-allocation if you are restricted on available stack memory).

Another good thing about SmallVector is that it has almost the same interface as std::vector, so it can easily be substituted for it.
