Improving performance for graph connectedness computation

2023-02-25 04:01

I am writing a program that generates a graph and checks whether it is connected. The code is below; here is some explanation. I generate a number of points at random locations on the plane. I then connect the nodes, but NOT based on proximity alone: a node is more likely to be connected to nodes that are closer, and whether a link exists is determined by the distance together with a random variable that I use in the code (h_sq). I generate all links (they are symmetric, i.e., if i can talk to j then j can also talk to i) and then run a BFS to check whether the graph is connected.

The code seems to work properly. My problem is that once the number of nodes grows beyond ~2000 it becomes terribly slow, and I need to run this function many times for simulation purposes. I even tried other graph libraries, but the performance was the same. Does anybody know how I could speed things up?

Thanks,

int Graph::gen_links() {
    if( save == true ) { // in case I want to store the structure of the graph
        links.clear();
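        links.reserve(xy.size()); // reserve, not resize: resize would insert xy.size() default pairs ahead of the push_back calls below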
    }

    double h_sq, d;
    vector< vector<luint> > neighbors(xy.size());

    // generate links
    double tmp = snr_lin / gamma_0_lin;
    // xy is a std vector of pairs containing the nodes' locations
    for(luint i = 0; i < xy.size(); i++) {
        for(luint j = i+1; j < xy.size(); j++) {
            // generate |h|^2
            d = distance(i, j);
            if( d < d_crit ) // for sim purposes
                d = 1.0;
            h_sq = pow(mrand.randNorm(0, 1), 2.0) + pow(mrand.randNorm(0, 1), 2.0);
            if( h_sq * tmp  >= pow(d, alpha) ) {
                // there exists a link between i and j
                neighbors[i].push_back(j);
                neighbors[j].push_back(i);
                // options
                if( save == true )
                    links.push_back( make_pair(i, j) );
            }
        }
        if( neighbors[i].empty() && save == false  ) {
        // graph not connected. since save=false i dont need to store the structure, 
        // hence I exit
            connected = 0; 
            return 1;  
        }
    }

    // here I do BFS to check whether the graph is connected or not, using neighbors
    // BFS code...
    return 1;
}

UPDATE: the main problem seems to be the push_back calls inside the inner for loop; that is the part that takes most of the time in this case. Should I use reserve() to increase efficiency?

Are you sure the slowness is caused by the generation and not by your search algorithm?
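For reference, a BFS over adjacency lists is linear in the number of nodes plus links, so it would normally cost far less than an O(n^2) generation loop. The BFS in the question is elided, so the following is only a minimal sketch of what such a check could look like; `is_connected` is a hypothetical name, and it assumes the same `neighbors` layout as in the question (with `std::size_t` standing in for `luint`).

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for the elided BFS: returns true if every node is
// reachable from node 0 in an undirected graph given as adjacency lists.
bool is_connected(const std::vector<std::vector<std::size_t>>& neighbors) {
    const std::size_t n = neighbors.size();
    if (n == 0) return true;  // convention chosen here: the empty graph counts as connected

    std::vector<char> visited(n, 0);
    std::queue<std::size_t> frontier;
    visited[0] = 1;
    frontier.push(0);
    std::size_t reached = 1;  // number of nodes visited so far

    while (!frontier.empty()) {
        const std::size_t u = frontier.front();
        frontier.pop();
        for (std::size_t v : neighbors[u]) {
            assert(v < n);  // neighbor indices must be valid node ids
            if (!visited[v]) {
                visited[v] = 1;
                ++reached;
                frontier.push(v);
            }
        }
    }
    return reached == n;
}
```

Timing this call separately from the link generation (for example with `std::chrono::steady_clock`) would show which of the two phases actually dominates.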

The graph generation is O(n^2), and there is not much you can do about that. However, if the point locations are fixed for at least some of the experiments, you can trade memory for time.

First, the distances of all node pairs, and pow(d, alpha), can be precomputed and kept in memory so that you don't need to compute them again and again. The extra memory cost for 10000 nodes (a full table of 10^8 entries) will be about 800 MB for double and 400 MB for float.
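A minimal sketch of such a cache, under some assumptions: `PathLossTable` and `build_table` are hypothetical names, `distance(i, j)` in the question is taken to be the Euclidean distance, and `xy` is taken to hold `std::pair<double, double>`. Because the links are symmetric, only the pairs with i < j are stored, which needs roughly half the memory of a full n x n table.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical cache of pow(d_ij, alpha) for all pairs i < j, stored as a
// flat upper triangle so only n*(n-1)/2 values are kept.
struct PathLossTable {
    std::size_t n = 0;
    std::vector<double> value;

    // Position of pair (i, j), i < j, in the flat array.
    std::size_t index(std::size_t i, std::size_t j) const {
        assert(i < j && j < n);
        return i * n - i * (i + 1) / 2 + (j - i - 1);
    }
    double at(std::size_t i, std::size_t j) const { return value[index(i, j)]; }
};

// Fills the table once; reuse it for every run that keeps the same locations.
PathLossTable build_table(const std::vector<std::pair<double, double>>& xy,
                          double alpha, double d_crit) {
    PathLossTable t;
    t.n = xy.size();
    t.value.reserve(t.n < 2 ? 0 : t.n * (t.n - 1) / 2);
    for (std::size_t i = 0; i < t.n; ++i) {
        for (std::size_t j = i + 1; j < t.n; ++j) {
            // Assumption: Euclidean distance, as distance(i, j) is not shown.
            double d = std::hypot(xy[i].first - xy[j].first,
                                  xy[i].second - xy[j].second);
            if (d < d_crit) d = 1.0;  // same clamp as in the question's loop
            t.value.push_back(std::pow(d, alpha));
        }
    }
    return t;
}
```

Inside the double loop, the test then becomes `h_sq * tmp >= table.at(i, j)`, with no `pow` or distance call per pair.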

In addition, the sum of squares of standard normal variables follows a chi-square distribution, if I remember correctly. If the accuracy allows it, you could probably use a precomputed lookup table.
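To make this concrete: the sum of squares of two independent standard normals is chi-square with 2 degrees of freedom, which is the same as an exponential distribution with mean 2. So one exponential draw can stand in for the two normal draws and two `pow` calls, and the probability that h_sq exceeds a threshold t has the closed form exp(-t/2). The sketch below uses the standard `<random>` facilities rather than the question's `mrand` object (whose interface is not shown); `draw_h_sq` and `link_probability` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Draws |h|^2 = X^2 + Y^2 for independent standard normals X and Y using a
// single exponential variate with rate 0.5 (mean 2).
template <class Rng>
double draw_h_sq(Rng& rng) {
    std::exponential_distribution<double> exp_mean_two(0.5);
    return exp_mean_two(rng);
}

// Probability that h_sq >= t for t >= 0, i.e. that a link exists when
// t = pow(d, alpha) / tmp in the question's notation.
double link_probability(double t) {
    assert(t >= 0.0);
    return std::exp(-0.5 * t);
}
```

With `link_probability`, a link could also be decided by comparing a single uniform draw against the probability, which is one way to read the "table lookup" idea.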

Finally, if the probability that two nodes are connected becomes very small once the distance exceeds some value, then you don't need the full O(n^2) loop; you could probably evaluate only those node pairs whose distance is below some limit.
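One common way to enumerate only nearby pairs is a uniform grid: bin the points into square cells of side r_max and compare each point only against points in its own and the eight adjacent cells. The sketch below is an illustration, not part of the original code; `pairs_within` and the cutoff `r_max` are hypothetical, and Euclidean distance is assumed. Note that h_sq has no upper bound, so cutting off at r_max changes the model slightly: links longer than r_max become impossible rather than merely very unlikely.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;

// Hypothetical helper: returns all pairs (i, j), i < j, whose Euclidean
// distance is at most r_max, using a uniform grid with cell side r_max.
std::vector<std::pair<std::size_t, std::size_t>>
pairs_within(const std::vector<Point>& xy, double r_max) {
    assert(r_max > 0.0);
    using Cell = std::pair<long, long>;
    std::map<Cell, std::vector<std::size_t>> grid;
    auto cell_of = [r_max](const Point& p) {
        return Cell(static_cast<long>(std::floor(p.first / r_max)),
                    static_cast<long>(std::floor(p.second / r_max)));
    };
    for (std::size_t i = 0; i < xy.size(); ++i) grid[cell_of(xy[i])].push_back(i);

    std::vector<std::pair<std::size_t, std::size_t>> out;
    for (std::size_t i = 0; i < xy.size(); ++i) {
        const Cell c = cell_of(xy[i]);
        // Points within r_max can differ by at most one cell per axis.
        for (long dx = -1; dx <= 1; ++dx) {
            for (long dy = -1; dy <= 1; ++dy) {
                auto it = grid.find(Cell(c.first + dx, c.second + dy));
                if (it == grid.end()) continue;
                for (std::size_t j : it->second) {
                    if (j <= i) continue;  // report each pair once
                    const double d = std::hypot(xy[i].first - xy[j].first,
                                                xy[i].second - xy[j].second);
                    if (d <= r_max) out.emplace_back(i, j);
                }
            }
        }
    }
    return out;
}
```

The random link test from the question would then run only over the returned pairs instead of over all i < j.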

As a first step, you should try using reserve for both the inner and outer vectors.
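A minimal sketch of the reserve step for the neighbor lists, assuming you have a rough estimate of the average node degree; `make_neighbor_lists` and `expected_degree` are hypothetical names, and `std::size_t` stands in for `luint`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds n empty adjacency lists, each with capacity
// for `expected_degree` neighbors, so the first push_back calls on a list
// do not trigger a reallocation.
std::vector<std::vector<std::size_t>>
make_neighbor_lists(std::size_t n, std::size_t expected_degree) {
    // A node in a simple graph has at most n - 1 neighbors.
    assert(n == 0 || expected_degree < n);
    std::vector<std::vector<std::size_t>> neighbors(n);
    for (auto& list : neighbors) list.reserve(expected_degree);
    return neighbors;
}
```

A list that outgrows its reserved capacity still reallocates as usual, so a somewhat generous estimate helps. This still performs one allocation per node; it only removes the repeated regrowth.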

If this does not bring performance up to your expectations, I believe it is because of the memory allocations that are still happening.

There is a handy class I've used in similar situations, llvm::SmallVector (search for it on Google). It provides a vector with a few pre-allocated items, so you can reduce the number of allocations by one per vector. It can still grow when it runs out of pre-allocated space.

So: 1) Examine how many items your vectors hold on average during runs (I'm talking about both the inner and outer vectors). 2) Put in llvm::SmallVector with a pre-allocation of that size (the pre-allocated items live inside the vector object itself, so if the vector is on the stack you might need to increase the stack size, or reduce the pre-allocation if you are restricted on available stack memory).

Another good thing about SmallVector is that its interface is almost the same as std::vector's, so it can easily be dropped in as a replacement.
