Kruskal's algorithm and disjoint-set data structure: Do I need the following two lines of code?
I've implemented Kruskal's algorithm in C++ using the disjoint-set data structure according to Wikipedia like this:
#include <stdio.h>
#include <algorithm>
#define MAX_EDGES 10000000
#define MAX_VERTICES 200001
using namespace std;
int num_edges,num_vertices;
int total_cost=0;
struct edge{
int v1,v2;
int cost;
};
struct comp{
bool operator()(const edge& e1,const edge& e2){
retu开发者_如何学JAVArn e1.cost<e2.cost;
}
};
edge edges[MAX_EDGES];
int parent[MAX_VERTICES];
int rank[MAX_VERTICES];
int findset(int x){
if(x!=parent[x]){
parent[x]=findset(parent[x]);
}
return parent[x];
}
void merge(int x,int y){
int px=findset(x),py=findset(y);
if(rank[px]>rank[py]){
parent[py]=px;
}else{
parent[px]=py;
}
if(rank[px]==rank[py]){
++rank[py];
}
}
int main(){
FILE* in=fopen("input","r");
FILE* out=fopen("output","w");
fscanf(in,"%d %d\n",&num_vertices,&num_edges);
for(int i=1;i<=num_vertices;++i){
parent[i]=i;
rank[i]=0;
}
for(int i=0;i<num_edges;++i){
fscanf(in,"%d %d %d\n",&edges[i].v1,&edges[i].v2,&edges[i].cost);
}
sort(edges,edges+num_edges,comp());
for(int i=0;i<num_edges;++i){
int s1=findset(edges[i].v1),s2=findset(edges[i].v2);
if(s1!=s2){
merge(s1,s2);
total_cost+=edges[i].cost;
}
}
fprintf(out,"%d\n",total_cost);
}
My question is: Do I need these two lines of code? If so, what's their importance?
int px=findset(x),py=findset(y);
in merge instead ofint px=parent[x],py=parent[y];
parent[x]=findset(parent[x]);
in findset instead ofreturn findset(parent[x]);
1) findset(x)
returns the canonical representative of the set that x is in (the root of its ancestry tree). You need this to be able to compare whether two elements are in the same set or not (they have the same representative), parent[x]
just returns the parent of x in the tree, which may not be the root.
1a) You forgot to test for px and py being identical in merge
.
2) It's an optimization so that future calls to findset
will run faster. If parent[x]
used to point to its parent which pointed to the root of its set's tree, after this call parent[x]
will point directly to the root.
- You need this to because
x.parent
is not necessarily the representative of the class to whichx
belongs, so the algorithm wouldn't be correct without it. - Without the assignment, the algorithm would be suboptimal. This is the path compression optimization, also discussed in the Wikipedia.
精彩评论