开发者

How Random Walk works on a Graph??? and Why people use it?

I am a Phd student working in the area of graph mining. People have used the concept of random walk inside the graph while traversing and calculating the similarities among the nodes in the graph. Can anyone tell me that how ra开发者_如何学JAVAndom walk works on the graph? Specially, when it is utilized to measure any two arbitrary nodes/vertices in the graph...??? waiting for effective and informative reply... :roll:


Roughly speaking, if there are many possible paths between two nodes, then a random walk will be more likely between those two nodes, when compared to another pair of nodes with few possible paths between them. In this sense, the probability of a random walk between two nodes extends the similarity relationship to nodes that are not connected in the graph.

Two aspects are rather important. First, one usually considers a specific random graph, namely the graph obtained by normalising all edge (arc) weights outgoing from a node to sum to one. There are also approaches where some sampling procedure is performed using the original edge weights, but I find the explicity construction more useful. This results in a Markov graph which can just be thought of as a Markov matrix. Second, this normalisation approach changes the meaning of edge weights, in the sense that outliers can suddenly become close to other nodes. That is, if node A is connected with similarities 10 and 40 to nodes B and C (and no other nodes), and node Z is connected with similarities 1 and 4 to nodes B and C (and no other nodes), then both A and Z will end up with transition probabilities 0.2 and 0.8 to B and C respectively. One has to be careful with this.

An advantage of this approach is that edge weights are naturally accounted for; higher edge weights will translate to higher probabilities, and probabilities of walks of length longer than one fall out very naturally as multiplication of Markov matrices. In comparison, the total number of paths between two nodes (or a weighted version of this), computed without the random walk normalisation step, can blow up very quickly and be hugely skewed by local variations in edge or triangle density.

One algorithm that uses such a formulation is the clustering algorithm MCL. Another application is random walk betweenness, that can again be applied to use clustering. Label propagation methods seem to use a similar idea.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜