how to elegantly duplicate a graph (neural network)
I have a graph (network) which consists of layers, which in turn contain nodes (neurons). I would like to write a procedure that duplicates the entire graph in the most elegant way possible -- i.e. with minimal or no overhead added to the structure of the node or layer.
Or to put it another way: the procedure itself may be complex, but the complexity should not "leak" into the structures. They should not become complex just because they are copyable.
I wrote the code in C#, so far it looks like this:
- neuron has an additional field -- copy_of, which is a pointer to the neuron it was copied from; this is my additional overhead
- neuron has a parameterless method Clone()
- neuron has a method Reconnect() -- which exchanges a connection from a "source" neuron (parameter) to a "target" neuron (parameter)
- layer has a parameterless method Clone() -- it simply calls Clone() for all its neurons
- network has a parameterless method Clone() -- it calls Clone() for every layer, then iterates over all neurons, builds the neuron=>copy_of mappings, and calls Reconnect() to exchange all the "wiring"
I hope my approach is clear. The question is -- is there a more elegant method? In particular, I don't like keeping an extra pointer in the neuron class just for the case of being copied. I would like to gather that data in one place (the network's Clone) and then dispose of it completely (the Clone method cannot take an argument, though).
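For what it's worth, the mapping you want can live entirely inside the network's Clone: build a dictionary from original neuron to copy, then rewire the copies through it, so no copy_of field is needed on the neuron. A minimal sketch of that idea (hypothetical class names, Python used for brevity even though the original is C#):

```python
class Neuron:
    def __init__(self):
        self.inputs = []  # neurons this neuron receives connections from

class Layer:
    def __init__(self, neurons):
        self.neurons = neurons

class Network:
    def __init__(self, layers):
        self.layers = layers

    def clone(self):
        # The original->copy mapping is local to this method,
        # so Neuron needs no copy_of field.
        mapping = {}
        for layer in self.layers:
            for neuron in layer.neurons:
                mapping[neuron] = Neuron()
        # Rewire every copy through the mapping.
        for neuron, copy in mapping.items():
            copy.inputs = [mapping[src] for src in neuron.inputs]
        return Network([Layer([mapping[n] for n in layer.neurons])
                        for layer in self.layers])
```

The same shape translates directly to C# with a `Dictionary<Neuron, Neuron>` declared inside `Network.Clone()`.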
Use a hash table for copying a general graph:
h = new HashTable()
def copyAll(node):
    if h has key node: return h[node]
    copy = node.copy()
    h[node] = copy
    for each successor of node:
        copy.addSuccessor(copyAll(successor))
    return copy
Your particular graph seems to be acyclic with special structure so you don't need a hash table (you can use an array instead) and the approach you are describing seems to be the best way to copy it.
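A runnable Python version of the pseudocode above might look like this (the `Node` class is a made-up stand-in for your neuron type):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.successors = []

def copy_all(node, h=None):
    # h maps each original node to its copy, shared across the recursion.
    if h is None:
        h = {}
    if node in h:          # already copied: handles shared and cyclic references
        return h[node]
    copy = Node(node.value)
    h[node] = copy         # register before recursing so cycles terminate
    for succ in node.successors:
        copy.successors.append(copy_all(succ, h))
    return copy
```

Registering the copy in the table before recursing is what makes this safe even for cyclic graphs; for your layered acyclic network it simply deduplicates shared neurons.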
If you are writing a neural network, you should just use vectors and matrices of floats to represent the neurons. It may seem less elegant now, but trust me, it's much more elegant (and several orders of magnitude faster, too).
Consider a neural network with 2 layers: the input (n nodes) and the output (m nodes). Now suppose we have a vector of floats called in that represents the values of the input layer, and we want to compute a vector called out that represents the values of the output layer. The neural network itself consists of an m by n matrix M of floats, where M[j][i] represents how strong the connection between input node i and output node j is. The beauty is that evaluating the network is the same as a matrix multiplication followed by applying the activation function to every element of the result vector:
out = f(M*in)
where f is the activation function and * is matrix multiplication. This is neural network evaluation in one line! You cannot get it this elegant with an OO design of a neural network.
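As a concrete illustration (NumPy, with a made-up weight matrix and input, and the logistic sigmoid as the activation function):

```python
import numpy as np

def f(x):
    # Activation function: the logistic sigmoid, applied elementwise.
    return 1.0 / (1.0 + np.exp(-x))

# 3 input nodes, 2 output nodes, so M is 2x3 (rows = outputs, columns = inputs).
M = np.array([[0.5, -0.2,  0.1],
              [0.3,  0.8, -0.5]])
in_ = np.array([1.0, 0.0, 1.0])   # values of the input layer

out = f(M @ in_)   # the whole layer evaluated in one line
print(out.shape)   # (2,)
```

Stacking more layers is just repeating the same line with each layer's matrix, which is why the vector/matrix representation stays this compact.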