Problem with simple artificial neural network -- adding

I am trying to make a simple artificial neural network work with the backpropagation algorithm. I have created an ANN and I believe I have implemented the BP algorithm correctly, but I may of course be wrong.

Right now, I am trying to train the network by giving it two random numbers (a, b) between 0 and 0.5, and having it add them. Each output the network produces is compared to the true answer a + b (which, since a + b &lt; 1, is always reachable by the sigmoid function).

Strangely, the output always converges to a number between 0 and 1 (as it must, because of the sigmoid function), but the random numbers I'm putting in seem to have no effect on it.

Edit: Sorry, it appears it doesn't converge. Here is an image of the output:

[Image: plot of |output − target| over training iterations; the error does not converge]

The weights are randomly distributed between -1 and 1, but I have also tried between 0 and 1.

I also tried giving it two constant numbers (0.35,0.9) and trying to train it to spit out 0.5. This works and converges very fast to 0.5. I have also trained it to spit out 0.5 if I give it any two random numbers between 0 and 1, and this also works.

If instead, my target is:

vector<double> target;
target.push_back(.5);

Then it converges very quickly, even with random inputs:

[Image: plot of |output − target| over training iterations, converging quickly to the constant target 0.5]

I have tried a couple different networks, since I made it very easy to add layers to my network. The standard one I am using is one with two inputs, one layer of 2 neurons, and a second layer of only one neuron (the output neuron). However, I have also tried adding a few layers, and adding neurons to them. It doesn't seem to change anything. My learning rate is equal to 1.0, though I tried it equal to 0.5 and it wasn't much different.

Does anyone have any idea of anything I could try?

Is this even something an ANN is capable of? I can't imagine it wouldn't be, since they can be trained to do such complicated things.

Any advice? Thanks!

Here is where I train it:

//Initialize it. This will be one with 2 layers, the first having 2 Neurons and the second (output layer) having 1.
vector<int> networkSize;
networkSize.push_back(2);
networkSize.push_back(1);
NeuralNetwork myNet(networkSize,2);

for(int i = 0; i<5000; i++){
    double a = randSmallNum();
    double b = randSmallNum();
    cout << "\n\n\nInputs: " << a << ", " << b << " with expected target: " << a + b;

    vector<double> myInput;
    myInput.push_back(a);   
    myInput.push_back(b);   

    vector<double> target;
    target.push_back(a + b);

    cout << endl << "Iteration " << i;
    vector<double> output = myNet.backPropagate(myInput,target);
    cout << "Output gotten: " << output[0];
    resultPlot << i << "\t" << fabs(output[0] - target[0]) << endl; //fabs, not abs: abs truncates doubles to ints
}

Edit: I set up my network and have been following this guide: A pdf. I implemented "Worked example 3.1" and got the exact same results they did, so I think my implementation is correct, at least as far as theirs is.


As @macs states, the maximum output of standard sigmoid is 1, so, if you try to add n numbers from [0, 1], then your target should be normalized, i.e. sum(A1, A2, ..., An) / n.


In a model such as this, the sigmoid function (both in the output and in the intermediate layers) is used mainly to produce something that resembles a 0/1 toggle while still being a continuous function, so using it to represent a range of numbers is not what this kind of network is designed for. It is designed mostly with classification problems in mind. There are, of course, other NN models that can do that sort of thing (for example, dropping the sigmoid on the output neuron and keeping it as a plain sum of its inputs).

If you can redefine your model in terms of classifying the input, you'll probably get better results.

Some examples of similar tasks for which the network will be more suitable:

  1. Test whether the output is bigger or smaller than a certain constant - this should be very easy.
  2. A series of outputs, each representing a different potential value (for example, one output for each of the values between 0 and 10, one for 'more than 10', and one for 'less than 0'). You will want your network to round the result to the nearest integer.
  3. A trickier one would be to create a boolean (binary) representation of the output by using multiple output nodes.

None of these will give you the precision that you may want, though, since by nature NNs are more 'fuzzy'.
