Where does the verification data go when training an ANN?

The need to set aside part of the training set as verification data is straightforward, but I am not really clear on how, and at what stage of training, it should be incorporated.

Is it at the end of the training (after reaching a good minimum for the training data)? If so, what should be done if the verification data yields a big error?

Is it throughout the training (keep looking for a minimum while errors for both the training and verification data aren't satisfactory)?

No matter what I try, the network seems to have trouble learning both the training and verification data once the verification set reaches a certain size (I recall reading somewhere that 70% training, 30% verification is a common ratio; I get stuck at a much smaller one), while it has no problem learning the same data when all of it is used for training.


The important thing is that your verification set must have no feedback on the training. You can plot the error rate on the verification set, but the training algorithm can only use the error rate on the training set to correct itself.
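As a rough illustration of keeping the two sets apart, here is a minimal sketch of a shuffled hold-out split. The function name `split_train_validation`, its parameters, and the 30% default are assumptions made for this example (the 70/30 ratio is the one mentioned in the question), not part of any particular ANN library. Only the first returned list would be passed to the training algorithm; the second is only ever evaluated.

```python
import random


def split_train_validation(samples, validation_fraction=0.3, seed=0):
    """Shuffle the samples and split them into (training, validation) lists.

    Hypothetical helper: the validation part is held out and must never be
    used to update the network's weights.
    """
    indices = list(range(len(samples)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_val = int(round(len(samples) * validation_fraction))
    val_idx = indices[:n_val]
    train_idx = indices[n_val:]
    training = [samples[i] for i in train_idx]
    validation = [samples[i] for i in val_idx]
    return training, validation


if __name__ == "__main__":
    data = list(range(10))
    training, validation = split_train_validation(data)
    print(len(training), len(validation))
```

Shuffling before splitting matters when the data is ordered (for example, by class), because otherwise the verification set may not resemble the training set at all.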


The validation data set is mostly used for early stopping.

  1. Train the network for epoch i on the training data. Let the training error be e(t, i).
  2. Evaluate the network on the validation set. Let that error be e(v, i).
  3. If e(v, i) > e(v, i-1), stop training. Else go to 1.

So it helps you see when the network overfits, which means that it models the specifics of the training data too closely. The idea is that with an ANN, you want to achieve good generalization from training data to unseen data. The validation set helps you determine the point at which the network starts to specialize too much on the training data.
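The early-stopping loop described in this answer could be sketched roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: `train_one_epoch` and `validation_error` are hypothetical callables standing in for whatever your ANN library provides, and the `patience` parameter is an addition not mentioned in the answer (with `patience=0` the loop stops the first time the validation error fails to improve).

```python
def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs=1000, patience=0):
    """Train until the validation error stops improving.

    train_one_epoch():   hypothetical callable; updates the network using
                         the training set only.
    validation_error():  hypothetical callable; returns the current error on
                         the held-out set without changing the network.
    patience:            number of consecutive non-improving epochs to
                         tolerate before stopping (an assumption of this
                         sketch, not part of the original steps).
    Returns (best_epoch, best_error, history of validation errors).
    """
    best_error = float("inf")
    best_epoch = -1
    bad_epochs = 0
    history = []
    for epoch in range(max_epochs):
        train_one_epoch()            # weights change here, training data only
        err = validation_error()     # measured only, never fed back
        history.append(err)
        if err < best_error:
            best_error = err
            best_epoch = epoch
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs > patience:
                break                # validation error stopped improving
    return best_epoch, best_error, history


if __name__ == "__main__":
    # Simulated validation errors instead of a real network.
    simulated = iter([0.9, 0.7, 0.5, 0.6, 0.4])
    result = train_with_early_stopping(lambda: None, lambda: next(simulated),
                                       max_epochs=5)
    print(result)
```

In practice you would also save a copy of the weights at `best_epoch` and restore them after the loop ends, since the last epoch trained is by construction not the best one. A small nonzero `patience` is a common way to avoid stopping on a single noisy bump in the validation error.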


That means overtraining. I advise checking the verification set's MSE during training; see the Overtraining Caution System of FannTool: http://fanntool.googlecode.com/files/FannTool_Users_Guide.zip
