I have an AI that is good at playing Connect 4 (using minimax). Now I want to use some machine learning algorithm to learn from this AI that I have, and I would like to do that by just letting them pl
I just finished writing some code that runs a Hebbian learning feedforward neural network. I've done a backpropagation neural network before and the first thing I did to make sure it w
I am planning to use neural networks for approximating a value function in a reinforcement learning algorithm. I want to do that to introduce some generalization and flexibility on how I represent sta
What's the appropriate way to update your R(s) function during Q-learning? For example, say an agent visits state s1 five times, and receives rewards [0,0,1,1,0]. Should I calcula
I am doing my Master's project on robotic sensorimotor online learning using reinforcement learning methods (Q, SARSA, TD(λ), Actor-Critic, R, etc.). I am currently designing the framework on which both
I'm currently trying to get an ANN to play a video game and I was hoping to get some help from the wonderful community here.
I am having trouble understanding the SARSA algorithm: http://en.wikipedia.org/wiki/SARSA In particular, when updating the Q value, what is gamma? And what values are used fo
I've started toying with reinforcement learning (using the Sutton book). What I fail to fully understand is the paradox between having to reduce the Markov state space while on the other hand not making a
I have an artificial neural network which plays Tic-Tac-Toe, but it is not complete yet. What I have so far: