Let's imagine we have an (x, y) plane where a robot can move. Now we define the middle of our world as the goal state, which means that we are going to give a reward of 100 to our robot once it reaches it.
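A minimal tabular sketch of that setup, assuming a 5x5 grid with the goal at its centre (2, 2); the grid size and the ALPHA/GAMMA hyperparameters are illustrative choices, not from the question:

```python
# Tabular Q-learning on a small (x, y) grid whose centre is the goal.
# GRID, GOAL, ALPHA, GAMMA are assumed values for the sketch.
GRID = 5
GOAL = (2, 2)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left
ALPHA, GAMMA = 0.5, 0.9

Q = {((x, y), a): 0.0
     for x in range(GRID) for y in range(GRID) for a in ACTIONS}

def step(state, action):
    # Clamp moves to the grid; reaching the goal yields reward 100.
    x = min(max(state[0] + action[0], 0), GRID - 1)
    y = min(max(state[1] + action[1], 0), GRID - 1)
    nxt = (x, y)
    return nxt, (100.0 if nxt == GOAL else 0.0)

def update(state, action):
    # One Q-learning backup: Q <- Q + alpha * (r + gamma * max Q' - Q).
    nxt, reward = step(state, action)
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    return nxt
```

A single `update()` from the cell just left of the goal already moves `ALPHA * 100` of the reward into the table; repeated episodes then propagate it outward.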
What difference does a big or small gamma value make to the algorithm? In my view, as long as it is neither 0 nor 1, it should work exactly the same.
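It does make a difference: gamma is the discount on future reward, so a reward received n steps in the future is worth gamma**n times its face value today. A small numeric sketch (the step count and gamma values are arbitrary):

```python
# gamma controls how far ahead the agent "sees": the present value of a
# reward received n steps in the future is gamma**n * reward, so two
# gammas strictly between 0 and 1 still rank distant rewards very differently.
def discounted_value(reward, steps, gamma):
    return gamma ** steps * reward

near_sighted = discounted_value(100, 10, 0.5)   # about 0.1
far_sighted = discounted_value(100, 10, 0.95)   # about 59.9
```

With gamma = 0.5 a reward 10 steps away is almost invisible, so the agent effectively only plans a few steps ahead; with gamma = 0.95 the same reward still dominates the value function.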
Let's assume we're in a room where our agent can move along the x and y axes. At each point it can move up, down, right, and left. So our state space can be defined by (x, y), and our actions at each state are those four moves.
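One way to encode that state and action space in code; the `Action` enum and `move()` helper are illustrative names made up for the sketch, not an established API:

```python
from enum import Enum

class Action(Enum):
    # Each action is a (dx, dy) displacement applied to the (x, y) state.
    UP = (0, 1)
    DOWN = (0, -1)
    RIGHT = (1, 0)
    LEFT = (-1, 0)

def move(state, action):
    # Apply an action to a state; boundary handling would go here.
    dx, dy = action.value
    return (state[0] + dx, state[1] + dy)
```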
I do know that feedforward multi-layer neural networks with backpropagation are used with Reinforcement Learning to help the agent generalize its actions. That is, if we have a big state space, we cannot store a separate value for every state-action pair in a table.
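As a sketch of that idea, here is a one-hidden-layer network trained with plain backpropagation to approximate a value function over continuous (x, y) inputs; the architecture, the toy target function, and every hyperparameter are assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.5, (2, 16)); b1 = np.zeros(16)   # input -> hidden
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)    # hidden -> output

def forward(X):
    h = np.tanh(X @ W1 + b1)          # hidden activations
    return h, h @ W2 + b2             # predicted value per state

# Toy target: value decays with distance from a goal at the origin.
X = rng.uniform(-1.0, 1.0, (256, 2))
y = np.exp(-np.linalg.norm(X, axis=1, keepdims=True))

init_mse = float(np.mean((forward(X)[1] - y) ** 2))
lr = 0.05
for _ in range(2000):                 # full-batch gradient descent
    h, pred = forward(X)
    err = pred - y
    gW2 = h.T @ err / len(X); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)   # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

final_mse = float(np.mean((forward(X)[1] - y) ** 2))
```

The point is that the network never stores one entry per discrete state: it produces a value for any (x, y) pair, including ones it has never been trained on, which is exactly the generalization the question is about.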
I am currently using Q-Learning to try to teach a bot how to move in a room filled with walls/obstacles. It must start in any place in the room and get to the goal state.
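A compact version of that experiment, assuming a 4x4 room with two wall cells, a step penalty, and epsilon-greedy exploration; the grid, wall positions, and hyperparameters are all invented for the sketch:

```python
import random

random.seed(0)

SIZE, GOAL = 4, (3, 3)
WALLS = {(1, 1), (2, 1)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

Q = {}

def q(s, a):
    return Q.get((s, a), 0.0)

def step(s, a):
    # Moves into walls or off the grid leave the agent where it is.
    nxt = (s[0] + a[0], s[1] + a[1])
    if nxt in WALLS or not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE):
        nxt = s
    return nxt, (100.0 if nxt == GOAL else -1.0)

def train(episodes=2000):
    # Episodes start from any free non-goal cell, as in the question.
    cells = [(x, y) for x in range(SIZE) for y in range(SIZE)
             if (x, y) not in WALLS and (x, y) != GOAL]
    for _ in range(episodes):
        s = random.choice(cells)
        for _ in range(50):
            a = (random.choice(ACTIONS) if random.random() < EPS
                 else max(ACTIONS, key=lambda b: q(s, b)))
            nxt, r = step(s, a)
            Q[(s, a)] = q(s, a) + ALPHA * (
                r + GAMMA * max(q(nxt, b) for b in ACTIONS) - q(s, a))
            s = nxt
            if s == GOAL:
                break

def greedy_path(s, limit=20):
    # Follow the learned policy greedily from a start cell.
    path = [s]
    for _ in range(limit):
        if s == GOAL:
            break
        s, _ = step(s, max(ACTIONS, key=lambda b: q(s, b)))
        path.append(s)
    return path

train()
```

After training, following the greedy policy from any free cell should route around the walls to the goal; the -1 step reward is what pushes the agent toward short paths rather than just any path.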
I have to do some work with Q-Learning, about a guy who has to move furniture around a house (it's basically that). If the house is small enough, I can just have a matrix that represents actions/rewards.
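A quick way to see when that matrix stops being feasible is to count states: with `cells` free positions and `pieces` distinguishable furniture items, there are cells!/(cells - pieces)! possible placements. The concrete numbers below are illustrative:

```python
from math import factorial

def placements(cells, pieces):
    # Number of ways to put `pieces` distinguishable items on `cells`
    # distinct positions, at most one item per position.
    return factorial(cells) // factorial(cells - pieces)

small_house = placements(10, 3)    # 720 states: a state/action matrix is fine
big_house = placements(100, 10)    # far beyond 10**18: a table no longer fits
```

This is the usual motivation for moving from a tabular representation to function approximation once the house (state space) grows.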