2,587 questions
158 votes, 8 answers, 93k views
What is the difference between Q-learning and SARSA?
Although I know that SARSA is on-policy while Q-learning is off-policy, when looking at their formulas it's hard (for me) to see any difference between these two algorithms.
According to the book ...
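A minimal sketch of how the two update rules differ, assuming a tabular Q stored in a dictionary. The function names, state and action labels, and numeric values are hypothetical and not taken from the question. The only difference is the bootstrap term: SARSA uses the action the agent actually takes next, while Q-learning uses the greedy action in the next state.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy: bootstrap from a_next, the action actually chosen in s_next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Off-policy: bootstrap from the best action in s_next, whatever is taken next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Hypothetical toy values: in state "s1", "right" looks much better than "left".
actions = ["left", "right"]
Q_sarsa = defaultdict(float)
Q_qlearn = defaultdict(float)
for Q in (Q_sarsa, Q_qlearn):
    Q[("s1", "left")] = 1.0
    Q[("s1", "right")] = 5.0

# Same transition for both; the behaviour policy happens to explore "left" next.
sarsa_update(Q_sarsa, "s0", "right", 0.0, "s1", "left", alpha=0.5, gamma=1.0)
q_learning_update(Q_qlearn, "s0", "right", 0.0, "s1", actions, alpha=0.5, gamma=1.0)

print(Q_sarsa[("s0", "right")], Q_qlearn[("s0", "right")])
```

With an exploratory next action, SARSA's estimate reflects the value of the action that was really taken, whereas Q-learning's estimate reflects the greedy one. When the behaviour policy is itself greedy, the two updates coincide.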
149 votes, 6 answers, 42k views
How to train an artificial neural network to play Diablo 2 using visual input? [closed]
I'm currently trying to get an ANN to play a video game and I was hoping to get some help from the wonderful community here.
I've settled on Diablo 2. Game play is thus in real-time and from an ...
147 votes, 5 answers, 118k views
What is the difference between value iteration and policy iteration? [closed]
In reinforcement learning, what is the difference between policy iteration and value iteration?
As far as I understand, in value iteration, you use the Bellman equation to solve for the optimal ...
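A rough sketch of the two procedures on a tiny MDP, assuming a known model. The two-state MDP, the discount factor, and all function names below are made up for illustration. Value iteration repeatedly applies the Bellman optimality backup to the value function. Policy iteration alternates between evaluating a fixed policy and improving it greedily.

```python
# Hypothetical 2-state MDP: P[s][a] = [(prob, next_state, reward), ...]
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
GAMMA = 0.5  # assumed discount factor


def q_value(V, s, a):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def greedy_policy(V):
    """Pick, in every state, the action with the highest one-step lookahead."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


def value_iteration(tol=1e-10):
    V = {s: 0.0 for s in P}
    while True:
        # Bellman optimality backup: max over actions in every sweep.
        new_V = {s: max(q_value(V, s, a) for a in P[s]) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V, greedy_policy(new_V)
        V = new_V


def policy_iteration(tol=1e-10):
    policy = {s: "stay" for s in P}  # arbitrary starting policy
    while True:
        # Policy evaluation: Bellman expectation backup for the fixed policy.
        V = {s: 0.0 for s in P}
        while True:
            new_V = {s: q_value(V, s, policy[s]) for s in P}
            delta = max(abs(new_V[s] - V[s]) for s in P)
            V = new_V
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the evaluated V.
        new_policy = greedy_policy(V)
        if new_policy == policy:
            return V, policy
        policy = new_policy


v_vi, pi_vi = value_iteration()
v_pi, pi_pi = policy_iteration()
print(pi_vi, pi_pi)
```

On this toy problem both methods end up with the same policy. The difference is in how they get there: value iteration never stores an explicit policy until the end, while policy iteration keeps one throughout and fully evaluates it before each improvement step.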
68 votes, 2 answers, 32k views
Training a Neural Network with Reinforcement learning
I know the basics of feedforward neural networks, and how to train them using the backpropagation algorithm, but I'm looking for an algorithm that I can use for training an ANN online with ...
59 votes, 4 answers, 50k views
How to understand the Proximal Policy Optimization algorithm in RL?
I know the basics of Reinforcement Learning, but which terms do I need to understand to be able to read the arXiv PPO paper?
What is the roadmap to learn and use PPO?
55 votes, 6 answers, 54k views
How can I apply reinforcement learning to continuous action spaces?
I'm trying to get an agent to learn the mouse movements necessary to best perform some task in a reinforcement learning setting (i.e. the reward signal is the only feedback for learning).
I'm hoping ...
47 votes, 3 answers, 47k views
What is a policy in reinforcement learning? [closed]
I've seen such words as:
A policy defines the learning agent's way of behaving at a given time. Roughly
speaking, a policy is a mapping from perceived states of the environment to actions to be ...
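To make the "mapping from states to actions" idea concrete, here is a small sketch. The state names, action names, probabilities, and the `act` helper are all hypothetical. A deterministic policy can be a plain lookup table, while a stochastic policy maps each state to a probability distribution over actions and samples from it.

```python
import random

# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"low_battery": "recharge", "high_battery": "search"}

# Stochastic policy: each state maps to a distribution over actions.
stochastic_policy = {
    "low_battery": {"recharge": 0.9, "search": 0.1},
    "high_battery": {"recharge": 0.2, "search": 0.8},
}


def act(policy, state, rng=random):
    """Return the action the policy selects in the given state."""
    choice = policy[state]
    if isinstance(choice, dict):
        # Stochastic case: sample an action according to its probability.
        actions, probs = zip(*choice.items())
        return rng.choices(actions, weights=probs, k=1)[0]
    # Deterministic case: the table entry is the action itself.
    return choice


print(act(deterministic_policy, "low_battery"))
print(act(stochastic_policy, "high_battery", random.Random(0)))
```

In larger problems the table is usually replaced by a parameterised function such as a neural network, but the role is the same: given a state, produce an action or a distribution over actions.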
43 votes, 3 answers, 37k views
What is the difference between Q-learning and Value Iteration?
How is Q-learning different from value iteration in reinforcement learning?
I know Q-learning is model-free and training samples are transitions (s, a, s', r). But since we know the transitions and ...
38 votes, 1 answer, 39k views
OpenAI Gym: Understanding `action_space` notation (spaces.Box)
I want to set up an RL agent on the OpenAI CarRacing-v0 environment, but before that I want to understand the action space. In the code on GitHub, line 119 says:
self.action_space = spaces.Box( np....
37 votes, 2 answers, 20k views
What is the difference between reinforcement learning and deep RL? [closed]
What is the difference between deep reinforcement learning and reinforcement learning? I basically know what reinforcement learning is about, but what does the concrete term deep stand for in this ...
34 votes, 5 answers, 15k views
When should I use support vector machines as opposed to artificial neural networks?
I know SVMs are supposedly 'ANN killers' in that they automatically select representation complexity and find a global optimum (see here for some SVM-praising quotes).
But here is where I'm unclear --...
34 votes, 4 answers, 22k views
OpenAI Gym environment for multi-agent games
Is it possible to use OpenAI's Gym environments for multi-agent games? Specifically, I would like to model a card game with four players (agents). The player scoring a turn starts the next turn. How ...
32 votes, 1 answer, 38k views
PyTorch ValueError: optimizer got an empty parameter list
When trying to create a neural network and optimize it using PyTorch, I am getting
ValueError: optimizer got an empty parameter list
Here is the code.
import torch.nn as nn
import torch.nn....
30 votes, 2 answers, 36k views
TensorFlow and Multiprocessing: Passing Sessions
I have recently been working on a project that uses a neural network for virtual robot control. I used TensorFlow to code it up and it runs smoothly. So far, I used sequential simulations to evaluate ...
27 votes, 6 answers, 38k views
NameError: name 'base' is not defined OpenAI Gym
[Note that I am using xvfb-run -s "-screen 0 1400x900x24" jupyter notebook]
I try to run a basic set of commands in OpenAI Gym
import gym
env = gym.make("CartPole-v0")
obs = env.reset()
env.render()
...