2 votes
0 answers
204 views

Could you explain to me what is wrong in this code? I am trying to implement SARSA(lambda) with eligibility traces. using ReinforcementLearningBase, GridWorlds using PyPlot world = GridWorlds....
przel123
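The question above concerns SARSA(λ) with eligibility traces. For reference, a minimal tabular sketch of one SARSA(λ) step with accumulating traces (written in Python rather than the question's Julia; the table shapes, function name, and hyperparameter defaults are my own assumptions, not taken from the question's code):

```python
import numpy as np

def sarsa_lambda_update(Q, E, s, a, r, s_next, a_next,
                        alpha=0.1, gamma=0.99, lam=0.9):
    """One tabular SARSA(lambda) step with accumulating traces.

    Q: (n_states, n_actions) action-value table (updated in place)
    E: (n_states, n_actions) eligibility-trace table (updated in place)
    """
    # TD error for the observed transition (S, A, R, S', A')
    delta = r + gamma * Q[s_next, a_next] - Q[s, a]
    # Accumulating trace: bump the visited pair...
    E[s, a] += 1.0
    # ...then every traced pair shares in the update,
    Q += alpha * delta * E
    # ...and all traces fade geometrically.
    E *= gamma * lam
    return delta
```

The key point is that the single TD error `delta` is broadcast to every state-action pair in proportion to its trace, which is what lets credit flow back along the recent trajectory.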
0 votes
1 answer
294 views

I have a deep SARSA algorithm which works great in PyTorch on lunar-lander-v2, and I would like to use it with Keras/TensorFlow. It uses mini-batches of size 64 which are used 128 times to train at each episode. There ...
rdpdo • 33
0 votes
1 answer
274 views

I am trying to implement a custom lunar lander environment by taking help from the already existing LunarLander-v2. https://github.com/openai/gym/blob/master/gym/envs/box2d/lunar_lander.py I'm having a ...
Shan • 1
1 vote
1 answer
1k views

I am solving the frozen lake game using Q-Learning and SARSA algorithms. I have the code implementation of the Q-Learning algorithm and that works. This code was taken from Chapter 5 of "Deep ...
ronanwa • 11
0 votes
1 answer
98 views

I am implementing a SARSA reinforcement learning function which chooses an action following the current policy and updates its Q-values. This throws the following error: TypeError: only size-1 ...
matheo-es
0 votes
1 answer
658 views

I am trying to learn the concepts of reinforcement learning at the moment. To that end, I tried to implement the SARSA algorithm for the cart-pole example using TensorFlow. I compared my algorithm to algorithms ...
Ralf • 73
1 vote
0 answers
71 views

I'm pretty new to Unity and Accord.Net but I'm currently making a small game in Unity and decided to see what I could do with some reinforcement learning to make it more interesting. Everything has ...
earlyLo • 11
2 votes
0 answers
508 views

I'm studying Reinforcement Learning and I'm facing a problem understanding the difference between SARSA, Q-Learning, Expected SARSA, Double Q-Learning and temporal difference. Can you please explain ...
Cooper • 25
0 votes
1 answer
337 views

My problem is the following. I have a simple grid world: https://i.sstatic.net/xrhJw.png The agent starts at the initial state labeled with START, and the goal is to reach the terminal state labeled ...
Genesist
1 vote
1 answer
415 views

I am reading Silver et al (2012) "Temporal-Difference Search in Computer Go", and trying to understand the update order for the eligibility trace algorithm. In Algorithms 1 and 2 of the paper, ...
Kota Mori • 6,762
2 votes
1 answer
3k views

I have a question about my own project for testing reinforcement learning techniques. First let me explain the purpose. I have an agent which can take 4 actions during 8 steps. At the end of this ...
T.L • 21
3 votes
1 answer
510 views

I have a question about this SARSA FA. In input cell 142 I see this modified update w += alpha * (reward - discount * q_hat_next) * q_hat_grad where q_hat_next is Q(S', a') and q_hat_grad is the ...
Chuk Lee • 3,608
0 votes
1 answer
336 views

So I've used the following code to implement Q-learning in Unity: using System; using System.Collections; using System.Collections.Generic; using System.Linq; using UnityEngine; namespace QLearner { ...
user3631213
6 votes
1 answer
4k views

I think I am messing something up. I always thought that:
- 1-step TD on-policy = Sarsa
- 1-step TD off-policy = Q-learning
Thus I conclude:
- n-step TD on-policy = n-step Sarsa
- n-step TD off-...
siva • 1,583
0 votes
1 answer
82 views

What does zeta represent in the critic method? I believe it keeps track of the state-action pairs and represents eligibility traces, which are a temporary record of visited state-action pairs, but what exactly ...
anon • 560
