Problem Description
The RL code implements a Gymnasium-compatible environment (DoomEnv) for training reinforcement learning agents in DOOM Retro. It captures game state (observations) via shared memory or screen capture, sends actions to control the game, and provides rewards. The issue is that the environment depends on a running DOOM Retro process; without one it falls back to capturing the full monitor, which does not reflect actual game state. Additionally, the RL training and evaluation scripts need refinement for stability and performance.
Key challenges:
- Dependency on the DOOM Retro process.
- Shared memory integration for real-time state.
- Action space and reward shaping.
- Training convergence and evaluation metrics.

This is WIP code, so bugs or incomplete features are expected.
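Of these, the shared-memory integration is the most mechanical part. A minimal sketch of the reader side, assuming a flat little-endian layout of three int32 counters at the start of the segment (the actual layout DOOM Retro's RL patch writes to /dev/shm/doomretro_rl is not documented here and may well differ):

```python
import os
import struct

# Hypothetical layout: three little-endian int32 fields (health, ammo, kills)
# at the start of the segment, frame bytes after. The real layout may differ.
HEADER_FMT = "<iii"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 12 bytes

def read_game_state(path="/dev/shm/doomretro_rl"):
    """Return (health, ammo, kills) from the shared-memory file, or None if absent."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        return None
    return struct.unpack(HEADER_FMT, header)
```

On Linux, files under /dev/shm are plain tmpfs files, so ordinary `open`/`read` works; a production reader would likely use `mmap` to avoid re-opening the file every tick.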
Here's the core DoomEnv class (from doom_env.py). It's a Gymnasium environment that interfaces with DOOM Retro.
import gymnasium as gym
from gymnasium import spaces
import numpy as np
import time
from controller.doom_controller import DoomController
from observation.observation_builder import ObservationBuilder
from rewards.reward_manager import RewardManager
from utils.shm_reader import ShmReader
from frame_cache import FrameCache
class DoomEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(8)  # e.g. move, turn, strafe, shoot
        self.observation_space = spaces.Box(low=0, high=255, shape=(84, 84, 3), dtype=np.uint8)
        self.controller = DoomController()
        self.obs_builder = ObservationBuilder()
        self.reward_manager = RewardManager()
        self.shm_reader = ShmReader()
        self.frame_cache = FrameCache()
        self.prev_health = 100
        self.prev_ammo = 50
        self.prev_kills = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.controller.reset_game()
        # Reset the reward baselines so stale values from the previous
        # episode do not leak into the first reward of the new one.
        self.prev_health = 100
        self.prev_ammo = 50
        self.prev_kills = 0
        obs = self._get_observation()
        info = {}
        return obs, info

    def step(self, action):
        self.controller.send_action(action)
        time.sleep(0.1)  # crude frame delay; ideally tied to the game tick rate
        obs = self._get_observation()
        reward = self._calculate_reward()
        terminated = self._is_done()
        truncated = False
        info = {}
        return obs, reward, terminated, truncated, info

    def _get_observation(self):
        # Prefer shared memory; fall back to screen capture if it is unavailable.
        if self.shm_reader.is_available():
            frame = self.shm_reader.get_frame()
        else:
            frame = self.frame_cache.capture_frame()
        # Process frame (resize, grayscale, stacking, etc.)
        return self.obs_builder.build_observation(frame)

    def _calculate_reward(self):
        if self.shm_reader.is_available():
            current_health = self.shm_reader.get_health()
            current_ammo = self.shm_reader.get_ammo()
            current_kills = self.shm_reader.get_kills()
        else:
            # Without shared memory there is no real game state, so the
            # defaults below make every reward zero.
            current_health, current_ammo, current_kills = 100, 50, 0
        reward = self.reward_manager.calculate_reward(
            current_health, self.prev_health,
            current_ammo, self.prev_ammo,
            current_kills, self.prev_kills,
        )
        self.prev_health = current_health
        self.prev_ammo = current_ammo
        self.prev_kills = current_kills
        return reward

    def _is_done(self):
        if not self.shm_reader.is_available():
            return False
        return self.shm_reader.get_health() <= 0

    def close(self):
        self.controller.close()
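The internals of RewardManager are not shown; a plausible delta-based shaping function, with purely illustrative weights (not the values the project actually uses), might look like:

```python
def calculate_reward(current_health, prev_health,
                     current_ammo, prev_ammo,
                     current_kills, prev_kills):
    """Delta-based reward shaping. The weights are hypothetical placeholders."""
    reward = 0.0
    reward += (current_kills - prev_kills) * 10.0   # strong reward per kill
    reward += (current_health - prev_health) * 0.1  # penalize damage taken
    reward += (current_ammo - prev_ammo) * 0.05     # mild penalty for wasted ammo
    reward -= 0.01                                  # small living cost to encourage progress
    return reward
```

Delta-based shaping of this kind only works if the counters are read reliably every step, which is another reason the shared-memory path matters more than the screen-capture fallback.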
And the smoke test (scripts/test_env.py) whose output appears below:

import sys
from os.path import dirname, abspath

# Make the project root importable when running from scripts/.
sys.path.insert(0, dirname(dirname(abspath(__file__))))

from env.doom_env import DoomEnv

env = DoomEnv()
obs, _ = env.reset()
print(obs.shape)
obs, reward, terminated, truncated, info = env.step(0)
print("step working")
env.close()
Required Data/Setup
- DOOM Retro binary: built from the main repo (requires SDL2, CMake).
- IWAD file: e.g., freedoom1.wad or doom.wad.
- Python environment: virtualenv with gymnasium, stable-baselines3, numpy, mss, pynput, torch, pillow.
- Shared memory: DOOM Retro must be running with RL support (writes to /dev/shm/doomretro_rl).
- No external data files are needed for basic testing, but training uses trajectories/clips.
To run DOOM Retro:

    ./build/doomretro -iwad /path/to/doom.wad
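Before launching training, it can save a confusing debugging session to verify the shared-memory segment actually exists; a tiny check using the path from the setup notes above:

```python
import os

def shm_available(path="/dev/shm/doomretro_rl"):
    """True if DOOM Retro's RL shared-memory segment is present on disk."""
    return os.path.exists(path)

if shm_available():
    print("shared memory found: observations will come from DOOM Retro")
else:
    print("shared memory not found: environment will fall back to screen capture")
```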
Obtained Output
Ran python scripts/test_env.py without DOOM Retro running (fallback to monitor capture):

    Connecting to existing DOOM window...
    DOOM Retro window not found!
    WARNING: DOOM window not found. Is DOOM Retro running?
    DOOM Retro window not found!
    [frame_cache] DOOM window not found — capturing full monitor.
    (84, 84, 12)
    step working
obs.shape: (84, 84, 12) – the observation carries 12 channels, most likely four stacked RGB frames (4 × 3 channels) rather than "RGB + extras". Note that this does not match the declared observation_space shape of (84, 84, 3); strict environment checkers (e.g. Stable Baselines3's check_env) will flag the mismatch. "step working" indicates the step method executed without error.

Expected Output
With DOOM Retro running:
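If the 12 channels do come from frame stacking (an assumption; ObservationBuilder is not shown), the mechanism is simply concatenation along the channel axis:

```python
import numpy as np

def stack_frames(frames):
    """Concatenate RGB frames along the channel axis:
    four (84, 84, 3) frames -> one (84, 84, 12) observation."""
    return np.concatenate(frames, axis=-1)

frames = [np.zeros((84, 84, 3), dtype=np.uint8) for _ in range(4)]
obs = stack_frames(frames)
print(obs.shape)  # (84, 84, 12)
```

If that is the case, the declared observation_space should be widened to shape (84, 84, 12) so the space and the actual observations agree.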
- obs.shape: (84, 84, 3) or similar (processed game frame).
- Reward calculation based on health/ammo/kills changes.
- Shared memory used instead of monitor capture.
- No warnings about the DOOM window not being found.
- Full episodes run with proper terminated/truncated flags.

The environment should integrate seamlessly with RL libraries like Stable Baselines3 for training agents that play DOOM.
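The shape mismatch noted above is exactly the kind of thing that blocks that integration. A dependency-light version of the containment test a Box space performs (hypothetical helper, numpy only, mirroring what Gymnasium/SB3 check):

```python
import numpy as np

def obs_matches_space(obs, shape=(84, 84, 3), low=0, high=255, dtype=np.uint8):
    """Rough equivalent of the containment test a Box space performs."""
    return (
        obs.shape == shape
        and obs.dtype == dtype
        and np.all(obs >= low)
        and np.all(obs <= high)
    )

# A 12-channel observation fails against the declared (84, 84, 3) space:
print(obs_matches_space(np.zeros((84, 84, 12), dtype=np.uint8)))  # False
print(obs_matches_space(np.zeros((84, 84, 3), dtype=np.uint8)))   # True
```

Running `stable_baselines3.common.env_checker.check_env(env)` once the shapes agree is a quick way to confirm the environment satisfies the full Gymnasium contract before starting a long training run.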