Problem Description
The RL code implements a Gymnasium-compatible environment (DoomEnv) for training reinforcement learning agents in DOOM Retro. It captures game state (observations) via shared memory or screen capture, sends actions to control the game, and provides rewards. The issue is that the environment depends on a running DOOM Retro process; without one it falls back to capturing the full monitor, which does not reflect actual game state. Additionally, the RL training and evaluation scripts need refinement for stability and performance.
Key challenges:
- Dependency on the DOOM Retro process.
- Shared memory integration for real-time state.
- Action space and reward shaping.
- Training convergence and evaluation metrics.

This is WIP code, so bugs or incomplete features are expected.
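Of these, the shared-memory integration is the most mechanical part. A minimal sketch of the reader side, assuming a flat little-endian layout of three int32 counters at the start of the segment (the actual layout DOOM Retro's RL patch writes to /dev/shm/doomretro_rl is not documented here and may well differ):

```python
import os
import struct

# Hypothetical layout: three little-endian int32 fields (health, ammo, kills)
# at the start of the segment, frame bytes after. The real layout may differ.
HEADER_FMT = "<iii"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 12 bytes

def read_game_state(path="/dev/shm/doomretro_rl"):
    """Return (health, ammo, kills) from the shared-memory file, or None if absent."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        return None
    return struct.unpack(HEADER_FMT, header)
```

On Linux, files under /dev/shm are plain tmpfs files, so ordinary `open`/`read` works; a production reader would likely use `mmap` to avoid re-opening the file every tick.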
Here's the core DoomEnv class (from doom_env.py). It's a Gymnasium environment that interfaces with DOOM Retro.
import gymnasium as gym
from gymnasium import spaces
import numpy as np
import time
from controller.doom_controller import DoomController
from observation.observation_builder import ObservationBuilder
from rewards.reward_manager import RewardManager
from utils.shm_reader import ShmReader
from frame_cache import FrameCache
class DoomEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(8)  # e.g. move, turn, strafe, shoot
        self.observation_space = spaces.Box(low=0, high=255, shape=(84, 84, 3), dtype=np.uint8)
        self.controller = DoomController()
        self.obs_builder = ObservationBuilder()
        self.reward_manager = RewardManager()
        self.shm_reader = ShmReader()
        self.frame_cache = FrameCache()
        self.prev_health = 100
        self.prev_ammo = 50
        self.prev_kills = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.controller.reset_game()
        # Reset the reward baselines so stale values from the previous
        # episode do not leak into the first reward of the new one.
        self.prev_health = 100
        self.prev_ammo = 50
        self.prev_kills = 0
        obs = self._get_observation()
        info = {}
        return obs, info

    def step(self, action):
        self.controller.send_action(action)
        time.sleep(0.1)  # crude frame delay; ideally tied to the game tick rate
        obs = self._get_observation()
        reward = self._calculate_reward()
        terminated = self._is_done()
        truncated = False
        info = {}
        return obs, reward, terminated, truncated, info

    def _get_observation(self):
        # Prefer shared memory; fall back to screen capture if it is unavailable.
        if self.shm_reader.is_available():
            frame = self.shm_reader.get_frame()
        else:
            frame = self.frame_cache.capture_frame()
        # Process frame (resize, grayscale, stacking, etc.)
        return self.obs_builder.build_observation(frame)

    def _calculate_reward(self):
        if self.shm_reader.is_available():
            current_health = self.shm_reader.get_health()
            current_ammo = self.shm_reader.get_ammo()
            current_kills = self.shm_reader.get_kills()
        else:
            # Without shared memory there is no real game state, so the
            # defaults below make every reward zero.
            current_health, current_ammo, current_kills = 100, 50, 0
        reward = self.reward_manager.calculate_reward(
            current_health, self.prev_health,
            current_ammo, self.prev_ammo,
            current_kills, self.prev_kills,
        )
        self.prev_health = current_health
        self.prev_ammo = current_ammo
        self.prev_kills = current_kills
        return reward

    def _is_done(self):
        if not self.shm_reader.is_available():
            return False
        return self.shm_reader.get_health() <= 0

    def close(self):
        self.controller.close()
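The internals of RewardManager are not shown; a plausible delta-based shaping function, with purely illustrative weights (not the values the project actually uses), might look like:

```python
def calculate_reward(current_health, prev_health,
                     current_ammo, prev_ammo,
                     current_kills, prev_kills):
    """Delta-based reward shaping. The weights are hypothetical placeholders."""
    reward = 0.0
    reward += (current_kills - prev_kills) * 10.0   # strong reward per kill
    reward += (current_health - prev_health) * 0.1  # penalize damage taken
    reward += (current_ammo - prev_ammo) * 0.05     # mild penalty for wasted ammo
    reward -= 0.01                                  # small living cost to encourage progress
    return reward
```

Delta-based shaping of this kind only works if the counters are read reliably every step, which is another reason the shared-memory path matters more than the screen-capture fallback.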
And the smoke test (scripts/test_env.py) whose output appears below:

import sys
from os.path import dirname, abspath

# Make the project root importable when running from scripts/.
sys.path.insert(0, dirname(dirname(abspath(__file__))))

from env.doom_env import DoomEnv

env = DoomEnv()
obs, _ = env.reset()
print(obs.shape)
obs, reward, terminated, truncated, info = env.step(0)
print("step working")
env.close()
Required Data/Setup
- DOOM Retro binary: built from the main repo (requires SDL2, CMake).
- IWAD file: e.g., freedoom1.wad or doom.wad.
- Python environment: virtualenv with gymnasium, stable-baselines3, numpy, mss, pynput, torch, pillow.
- Shared memory: DOOM Retro must be running with RL support (writes to /dev/shm/doomretro_rl).
- No external data files are needed for basic testing, but training uses trajectories/clips.
To run DOOM Retro:

    ./build/doomretro -iwad /path/to/doom.wad
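Before launching training, it can save a confusing debugging session to verify the shared-memory segment actually exists; a tiny check using the path from the setup notes above:

```python
import os

def shm_available(path="/dev/shm/doomretro_rl"):
    """True if DOOM Retro's RL shared-memory segment is present on disk."""
    return os.path.exists(path)

if shm_available():
    print("shared memory found: observations will come from DOOM Retro")
else:
    print("shared memory not found: environment will fall back to screen capture")
```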
Obtained Output
Ran python scripts/test_env.py without DOOM Retro running (fallback to monitor capture):

    Connecting to existing DOOM window...
    DOOM Retro window not found!
    WARNING: DOOM window not found. Is DOOM Retro running?
    DOOM Retro window not found!
    [frame_cache] DOOM window not found — capturing full monitor.
    (84, 84, 12)
    step working
obs.shape: (84, 84, 12) – the observation carries 12 channels, most likely four stacked RGB frames (4 × 3 channels) rather than "RGB + extras". Note that this does not match the declared observation_space shape of (84, 84, 3); strict environment checkers (e.g. Stable Baselines3's check_env) will flag the mismatch. "step working" indicates the step method executed without error.

Expected Output
With DOOM Retro running:
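If the 12 channels do come from frame stacking (an assumption; ObservationBuilder is not shown), the mechanism is simply concatenation along the channel axis:

```python
import numpy as np

def stack_frames(frames):
    """Concatenate RGB frames along the channel axis:
    four (84, 84, 3) frames -> one (84, 84, 12) observation."""
    return np.concatenate(frames, axis=-1)

frames = [np.zeros((84, 84, 3), dtype=np.uint8) for _ in range(4)]
obs = stack_frames(frames)
print(obs.shape)  # (84, 84, 12)
```

If that is the case, the declared observation_space should be widened to shape (84, 84, 12) so the space and the actual observations agree.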
- obs.shape: (84, 84, 3) or similar (processed game frame).
- Reward calculation based on health/ammo/kills changes.
- Shared memory used instead of monitor capture.
- No warnings about the DOOM window not being found.
- Full episodes run with proper terminated/truncated flags.

The environment should integrate seamlessly with RL libraries like Stable Baselines3 for training agents that play DOOM.
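The shape mismatch noted above is exactly the kind of thing that blocks that integration. A dependency-light version of the containment test a Box space performs (hypothetical helper, numpy only, mirroring what Gymnasium/SB3 check):

```python
import numpy as np

def obs_matches_space(obs, shape=(84, 84, 3), low=0, high=255, dtype=np.uint8):
    """Rough equivalent of the containment test a Box space performs."""
    return (
        obs.shape == shape
        and obs.dtype == dtype
        and np.all(obs >= low)
        and np.all(obs <= high)
    )

# A 12-channel observation fails against the declared (84, 84, 3) space:
print(obs_matches_space(np.zeros((84, 84, 12), dtype=np.uint8)))  # False
print(obs_matches_space(np.zeros((84, 84, 3), dtype=np.uint8)))   # True
```

Running `stable_baselines3.common.env_checker.check_env(env)` once the shapes agree is a quick way to confirm the environment satisfies the full Gymnasium contract before starting a long training run.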