Pokemon Red reinforcement learning tooling built around a RAM-verified Gymnasium environment, configurable progress rewards, PPO training, and a real-time web dashboard.
PokemonRL is an educational and research-oriented framework for training agents to play Pokemon Red with reinforcement learning.
It is meant for:
Main ideas:
The latest public runs are here:
If you are new to RL, this project is a good place to start. You can tweak `config/reward_presets.json` and watch how the agent's behavior changes.

This section aims for minimal guesswork.
This repository does not include:

- a Pokemon Red ROM (`pokemon_red.gb`)
- an emulator save state (`.state`)

You must supply both for local training and playback.
Recommended local layout:
```
PokemonRL/
  roms/
    pokemon_red.gb
  saves/
    after_starter.state
```
Use Python 3.10 or newer.
```
python -m venv venv
venv\Scripts\activate
pip install --upgrade pip
pip install -e .
```
Set your ROM and save state paths:
```
set POKEMONRL_ROM_PATH=roms\pokemon_red.gb
set POKEMONRL_STATE_PATH=saves\after_starter.state
```
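Inside Python, those two environment variables can be resolved up front so a missing path fails fast rather than mid-run. A minimal sketch (the `resolve_paths` helper is hypothetical; only the variable names come from this README):

```python
import os
from pathlib import Path


def resolve_paths() -> tuple[Path, Path]:
    """Read ROM and save-state locations from the environment variables above."""
    rom = os.environ.get("POKEMONRL_ROM_PATH")
    state = os.environ.get("POKEMONRL_STATE_PATH")
    if not rom or not state:
        raise RuntimeError(
            "Set POKEMONRL_ROM_PATH and POKEMONRL_STATE_PATH before training."
        )
    return Path(rom), Path(state)
```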
```
python -m unittest discover -s tests -p "test_*.py" -v
```
```
python tools\verify_ram.py
```
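"RAM-verified" means progress is read straight from emulator memory instead of inferred from pixels. As a sketch of the idea: community RAM maps place the badge bitfield for Pokemon Red at `0xD356` (treat the address as an assumption and verify it with `tools/verify_ram.py`), with Brock's Boulder Badge in bit 0:

```python
BADGE_ADDR = 0xD356  # badge bitfield per community Pokemon Red RAM maps (verify!)
BOULDER_BADGE_BIT = 0  # bit 0 = Boulder Badge (Brock)


def has_boulder_badge(ram: bytes) -> bool:
    """Check the Boulder Badge flag directly from a full RAM snapshot."""
    return bool((ram[BADGE_ADDR] >> BOULDER_BADGE_BIT) & 1)
```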
```
python tools\train_ppo_multimodal.py --reward-profile config\brock_badge1_profile.json --timesteps 1000000 --num-envs 4
```
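A reward profile is plain JSON, so new objectives can be built programmatically. The structure below is hypothetical (the authoritative schema is whatever `config/brock_badge1_profile.json` actually contains); it only illustrates the weighted-terms idea:

```python
import json

# Hypothetical profile: positive weights for progress events,
# a small per-step penalty to discourage idling.
profile = {
    "name": "brock_badge1",
    "rewards": {"new_map_visited": 0.5, "boulder_badge": 10.0},
    "penalties": {"step": -0.001},
}
profile_json = json.dumps(profile, indent=2)
```

Write `profile_json` to a file under `config/` and pass it via `--reward-profile`.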
```
python tools\map_server.py
```
Then open:
- `PokemonRL/env/`: emulator environment, RAM readers, and Gym wrappers
- `PokemonRL/rewards/`: logic for computing rewards
- `tools/`: scripts for training, watching, and the web dashboard
- `config/`: JSON profiles for reward tuning and event flags
- `docs/`: detailed guides on reward design and verification

For experienced users and researchers:
| Feature | Description |
|---|---|
| Multimodal observations | Combines screen pixels with structured RAM data. |
| Custom reward profiles | Create objectives via JSON without touching code. |
| Path overrides | Control where models, logs, and runs are saved with env vars. |
Contributions are welcome. If you fix a bug, add documentation, or suggest a new reward term, open a PR or issue.
See CONTRIBUTING.md.
This project is licensed under the MIT License.