# RL MP4 Project

This folder contains a small unified project that integrates three RL algorithms (Policy Gradient, DQN, Actor-Critic) behind a single Trainer. The Trainer can train, evaluate, save logs (rewards and losses), and compare the algorithms against each other.

Structure
- `models.py`: shared neural nets (PolicyNet, QNet, CriticNet)
- `replay_buffer.py`: replay buffer used by DQN
- `agents/`: agent implementations (`pg_agent.py`, `dqn_agent.py`, `ac_agent.py`)
- `trainer.py`: high-level Trainer to train/evaluate/compare
- `run.py`: simple CLI (train/eval/compare)
- `utils.py`: plotting helpers
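
To give a feel for what a replay buffer like the one in `replay_buffer.py` does, here is a minimal sketch. It is not the project's actual class: the name `ReplayBuffer`, the method names, and the transition layout are assumptions chosen for illustration.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# One transition: (state, action, reward, next_state, done).
# This layout is an assumption, not necessarily what replay_buffer.py stores.
Transition = Tuple[list, int, float, list, bool]


class ReplayBuffer:
    """Fixed-capacity FIFO buffer of transitions (illustrative sketch)."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done) -> None:
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self) -> int:
        return len(self.buffer)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=3)
    for i in range(5):
        buf.push([i], 0, 1.0, [i + 1], False)
    print(len(buf))  # only the 3 most recent transitions are kept
```

Sampling uniformly from past transitions is what lets DQN train on decorrelated mini-batches instead of consecutive steps.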

Quick start
1. Make sure your Python version is 3.10 or newer, then install the requirements:

   pip install -r requirements.txt

2. Add the project to `PYTHONPATH` so that imports resolve correctly:

   export PYTHONPATH=<absolute path to deep_rl>:${PYTHONPATH}

3. Train one algorithm (example):

   python run.py --cmd train --algo pg --episodes 10 --lr 1e-2

4. Compare all three algorithms:

   python run.py --cmd compare

5. See advanced usage (e.g. specifying hyperparameters through the CLI, changing the output directory):

   python run.py --help

   Alternatively, run the bash scripts in `scripts` directly.

Notes
- Saved outputs (plots and saved model files) are written to `./output` by default.
- When comparing algorithms, passing each algorithm's hyperparameters through the CLI is inconvenient. You can instead edit the default hyperparameter dict in `run.py` directly :)
- Feel free to modify the code if you want more infrastructure for your experiments (e.g. multiple runs with different seeds).
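
If you do add multi-seed runs, one common way to summarize them is the per-episode mean and standard deviation across seeds. The sketch below assumes you already have one reward list per seed. The helper `aggregate_runs` is hypothetical and not part of this project, and the numbers are made up for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def aggregate_runs(runs: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return the per-episode mean and population std across equal-length runs."""
    if not runs:
        raise ValueError("need at least one run")
    n_episodes = len(runs[0])
    if any(len(r) != n_episodes for r in runs):
        raise ValueError("all runs must have the same number of episodes")
    # zip(*runs) groups the rewards of episode i from every run together.
    per_episode = list(zip(*runs))
    means = [mean(ep) for ep in per_episode]
    stds = [pstdev(ep) for ep in per_episode]
    return means, stds


if __name__ == "__main__":
    # Made-up reward curves keyed by seed, purely for illustration.
    rewards_by_seed = {
        0: [10.0, 20.0, 30.0],
        1: [12.0, 18.0, 33.0],
        2: [14.0, 22.0, 27.0],
    }
    means, stds = aggregate_runs(list(rewards_by_seed.values()))
    print(means)
    print(stds)
```

Plotting the mean curve with a shaded band of one standard deviation usually gives a fairer comparison between algorithms than a single run.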

References
- Gymnasium API (maintained by the Farama Foundation, successor to OpenAI Gym): https://gymnasium.farama.org/api/env/