## Model-Based vs Model-Free Reinforcement Learning

### Model-Free RL

#### Characteristics
- Directly maps states to actions or values
- No explicit model of environment dynamics
- Learns through trial and error (see the Q-learning sketch after the examples list)

#### Examples
- Q-Learning, DQN
- Policy Gradient methods
- Actor-Critic methods
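
To make the model-free recipe concrete, below is a minimal sketch of tabular Q-learning (the simplest of the examples above) on a hypothetical 1-D corridor task. The environment, reward scheme, and hyperparameters are invented for illustration; the point is the update line, which bootstraps from sampled transitions without any model of the dynamics.

```python
import numpy as np

n_states, n_actions = 5, 2            # toy corridor; actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # value table mapping (state, action) -> value
alpha, gamma, eps = 0.1, 0.99, 0.3    # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

def step(s, a):
    """Hypothetical dynamics: reward 1 only when reaching the rightmost state."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward, s_next == n_states - 1

for episode in range(500):
    s = 0
    for t in range(100):               # cap episode length
        # epsilon-greedy action selection: explore with probability eps
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s_next, r, done = step(s, a)
        # Q-learning update: uses only the sampled transition, no dynamics model
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if done:
            break

print(Q)   # learned state-action values
```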

### Model-Based RL

#### Characteristics
- Learns a model of environment dynamics
- Uses the learned model for planning or policy learning (see the sketch after the examples list)
- Can train the policy "in imagination", i.e. entirely inside the learned model

#### Examples
- World Models
- Dreamer
- MuZero
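
By contrast, a model-based agent spends its experience on fitting a model and then plans inside it. The sketch below reuses the same hypothetical corridor: it fits a tabular transition and reward model from collected transitions, then runs value iteration purely inside that learned model. Value iteration here is just a stand-in for the planning step; it is not the specific algorithm used by World Models, Dreamer, or MuZero.

```python
import numpy as np

n_states, n_actions, gamma = 5, 2, 0.99

def step(s, a):
    """Same hypothetical corridor: move left/right, reward 1 at the right end."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s_next, (1.0 if s_next == n_states - 1 else 0.0)

# 1) Learn the model: tabulate transition counts and rewards from experience.
counts = np.zeros((n_states, n_actions, n_states))
rewards = np.zeros((n_states, n_actions))
for s in range(n_states):
    for a in range(n_actions):
        s_next, r = step(s, a)        # deterministic toy dynamics: one sample per pair
        counts[s, a, s_next] += 1
        rewards[s, a] = r
P = counts / counts.sum(axis=2, keepdims=True)   # estimated P(s' | s, a)

# 2) Plan inside the learned model: value iteration, no further environment steps.
V = np.zeros(n_states)
for _ in range(100):
    V = (rewards + gamma * (P @ V)).max(axis=1)
policy = (rewards + gamma * (P @ V)).argmax(axis=1)
print(policy)   # greedy actions derived entirely from the learned model
```

Everything after the model is fitted happens "in imagination": the planner only queries the estimated `P` and `rewards`, never `step()`.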

### World Models as Model-Based RL
World Models represents a sophisticated model-based approach that:
- Learns a compressed representation (VAE)
- Models dynamics in latent space (MDN-RNN)
- Learns a compact policy (Controller) on top of the latent code and the model's hidden state, which can be trained entirely inside the learned model (see the structural sketch below)
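
To make the three components concrete, here is a structural sketch (PyTorch) of how a single observation flows through them at inference time. The V / M / C split follows the list above, but every layer size, the latent and hidden dimensions, the mixture count, the 3-dimensional action, and the dummy 64x64 observation are illustrative assumptions, not details taken from this document.

```python
import torch
import torch.nn as nn

Z_DIM, H_DIM, A_DIM, N_MIX = 32, 256, 3, 5   # illustrative sizes, not from the text above

class Encoder(nn.Module):
    """V: VAE encoder half, compressing an observation into a latent code z."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.mu, self.logvar = nn.LazyLinear(Z_DIM), nn.LazyLinear(Z_DIM)

    def forward(self, obs):
        h = self.conv(obs)
        mu, logvar = self.mu(h), self.logvar(h)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterized sample

class MDNRNN(nn.Module):
    """M: recurrent model that outputs a mixture density over the next latent z."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTMCell(Z_DIM + A_DIM, H_DIM)
        self.mdn = nn.Linear(H_DIM, N_MIX * (2 * Z_DIM + 1))   # means, log-stds, mixture logits

    def forward(self, z, a, state):
        h, c = self.rnn(torch.cat([z, a], dim=-1), state)
        return self.mdn(h), (h, c)

class Controller(nn.Module):
    """C: small policy acting on the latent code and the RNN hidden state."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(Z_DIM + H_DIM, A_DIM)

    def forward(self, z, h):
        return torch.tanh(self.fc(torch.cat([z, h], dim=-1)))

V, M, C = Encoder(), MDNRNN(), Controller()
obs = torch.zeros(1, 3, 64, 64)                          # dummy observation
state = (torch.zeros(1, H_DIM), torch.zeros(1, H_DIM))   # LSTM hidden and cell state
z = V(obs)                       # compress the observation
action = C(z, state[0])          # act from the compressed state plus memory
_, state = M(z, action, state)   # advance the learned dynamics (an "imagination" step)
print(action)
```

In the original paper's setup, V and M are trained first on recorded rollouts, and only the small Controller is optimized afterwards (with an evolution strategy), which is what keeps training inside the learned model cheap.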