Introduction to World Models

What are World Models?

World Models are a class of AI systems that learn an internal representation of their environment. Instead of directly mapping observations to actions, these models first build a compressed, predictive model of the world, then use this model to make decisions.

The concept was popularized by the 2018 paper "World Models" by David Ha and Jürgen Schmidhuber, though the underlying ideas trace back to earlier work in cognitive science and model-based reinforcement learning.

Why World Models Matter

Model-free reinforcement learning agents learn purely through direct interaction with their environment, which can be:

Sample inefficient: Often requiring millions of interactions
Computationally expensive: Every interaction requires stepping the real environment or a full simulator
Risky: Trial and error in the real world can be dangerous

World Models address these issues by:

Learning a compressed representation of observations
Building a predictive model of environment dynamics
Training in imagination: Using the learned model to simulate experiences
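
The idea of training in imagination can be sketched on a toy problem. The example below is an illustrative assumption rather than the method from the paper: `real_step` is a hypothetical 1-D environment, the "world model" is a two-parameter linear fit, and all function names are invented for this sketch. It shows a handful of real transitions being used to fit a dynamics model, which can then generate experience without touching the environment again.

```python
import random

def real_step(state, action):
    """Stand-in for the real environment (hypothetical 1-D dynamics)."""
    return 0.9 * state + 0.5 * action

def collect_transitions(n, rng):
    """Gather n real (state, action, next_state) transitions."""
    data = []
    for _ in range(n):
        s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((s, a, real_step(s, a)))
    return data

def fit_linear_model(data):
    """Least-squares fit of next_state ~ w_s * state + w_a * action."""
    ss = sum(s * s for s, a, n in data)
    sa = sum(s * a for s, a, n in data)
    aa = sum(a * a for s, a, n in data)
    sn = sum(s * n for s, a, n in data)
    an = sum(a * n for s, a, n in data)
    det = ss * aa - sa * sa  # determinant of the 2x2 normal equations
    return (sn * aa - an * sa) / det, (an * ss - sn * sa) / det

def imagine(model, state, actions):
    """Roll the learned model forward without calling real_step."""
    w_s, w_a = model
    trajectory = [state]
    for a in actions:
        state = w_s * state + w_a * a
        trajectory.append(state)
    return trajectory

rng = random.Random(0)
model = fit_linear_model(collect_transitions(20, rng))
dream = imagine(model, 1.0, [-1.0, -0.5, 0.0])
```

Here 20 real transitions are enough because the toy dynamics are linear and noise-free; a real World Model replaces the linear fit with learned neural networks.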

The V-M-C Architecture

The World Models framework consists of three key components:

Vision Model (V)

Uses a Variational Autoencoder (VAE)
Compresses high-dimensional observations into a compact latent vector z
Captures essential visual features while discarding noise
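
As a rough illustration of how a VAE produces the latent vector z, the sketch below assumes the encoder outputs a mean and a log-variance per latent dimension, which is the standard VAE formulation. The encoder network itself is omitted, and the names `sample_latent` and `kl_to_standard_normal` are hypothetical.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, I), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu, log_var = [0.5, -1.0], [-2.0, -2.0]  # made-up encoder outputs
z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

The KL term is what pushes the latent space toward a compact, well-behaved distribution; the reconstruction loss, not shown here, is what keeps z informative about the observation.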

Memory Model (M)

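Implemented as an MDN-RNN: a recurrent neural network (RNN) whose output layer is a mixture density network (MDN)
Predicts a probability distribution over the next latent vector z, given the current z, the action taken, and the hidden state h
Maintains temporal context through the hidden state h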
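
To make the mixture density output concrete, the sketch below samples one latent dimension from a mixture of Gaussians. It assumes the recurrent network has already produced the mixture logits, means, and standard deviations; the RNN itself is omitted, and `sample_next_latent` is a hypothetical name.

```python
import math
import random

def softmax(logits):
    """Convert raw logits into mixture weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_latent(logits, means, stds, rng):
    """Pick a mixture component, then sample from its Gaussian."""
    weights = softmax(logits)
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])

rng = random.Random(0)
# Made-up MDN outputs for one latent dimension with three components.
next_z = sample_next_latent([0.2, 1.5, -0.3], [-1.0, 0.0, 2.0], [0.1, 0.2, 0.5], rng)
```

Predicting a mixture rather than a single value lets the model represent several distinct plausible futures instead of averaging them together.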

Controller (C)

A simple linear model
Maps the combined representation (z, h) to actions
Trained using an evolution strategy (CMA-ES, the Covariance Matrix Adaptation Evolution Strategy)
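
The controller is small enough to sketch in a few lines. The example below computes actions as W [z, h] + b and tunes the parameters with a deliberately simple elitist Gaussian search. This search is a stand-in for CMA-ES, not CMA-ES itself, and the fitness function, target action, and all names are assumptions made for illustration.

```python
import random

def controller(params, z, h, n_actions):
    """Linear controller: rows of W are stored first, then the biases."""
    x = z + h  # concatenate latent vector and hidden state
    n_in = len(x)
    actions = []
    for i in range(n_actions):
        row = params[i * n_in:(i + 1) * n_in]
        bias = params[n_actions * n_in + i]
        actions.append(sum(w * v for w, v in zip(row, x)) + bias)
    return actions

def simple_es(fitness, n_params, rng, generations=200, population=16, sigma=0.1):
    """Elitist Gaussian search: keep the best parameters, perturb, repeat."""
    best = [0.0] * n_params
    best_score = fitness(best)
    for _ in range(generations):
        for _ in range(population):
            candidate = [p + sigma * rng.gauss(0.0, 1.0) for p in best]
            score = fitness(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score

# Toy objective: make the controller output a target action for one input.
z, h, target = [1.0, -0.5], [0.25], 0.3

def fitness(params):
    action = controller(params, z, h, 1)[0]
    return -(action - target) ** 2

rng = random.Random(0)
best_params, best_score = simple_es(fitness, n_params=4, rng=rng)
```

Because the controller has so few parameters, gradient-free search of this kind stays practical; in a real setup the fitness would be the cumulative reward of a rollout rather than a one-step error.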

Key Insights

The key insight is that once a World Model is trained, the agent can:

Dream: Generate imaginary experiences
Plan: Evaluate potential actions without real-world consequences
Transfer: Apply learned knowledge to new situations
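
Planning without real-world consequences can be sketched as random-shooting search inside the model. In the example below, `model_step` is a hypothetical stand-in for a trained world model with a made-up reward, and random shooting is one simple planning method chosen for illustration, not the procedure used in the original paper.

```python
import random

def model_step(state, action):
    """Stand-in for a learned model: returns predicted next state and reward."""
    next_state = 0.9 * state + 0.5 * action
    reward = -next_state ** 2  # made-up reward: stay close to zero
    return next_state, reward

def imagined_return(state, actions):
    """Total predicted reward for an action sequence, using only the model."""
    total = 0.0
    for a in actions:
        state, reward = model_step(state, a)
        total += reward
    return total

def plan(state, rng, horizon=5, candidates=200):
    """Sample random action sequences and keep the best one found."""
    best_actions, best_return = None, float("-inf")
    for _ in range(candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        ret = imagined_return(state, actions)
        if ret > best_return:
            best_actions, best_return = actions, ret
    return best_actions[0], best_return

rng = random.Random(0)
first_action, best_return = plan(1.0, rng)
do_nothing_return = imagined_return(1.0, [0.0] * 5)
```

Only the first action of the best sequence is returned; an agent would typically execute it, observe the result, and plan again from the new state.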

Applications

World Models and related model-based methods have been applied to:

Game playing: The CarRacing and VizDoom environments used in the original paper
Robotics: Simulated manipulation tasks
Planning: Long-horizon decision making
Generative modeling: Creating realistic environment simulations