Deep Reinforcement Learning

Reinforcement learning (RL) deals with sequential decision-making. An agent interacts with its environment, which is modeled as a Markov Decision Process (MDP). In response to its behavior, the agent receives reward signals, and it attempts to maximize the cumulative reward through trial and error. Originally, RL algorithms relied on tabular representations, which quickly became a bottleneck for more complex problems. Replacing these tables with artificial neural networks as function approximators makes it possible to solve problems with very large numbers of states and possible actions. This combination is called deep reinforcement learning (DRL).
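
To make the tabular starting point concrete, here is a minimal sketch of tabular Q-learning in Python. Everything in it is an illustrative assumption rather than part of any particular library: the five-state chain environment, the step and train functions, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5      # states 0..4; reaching state 4 ends the episode with reward 1
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical chain environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # The "table": one action-value estimate per (state, action) pair.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal estimates at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: no future value beyond a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action for every non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The table q holds one entry per state-action pair, which is why this approach stops scaling once the number of states or actions grows large. A DRL method would replace the table with a neural network that maps a state to its action values.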

DRL already has many applications, both in virtual worlds and in the real world. Typical application areas are video games and robotics. Here is a short list of inspiring applications:

Hide and Seek: Emergent Tool Use from Multi-Agent Interaction
Obstacle Tower: A Generalization Challenge in Vision, Control and Planning
On the Verge of Solving Rocket League using Deep Reinforcement Learning and Sim-to-sim Transfer
DotA 2: OpenAI Five
AlphaStar: Mastering the real-time strategy game StarCraft II
Emergence of Locomotion Behaviors in Rich Environments
Learning Dexterity
Autonomous navigation of stratospheric balloons using reinforcement learning
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Recommended literature

Reinforcement Learning: An Introduction
Grokking Deep Reinforcement Learning
Deep Reinforcement Learning Hands-On
Foundations of Deep Reinforcement Learning
AI for Games, Third Edition