Timeline

Playing Atari with Deep Reinforcement Learning

Deterministic Policy Gradient Algorithms

Trust Region Policy Optimization

Human-level control through deep reinforcement learning

Continuous control with deep reinforcement learning

Deep Reinforcement Learning with Double Q-learning

Prioritized Experience Replay

Mastering the game of Go with deep neural networks and tree search

Asynchronous Methods for Deep Reinforcement Learning

Hindsight Experience Replay

Proximal Policy Optimization Algorithms

Mastering the game of Go without human knowledge

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

Superhuman AI for multiplayer poker

Grandmaster level in StarCraft II using multi-agent reinforcement learning

Dota 2 with Large Scale Deep Reinforcement Learning

How RL Evolved