Playing Atari with Deep Reinforcement Learning
Deterministic Policy Gradient Algorithms
Trust Region Policy Optimization
Human-level control through deep reinforcement learning
Continuous control with deep reinforcement learning
Deep Reinforcement Learning with Double Q-learning
Prioritized Experience Replay
Mastering the game of Go with deep neural networks and tree search
Asynchronous Methods for Deep Reinforcement Learning
Hindsight Experience Replay
Proximal Policy Optimization Algorithms
Mastering the game of Go without human knowledge
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
Superhuman AI for multiplayer poker
Grandmaster level in StarCraft II using multi-agent reinforcement learning
Dota 2 with Large Scale Deep Reinforcement Learning