Lecture 14 | Deep Reinforcement Learning - Stanford University School of Engineering - Deep Learning Open Course - Cupoy
In Lecture 14 we move from supervised learning to reinforcement learning (RL), in which an agent must learn to interact with an environment in order to maximize its cumulative reward. We formalize reinforcement learning using the language of Markov Decision Processes (MDPs), policies, value functions, and Q-value functions. We discuss several algorithms for reinforcement learning, including Q-Learning, policy gradients, and Actor-Critic. We show how deep reinforcement learning has been used to play Atari games and how AlphaGo used it to achieve superhuman performance at Go.
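As a rough illustration of the Q-Learning idea named above, here is a minimal tabular sketch. It is not taken from the lecture: the five-state corridor environment, the function names (step, q_learning, greedy_policy), and the hyperparameter values are all assumptions chosen to keep the example small. The lecture itself covers the deep variant, where a neural network replaces the table.

```python
import random

# Hypothetical toy MDP (an assumption, not from the lecture): a corridor of
# 5 states. The agent starts in state 0; entering state 4 yields reward 1
# and ends the episode. Every other transition yields reward 0.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the two estimates
            # tie); otherwise take the action with the larger estimate.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bellman target: r + gamma * max_a' Q(s', a'), with no
            # bootstrap term when the episode has ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling greedy_policy(q_learning()) returns the learned action for each non-terminal state. In this toy setting the learned Q-values decay by roughly a factor of gamma per step of distance from the goal, which is the discounting behavior the MDP formalism describes.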
Keywords: Reinforcement learning, RL, Markov decision process, MDP, Q-Learning, policy gradients, REINFORCE, actor-critic, Atari games, AlphaGo
Slides: http://cs231n.stanford.edu/slides/201...