EC 700 A3, Spring 2021: Introduction to Reinforcement Learning
Course Description: Reinforcement learning is a subfield of artificial intelligence that deals with learning from repeated interactions with an environment. Reinforcement learning is the basis for state-of-the-art algorithms for playing strategy games such as Chess, Go, Backgammon, and StarCraft, as well as for a number of problems throughout robotics, operations research, and other fields of engineering. In this course, we will study the fundamental principles of reinforcement learning. Our focus will be the main algorithms in the field: we’ll try to develop a mathematical understanding of why they work. While a moderate amount of coding will be required, the most important prerequisite is mathematical maturity and a facility with constructing proofs.
Instructor: Alex Olshevsky
Time: MW 2:30-4:15
Tentative Plan:
- Markov Decision Process and Dynamic Programming, Value Iteration and Policy Iteration.
- Monte Carlo and Temporal Difference Methods. SARSA, Dyna, and Q-learning. Eligibility traces.
- Sample Complexity Issues.
- Using Deep Neural Networks for Approximation. Deep Q-learning. Approximation theorems, the benefits of depth, PAC-Bayes bounds for neural networks.
- Policy Gradient, Actor-Critic methods.
- Imitation learning, Inverse Reinforcement Learning, Distributed Reinforcement Learning (time permitting).
- Frontiers of RL research: using LSTM, Attention models, Transformers with reinforcement learning (time permitting).
Prerequisites: Some basic proficiency with Python. Strong familiarity with Markov chains. The most important prerequisite is mathematical maturity and a facility with constructing proofs.
Textbook: There will not be a single textbook for the course; rather, we will rely on lecture notes that mix material from different sources, most of which will come from research papers. In terms of textbooks, we’ll use bits and pieces from:
- Reinforcement Learning by Sutton and Barto. Available for free online.
- Foundations of Deep Reinforcement Learning by Graesser and Keng.
- Neuro-Dynamic Programming by Bertsekas and Tsitsiklis.
- Markov Decision Processes by Puterman.
- Dynamic Programming, Vol. II by Bertsekas.
Resources:
- A great set of videos for CS 598 taught at UIUC.
- Lecture notes from IEOR 8100 taught at Columbia.
- A set of lecture videos from CS 234 at Stanford.
- Lecture notes from MS&E 338 at Stanford.
- ECE 586 at UIUC, including a set of lecture notes.
- Working draft of a textbook by Agarwal, Jiang, Kakade, Sun.
- Reinforcement learning subreddit.
- Ten lecture videos for a class on RL by David Silver.
- Videos of CS 285 at Berkeley.
- Lecture videos for Underactuated Robotics at MIT. The second half of the course focuses on RL methods.
- Lecture videos for Advanced Robotics at Berkeley. A substantial part of the class is about RL.
- Deep RL bootcamp.
- Introduction to Reinforcement Learning from IIT.
Classic Papers:
- A Markovian Decision Process by Bellman (1957).
- Dynamic Programming by Bellman (1966).
- Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems by Barto, Sutton, and Anderson (1983).
- Learning to Predict by the Method of Temporal Differences by Sutton (1988).
- Q-Learning by Watkins and Dayan (1992).
- Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning by Williams (1992).
- Asynchronous Stochastic Approximation and Q-Learning by Tsitsiklis (1994).
- Temporal Difference Learning and TD-Gammon by Tesauro (1995).
- Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding by Sutton (1996).
- An Analysis of Temporal-Difference Learning with Function Approximation by Tsitsiklis and Van Roy (1997).
- Policy Gradient Methods for Reinforcement Learning with Function Approximation by Sutton, McAllester, Singh, Mansour (2000).
- On Actor-Critic Algorithms by Konda and Tsitsiklis (2003).
- Playing Atari with Deep Reinforcement Learning by Mnih et al. (2013).
- Mastering the Game of Go Without Human Knowledge by Silver et al. (2017).
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm by Silver et al. (2017).
Fall 2021: Flyer for EC 400, an undergraduate course in RL.