Syllabus: Reinforcement Learning and Learning Control

Note: This syllabus will be modified continuously to accommodate the progress and interests of the course participants!

Date | Topic | Readings
Sept. 3 | Introduction to Reinforcement Learning | Slides, Sutton Book Chapters 1-5
Sept. 10 | Function Approximation in Reinforcement Learning; Optimal control along trajectories: LQR, LQG and DDP | Sutton Book Chapter 8, Todorov2005
Sept. 17 | Research on DDP and Function Approximation for RL | Tassa2007, Slides
Sept. 24 | Research on DDP and Function Approximation in RL | Doya2000, Morimoto2003
Oct. 1 | Gaussian Processes for Reinforcement Learning; Value function learning along trajectories (fitted Q iteration); Least Squares Temporal Difference Methods | Deisenroth2009, Lagoudakis2002, Ernst2005
Oct. 8 | Policy Gradient Methods: REINFORCE, GPOMDP, Natural Gradients | Williams1992, Sutton2000, Peters2008, Slides
Oct. 15 | Research on Policy Gradient Methods; Introduction to Path Integral Methods | Tedrake2005, Bagnell2003
Oct. 22 | Path Integral Methods for Reinforcement Learning | Theodorou2010, Todorov2009, Kober2009
Oct. 29 | Path Integral Methods for Reinforcement Learning (continued) | Slides
Nov. 5 | Sketch of Planned Projects; Modular Learning Control | Tedrake2009, Todorov2009
Nov. 12 | Inverse reinforcement learning | Dvijotham2009, Abbeel2009, Ratliff2009
Nov. 19 | Dynamic Bayesian networks for reinforcement learning | Toussaint2006, Vlassis2009
Dec. 3 | Project presentations |

Tentative Syllabus:

  • Introduction to reinforcement learning [1]
  • Dynamic programming methods [1, 2]
  • Optimal control methods [2, 3]
  • Temporal difference methods [1]
  • Q-Learning [1]
  • Problems of value-function-based RL methods
  • Function Approximation for RL [1]
  • Incremental Function Approximation Methods for RL [4, 5]
  • Least Squares Methods [6]
  • Direct Policy Learning: REINFORCE [7]
  • Modern policy gradient methods: GPOMDP and the Policy Gradient Theorem [8, 9]
  • Natural Policy Gradient Methods [9]
  • Probabilistic Reinforcement Learning with Reward Weighted Averaging [10, 11]
  • Q-Learning on Trajectories [12]
  • Path Integral Approaches to Reinforcement Learning I [13]
  • Path Integral Approaches to Reinforcement Learning II
  • Dynamic Bayesian Networks for RL [14]
  • Gaussian Processes in Reinforcement Learning [5]


  1. Reinforcement learning: An introduction
  2. The computation and theory of optimal control
  3. Differential dynamic programming
  4. Constructive incremental learning from only local information
  5. Gaussian processes for machine learning
  6. J. Boyan, "Least-squares temporal difference learning," in Proceedings of the Sixteenth International Conference on Machine Learning, Morgan Kaufmann, 1999, pp. 49-56.
  7. Simple statistical gradient-following algorithms for connectionist reinforcement learning
  8. Reinforcement learning of motor skills with policy gradients
  9. Natural actor critic
  10. Reinforcement learning by reward-weighted regression for operational space control
  11. Policy Search for Motor Primitives in Robotics
  12. Fitted Q-iteration by advantage weighted regression
  13. Path integral stochastic optimal control for rigid body dynamics
  14. Probabilistic inference for solving discrete and continuous state Markov Decision Processes
Designed by: Nerses Ohanyan & Jan Peters
Page last modified on January 12, 2012, at 05:38 PM