Winter semester 2010/2011, Friday, 10.15 s.t., room ND 6/99

The lecture provides an introduction to reinforcement learning (RL). It covers the basics of RL, including Markov decision processes, dynamic programming, and temporal-difference learning. Additionally, special topics such as policy gradient methods and the connection to evolutionary computation are presented.

Very basic knowledge of statistics and analysis is required. Basic knowledge of neural networks is helpful but not necessary.

The second written exam (**Nachklausur**) will take place in NA 2/99 on Mo 10.08.2011, from 9.00 to 11.00.

The results of the exam are finally available. I apologize for the long delay.

Correcting the exams will take a bit longer than usual because they will be sent to France for me to correct.

The slides of each lecture unit will be available here.

__Topics__

Unit 1: Introduction

Unit 2: The reinforcement learning problem

Unit 3: Dynamic programming

Unit 4: Monte Carlo methods

Unit 5: Temporal difference learning

Unit 6: Eligibility traces

Mini-Unit: Actor-critic methods

Unit 7: Function approximation

Unit 8: Planning and learning

Unit 9: Least-squares temporal difference learning (LSTD)

Unit 10: Policy gradient methods

Unit 11: Evolutionary reinforcement learning

Outlook: Other reinforcement learning areas

These slides and topics will be updated after each lecture.

Exercises due on 29.10.2010

Exercises due on 12.11.2010

Exercises due on 03.12.2010

Exercises due on 14.01.2011

Exercises due on 04.02.2011

The last exercises, due on 04.02.2011, will be available on 28.01.2011.

Additional files: LinearPolicy.h (Exercise 10)

SinglePole.h (Exercise 10)

cartPoleCMA-ES.cpp (Exercise 10)

RLTask.h (Exercises 5, 6, and 10)

Maze.h (Exercise 5)

onPolicyMonteCarlo.cpp (Exercise 5) arrived later but is available now. To compensate for the delay, the solutions for the programming exercises can be emailed until Friday evening (20.00).

Policy.h (Exercise 6)

dangerMaze.cpp (Exercise 6)

DangerousMaze.h (Exercise 6)

Makefile (for Linux, adapt the path to your local Shark installation!)

Makefile-Mac (for Mac, adapt the path to your local Shark installation and add the library path to the system variable DYLD_LIBRARY_PATH!)
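As a rough sketch of the Mac step, the library path could be added in the shell before running the compiled programs. The directory `$HOME/Shark/lib` is only a placeholder assumption; use wherever the Shark libraries actually live on your machine.

```sh
# Hypothetical location of the Shark libraries; replace with your own path.
SHARK_LIB_DIR="$HOME/Shark/lib"

# Prepend it to DYLD_LIBRARY_PATH, keeping any entries that are already set.
export DYLD_LIBRARY_PATH="$SHARK_LIB_DIR${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"

# Print the result to confirm the setting.
echo "$DYLD_LIBRARY_PATH"
```

The setting only lasts for the current shell session; adding the `export` line to your shell profile would make it permanent.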

Installation tips for Shark in Ubuntu:

Based on a fresh install of Ubuntu 10.10, the following packages are needed (install them via "System->Administration->Synaptic Package Manager"):
- binary install (.deb): **g++**

- source install: **cmake** and **g++**
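If you prefer the terminal to the graphical package manager, the same packages can probably be installed with `apt-get`. This is an alternative sketch, not part of the original instructions, and it assumes the standard Ubuntu package names `g++` and `cmake`.

```sh
# Refresh the package index first.
sudo apt-get update

# Binary install (.deb) of Shark: only the compiler is needed.
sudo apt-get install g++

# Source install of Shark: additionally needs cmake.
sudo apt-get install cmake
```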

V. Heidrich-Meisner, M. Lauer, C. Igel, and M. Riedmiller. Reinforcement Learning in a Nutshell. In M. Verleysen, editor, 15th European Symposium on Artificial Neural Networks (ESANN 2007), Belgium: d-side publications, pp. 277-288, 2007. A short overview of reinforcement learning.

J. Boyan. Technical Update: Least-Squares Temporal Difference Learning. Machine Learning 49(2–3), pp. 233–246, 2002

R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy Gradient Methods for Reinforcement Learning with Function Approximation,

Verena Heidrich-Meisner

room: NB 3/26 (from 01.11.2010)

phone: 0234-32 27974

email: verena.heidrich-meisner at rub.de