CMPUT 653 - Theoretical Foundations of Reinforcement Learning

Overview

The purpose of this course is to give students a solid understanding of the theoretical foundations of reinforcement learning (and a sense of what “doing theory” really means in the context of computer science). The topics range from building up foundations (Markov decision processes and their various special cases) to discussing solutions to the three core problem settings:

planning/simulation optimization
batch reinforcement learning, and
online reinforcement learning

In each of these settings, we cover the key algorithmic challenges and the core ideas for addressing them. Specific topics, ideas and algorithms covered include:

complexity of planning/simulation optimization; large-scale planning with function approximation;
sample complexity of batch learning with and without function approximation;
efficient online learning: the role (and limits) of optimism; scaling up with function approximation.

While we will explore connections to (some) deep RL methods, mainly seeking an answer to the question of when we can expect them to work well, the course will not focus on deep RL.

Objectives

After this course, you should be able to place new results in the context of the existing theoretical results. You should also be able to ask new questions and sometimes answer those using the tools and techniques you learned.

Course Work

There will be 4 assignments, each worth 10% of the mark.

There will be a take-home midterm, worth 20% of the mark.

There will be a project, worth the remaining 40% of the mark (100% - 60%).
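As a quick check of the arithmetic above, the marking scheme can be written out as weights that sum to 100%. This is only an illustrative sketch: the names `weights` and `final_mark` are hypothetical, and it assumes each component is scored on a 0-100 scale, with the assignment score being the average over the four assignments.

```python
# Weights taken from the marking scheme above (percent of the final mark).
# The dictionary name and keys are illustrative, not official course terms.
weights = {
    "assignments": 4 * 10,  # four assignments at 10% each
    "midterm": 20,          # take-home midterm
    "project": 40,          # the remaining share
}

# The three components should account for the whole mark.
assert sum(weights.values()) == 100


def final_mark(scores):
    """Combine per-component scores (each assumed to be in [0, 100])
    into a final percentage using the weights above."""
    return sum(weights[name] * scores[name] for name in weights) / 100
```

For instance, under these assumptions, `final_mark({"assignments": 80, "midterm": 50, "project": 90})` weighs the three scores by 40%, 20% and 40% respectively.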

Each assignment will have about 2-5 problems (though some might have more and some fewer). The problems might ask you to (1) design an algorithm, (2) prove something, or (3) implement and test an algorithm. Solutions must be submitted electronically. Reports must be typed up (a LaTeX template will be provided) and submitted through eClass.

In addition, there may be some (unmarked) reading assignments. Some of the problems might be related to the reading assignments, but the main purpose of the reading assignments is to get you up to speed on a topic before the class in which we need it.

Students can work on their assignments and projects alone or in pairs. However, you have to produce the final version of your solution on your own. If you worked with someone, or got help from someone, you have to note this on your solution, giving the name of the person(s). You are strongly advised to attempt the problems on your own: the only way to properly learn the material is by “fighting your way” through these problems.

The midterm has to be completed alone.

All assignments are due 2 weeks after their release dates (14*24 hours after the release date). Assignments will be released approximately every two weeks.
