Reinforcement Learning and Artificial Intelligence (RLAI) Winter 2018 University of Alberta

CMPUT 607: Applied Reinforcement Learning

Instructor: Patrick M. Pilarski (pilarski@ualberta.ca) (http://www.ualberta.ca/~pilarski)
Office: 5-020F Katz Group Centre. Office Hours: After class, by appointment.

Class Time: Thursday, 11:00am-2:00pm. Location TBA. First class is Thursday, January 11th.

Description: Reinforcement learning is a powerful machine learning approach that can be used to address challenges in medicine, science, engineering, and industry. This course will give students practical experience applying reinforcement learning techniques in a real-world setting. From the first week of class onwards, students will begin working with a simple robot that will form the basis for their course work and the platform for their final project. Over the term, students will gain a hands-on understanding of reinforcement learning methods for prediction and control, including general value functions, policy gradient methods, and actor-critic reinforcement learning. They will also develop a high-level appreciation for advanced topics such as predictive representations of state, off-policy learning with function approximation, and human-delivered reward. Course work will be highly interactive, and students will refine their empirical machine learning skills through the design, execution, analysis, and clear communication of reinforcement learning experiments. At the culmination of the course, students will each engage in a self-selected final project that showcases their comprehensive understanding of the topics covered in the class.

Prerequisites: Students should ideally have completed an initial course in reinforcement learning (e.g., CMPUT 366 or CMPUT 609) and have worked with and implemented the following reinforcement learning ideas:

Temporal-difference learning, TD(lambda)
Function approximation (tile coding)
Eligibility traces
SARSA(lambda) and Q(lambda)
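
As a rough gauge of the implementation experience listed above, the following is a minimal sketch of TD(lambda) with accumulating eligibility traces. It is not course material: the five-state random-walk task, the one-hot features (used here in place of tile coding), the function name, and all parameter values are assumptions chosen only to keep the example short.

```python
import random

def td_lambda_random_walk(num_states=5, episodes=200, alpha=0.1,
                          gamma=1.0, lam=0.8, seed=0):
    """Estimate state values on a random walk with linear TD(lambda).

    Hypothetical task: states 0..num_states-1 are non-terminal. Stepping
    left of state 0 ends the episode with reward 0; stepping right of the
    last state ends it with reward 1. Features are one-hot, so the weight
    vector doubles as the value table.
    """
    rng = random.Random(seed)
    w = [0.0] * num_states            # one weight per state
    for _ in range(episodes):
        z = [0.0] * num_states        # eligibility traces, reset each episode
        s = num_states // 2           # start in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                reward, v_next, done = 0.0, 0.0, True
            elif s_next >= num_states:
                reward, v_next, done = 1.0, 0.0, True
            else:
                reward, v_next, done = 0.0, w[s_next], False
            delta = reward + gamma * v_next - w[s]   # TD error
            # Decay every trace, then increment the trace of the current state.
            for i in range(num_states):
                z[i] *= gamma * lam
            z[s] += 1.0
            # Move each weight along its trace in proportion to the TD error.
            for i in range(num_states):
                w[i] += alpha * delta * z[i]
            if done:
                break
            s = s_next
    return w
```

Calling `td_lambda_random_walk()` returns one value estimate per state. On this task the true values increase from left to right, so the estimates should do so approximately as well.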

Students should also have skills in computer programming (e.g., Python, C, and/or C++) and basic probability theory, and be comfortable with statistical ideas such as probability distributions and expected values. Familiarity with linear algebra would be helpful but is not required. Because class size is limited, open registration is closed (the class roster is currently full), and all students should be using, or intend to use, reinforcement learning in their thesis research. Students with questions about the required background should contact the instructor prior to registration.

Course Materials: This course will focus on the implementation of reinforcement learning techniques in a very basic robotic environment. As such, a simple robot actuator and its interface hardware will be provided to each student to use for the duration of the course. Should students wish to keep their robot after the completion of the course, they may purchase their own robotic hardware. This will be discussed further in class, and students do not need to pre-buy materials. Basic plastic robot parts will be provided, and students will be given the opportunity to design and 3D-print additional parts for their robot and/or experimental setup during the course. Students are expected to bring their own laptop capable of running Python (ideally, a Linux system that supports the Robot Operating System). Main course materials include:

Robot actuator(s), interface hardware, and power supply (course provided)
Laboratory notebook (student provided)
Laptop computer (student provided)

Other Materials: Most study materials and peer-reviewed scholarly articles will be distributed via the course dropbox folder. The course will also reference other freely available resources including:

Reinforcement Learning: An Introduction, second edition in progress, by Richard S. Sutton and Andrew G. Barto (MIT Press, 1998; new version distributed in class).
Developing a Predictive Approach to Knowledge, by Adam White (Ph.D. Thesis, University of Alberta, 2015).
Algorithms for Reinforcement Learning, by Csaba Szepesvári (Morgan & Claypool, 2010).
Robot Operating System documentation (http://wiki.ros.org/)

Course assignments and projects:

Written Exercises: Students will be expected to prepare a brief literature survey each week on the selected readings assigned in class.

Programming and Robotics Modules: In parallel with their review of the literature, students will complete and report on a series of programming projects involving the robotic deployment of reinforcement learning techniques they have studied in class.

Proposal, Presentation, and Peer Critiques: At the outset of their final project, students will prepare a project proposal and present it to the class. Following these presentations, students will prepare and submit constructive peer critiques to help each other succeed in their final project work.

Final Project and Report: During the last month of the course, students will undertake an experimental project of their own design using their robot platform. This project will enable students to explore in detail one of the topics covered in class, and provide them with an opportunity to showcase their empirical machine learning and technical communication skills by way of a final term paper.

Grading will be based on the following components (with relative weighting):

20% Written literature surveys
20% Programming and robotics modules
10% Project proposal
5% In-class presentation
5% Written peer critiques of other presentations
40% Final written project report and project demonstration

Academic Integrity: The University of Alberta is committed to the highest standards of academic integrity and honesty. Students are expected to be familiar with these standards regarding academic honesty and to uphold the policies of the University in this respect. Students are particularly urged to familiarize themselves with the provisions of the Code of Student Behaviour and avoid any behaviour which could potentially result in suspicions of cheating, plagiarism, misrepresentation of facts and/or participation in an offence. Academic dishonesty is a serious offence and can result in suspension or expulsion from the University.