Nonparametric General Reinforcement Learning (public PhD "defense")

Reinforcement learning problems are often phrased in terms of
Markov decision processes (MDPs). In this talk, we go beyond MDPs and
consider reinforcement learning in environments that are non-Markovian,
non-ergodic and only partially observable. My focus will not be on
practical algorithms, but rather on the fundamental underlying problems.
I introduce the Bayesian agent AIXI, point out some of its problems,
and discuss potential solutions. In particular, I consider the multi-agent setup.

Date & time

11.30am–12.30pm 30 March 2016

Location

Internal speakers

Mr Jan Leike

Contacts
