Nonparametric General Reinforcement Learning (public PhD "defense")

Reinforcement learning problems are often phrased in terms of
Markov decision processes (MDPs). In this talk, we go beyond MDPs and
consider reinforcement learning in environments that are non-Markovian,
non-ergodic and only partially observable. My focus will not be on
practical algorithms, but rather on the fundamental underlying problems.
I introduce the Bayesian agent AIXI, point out some of its problems,
and discuss potential solutions. In particular, I consider the multi-agent setup.

Date & time

11.30am–12.30pm 30 Mar 2016


Internal speakers

Mr Jan Leike

