Feature Markov Decision Process in Practice

Phuong Nguyen (ANU)


DATE: 2011-05-04
TIME: 12:00:00 - 12:30:00
LOCATION: RSISE Seminar Room, A105 with Pizza

Following a recent surge in the use of history-based methods for resolving perceptual aliasing in reinforcement learning, we introduce an algorithm based on the recently proposed feature reinforcement learning framework. To create a practical algorithm, we devise a stochastic search procedure over a class of context trees, based on parallel tempering and a specialized proposal distribution. In our empirical evaluation we achieve superior performance to the classical U-Tree algorithm and the recent Active-LZ algorithm, and we are competitive with MC-AIXI-CTW, which maintains a Bayesian mixture over all context trees up to a chosen depth. We are encouraged by our ability to compete with this sophisticated method using an algorithm that simply picks one tree and runs Q-learning on the corresponding MDP. We believe this shows promise for solving larger-scale problems.
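The tree-search procedure itself is specific to the feature reinforcement learning setup, but the parallel-tempering component can be illustrated generically. The sketch below is not the talk's implementation: it runs one Metropolis chain per temperature over a toy integer objective (all function and parameter names here are hypothetical, chosen for illustration) and periodically attempts to swap states between chains at adjacent temperatures, which is what lets hot chains explore while cold chains refine.

```python
import math
import random

def parallel_tempering(energy, propose, init, temps, iters, seed=0):
    """Minimize `energy` with one Metropolis chain per temperature in `temps`,
    plus periodic state swaps between adjacent chains (parallel tempering)."""
    rng = random.Random(seed)
    states = [init] * len(temps)
    energies = [energy(init)] * len(temps)
    best, best_e = init, energies[0]
    for _ in range(iters):
        # Metropolis step within each chain at its own temperature T.
        for k, T in enumerate(temps):
            cand = propose(states[k], rng)
            e = energy(cand)
            if e < energies[k] or rng.random() < math.exp((energies[k] - e) / T):
                states[k], energies[k] = cand, e
            if energies[k] < best_e:
                best, best_e = states[k], energies[k]
        # Attempt a swap between a random pair of adjacent temperatures.
        k = rng.randrange(len(temps) - 1)
        d_energy = energies[k] - energies[k + 1]
        d_beta = 1.0 / temps[k] - 1.0 / temps[k + 1]
        if rng.random() < min(1.0, math.exp(d_energy * d_beta)):
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return best, best_e

# Toy objective (hypothetical, for illustration only): minimize |s - 42|
# over the integers 0..63, with a local +/-1 proposal distribution.
def toy_energy(s):
    return abs(s - 42)

def toy_propose(s, rng):
    return min(63, max(0, s + rng.choice((-1, 1))))

best, best_e = parallel_tempering(toy_energy, toy_propose, init=0,
                                  temps=(1.0, 2.0, 4.0), iters=2000)
```

In the talk's setting the "state" of each chain would be a context tree and the energy its cost under the feature RL criterion, with the specialized proposal distribution replacing the toy `toy_propose` above.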


Updated: 3 May 2011