Feature Markov Decision Process in Practice
Phuong Nguyen (ANU)
CS HDR MONITORING AI Group
TIME: 12:00:00 - 12:30:00
LOCATION: RSISE Seminar Room, A105 with Pizza
Following a recent surge in the use of history-based methods for resolving perceptual aliasing in reinforcement learning, we introduce an algorithm based on the recently proposed feature reinforcement learning framework. To create a practical algorithm, we devise a stochastic search procedure for a class of context trees that is based on parallel tempering and a specialized proposal distribution. In our empirical evaluation we achieve superior performance to the classical U-tree algorithm and the recent active-LZ algorithm, and we are competitive with MC-AIXI-CTW, which maintains a Bayesian mixture over all context trees up to a chosen depth. We are encouraged by our ability to compete with this sophisticated method using an algorithm that simply picks one tree and uses Q-learning on the corresponding MDP. We believe this shows promise for solving larger-scale problems.
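To make the last idea concrete, here is a minimal Python sketch of Q-learning run on states derived from the history. It is not the algorithm presented in the talk: a fixed-depth suffix of the history stands in for a selected context tree, there is no parallel tempering search, and the toy task (reward 1 whenever the action differs from the previous one, with a single uninformative observation) is invented for illustration. All names (suffix_state, q_update, run) are hypothetical.

```python
import random
from collections import defaultdict


def suffix_state(history, depth):
    """Map a history of (action, observation) pairs to its last `depth` entries.

    Assumption: this fixed-depth suffix is a stand-in for the state that a
    chosen context tree would assign to the history.
    """
    return tuple(history[-depth:]) if depth > 0 else ()


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def run(depth=1, steps=5000, epsilon=0.1, seed=0):
    """Train on a toy aliased task: the observation is always 0 and the
    reward is 1 only when the action differs from the previous action."""
    rng = random.Random(seed)
    actions = (0, 1)
    q = defaultdict(float)
    history = []
    state = suffix_state(history, depth)
    for _ in range(steps):
        # Epsilon-greedy action selection on the history-derived state.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        prev_action = history[-1][0] if history else None
        reward = 1.0 if prev_action is not None and action != prev_action else 0.0
        history.append((action, 0))  # the observation carries no information
        next_state = suffix_state(history, depth)
        q_update(q, state, action, reward, next_state, actions)
        state = next_state
    return q


if __name__ == "__main__":
    q = run()
    for key in sorted(q, key=repr):
        print(key, round(q[key], 2))
```

The observation alone cannot distinguish situations in this task, but a depth-1 suffix that includes the previous action is enough for Q-learning to learn to alternate. Choosing which parts of the history to keep is what the tree search in the abstract addresses; the sketch simply fixes that choice by hand.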