|
|
|
Help | Seminars List | Add Seminar | Edit Seminars | Tips for organisers | RSS | ics Calendar | Search | Send comments about this website to seminar-master@cecs.anu.edu.au
Contact: Michelle.Moravec@anu.edu.au NICTA SEMINAR
A Stochastic Memoizer for Sequence DataDr Frank Wood (Gatsby Unit, University College London )DATE: 2009-08-17 TIME: 11:00:00 - 12:00:00 LOCATION: NICTA - 7 London Circuit ABSTRACT: Note: Free Pizza Seminar. All welcome. I will present a model for discrete sequence data called the sequence memoizer (SM). The sequence memoizer is a smoothing Markov model of unbounded order (think n-gram as n->infinity) that empirically has been shown to have the same computational complexity as a fifth order smoothing Markov model. The SM can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes well. The model builds on a specific parameterization of an unbounded-depth hierarchical Pitman-Yor process. I will introduce analytic marginalization steps (using coagulation operators) to reduce this model to one that can be represented in time and space linear in the length of the training sequence. I show how to perform inference in such a model without truncation approximation and introduce fragmentation operators necessary to do predictive inference. I will demonstrate the SM as a language model, achieving state-of-the-art results.
|