Lossless data compression allows exact recovery of the original data from a compressed file [CT06, Chp. 5]. Its most important uses are saving storage space and channel bandwidth, and serving as a component in lossy compressors. Continuous theoretical advances and increases in computational power result in a steady improvement of compression algorithms [Mah12, Hut06]. Improvements of a few percent translate into million-dollar savings in mobile and broadband networks. Many compressors use arithmetic coding based on a statistical model of the data, which is itself learnt from the data. A popular such method is Context Tree Weighting (CTW) [WST95, WST97]. Extensions thereof are among the best existing compressors, and they come with strong theoretical guarantees [OHSS12, VSH12, VNHB12, VWBG13]. The learnt statistical models are actually ideal sequential predictors and can also be used in intelligent agents [VNH+11].
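To give a concrete flavour of the method (purely illustrative, not part of the project specification), the following Python sketch computes the probability that a depth-bounded CTW model assigns to a binary string. Each tree node holds a Krichevsky–Trofimov (KT) estimator, and internal nodes mix the KT estimate with the product of their children's weighted probabilities, as in [WST95]. The zero-padding of initial contexts and all identifiers are simplifying assumptions of this sketch.

```python
import math

class CTWNode:
    """Node of a binary context tree: KT counts plus two children."""
    def __init__(self):
        self.counts = [0, 0]          # number of 0s and 1s seen in this context
        self.children = [None, None]  # subtrees for context bits 0 and 1
        self.log_kt = 0.0             # log of the KT estimator's sequence probability
        self.log_w = 0.0              # log of the weighted (mixture) probability

def update(node, bit, context, depth):
    """Observe `bit` in `context` (most recent bit first) and update log-probs."""
    # KT estimator: P(bit | a zeros, b ones) = (count[bit] + 1/2) / (a + b + 1)
    node.log_kt += math.log((node.counts[bit] + 0.5) / (sum(node.counts) + 1.0))
    node.counts[bit] += 1
    if depth == 0:                    # leaf: weighted prob equals KT prob
        node.log_w = node.log_kt
        return
    c = context[0]
    if node.children[c] is None:
        node.children[c] = CTWNode()
    update(node.children[c], bit, context[1:], depth - 1)
    # internal node: P_w = 1/2 * P_kt + 1/2 * P_w(child0) * P_w(child1),
    # computed in log space with the usual max trick for numerical stability
    log_children = sum(ch.log_w for ch in node.children if ch is not None)
    a, b = node.log_kt, log_children
    m = max(a, b)
    node.log_w = m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m))

def ctw_log_prob(bits, depth=3):
    """Log-probability a depth-`depth` CTW model assigns to the bit sequence."""
    root = CTWNode()
    bits = list(bits)
    for t, bit in enumerate(bits):
        context = bits[max(0, t - depth):t][::-1]
        context += [0] * (depth - len(context))  # pad the initial context with zeros
        update(root, bit, context, depth)
    return root.log_w
```

With the zero-padded initial context this defines a proper probability measure: the probabilities of all binary strings of a given length sum to one, which is what makes it directly usable as the model driving an arithmetic coder.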
- improve or combine existing, or develop new, extensions of CTW (ECTW)
- prove performance bounds for ECTW compression
- experimentally test ECTW and verify the bounds
- beat state-of-the-art compressors
- excellent general math skills
- mastery of elementary probability calculus and statistics
- ideally a background in information theory
- reliable handling of long formulas
- creativity in finding and constructing proofs
- basic programming skills
- ideally experience with implementing numerical algorithms
- [CT06] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley-Interscience, 2nd edition, 2006.
- [Hut06] M. Hutter. Human knowledge compression prize, 2006. Open ended; http://prize.hutter1.net/.
- [Kul11] M. O. Kulekci. Compressed context modeling for text compression. In Data Compression Conference (DCC 2011), pages 373–382, Snowbird, UT, USA, March 2011. IEEE Computer Society.
- [Mah12] M. Mahoney. Data Compression Explained. Dell, Inc., http://mattmahoney.net/dc/dce.html, 2012.
- [OHSS12] Alexander O’Neill, Marcus Hutter, Wen Shao, and Peter Sunehag. Adaptive context tree weighting. In Proc. Data Compression Conference (DCC’12), pages 317–326, Snowbird, Utah, 2012. IEEE Computer Society.
- [Hut13] M. Hutter. Sparse adaptive Dirichlet-multinomial-like processes. Journal of Machine Learning Research, W&CP: COLT, 30:432–459, 2013.
- [VNH+11] J. Veness, K. S. Ng, M. Hutter, W. Uther, and D. Silver. A Monte Carlo AIXI approximation. Journal of Artificial Intelligence Research, 40:95–142, 2011.
- [VNHB12] J. Veness, K. S. Ng, M. Hutter, and M. Bowling. Context tree switching. In Proc. Data Compression Conference (DCC'12), pages 327–336, Snowbird, Utah, 2012. IEEE Computer Society.
- [VSH12] J. Veness, P. Sunehag, and M. Hutter. On ensemble techniques for AIXI approximation. In Proc. 5th Conf. on Artificial General Intelligence (AGI’12), volume 7716, pages 341–351, Springer, Berlin, 2012.
- [VWBG13] J. Veness et al. Partition tree weighting. In Proc. Data Compression Conference (DCC'13), pages 321–330, Snowbird, Utah, USA, 2013. IEEE Computer Society.
- [WST95] F. M. J. Willems, Y. M. Shtarkov, and T. J. Tjalkens. The context tree weighting method: Basic properties. IEEE Transactions on Information Theory, 41:653–664, 1995.
- [WST97] F. M. J. Willems, Y. M. Shtarkov, and T. J. Tjalkens. Reflections on the prize paper: The context-tree weighting method: Basic properties. IEEE Information Theory Society Newsletter, pages 20–27, 1997.
- getting a deep understanding of state-of-the-art compression algorithms
- acquiring experience in proving theorems
- writing a small workshop or conference paper
- improving your math skills
- ideal preparation for the Human Knowledge Compression Contest [Hut06]
compression; information; arithmetic coding; context tree; probability; statistics; theory; bounds; prediction; model.