Machine Learning workloads are becoming increasingly prevalent and compute-intensive. They are run on standard multicore processors and accelerators such as GPUs, as well as on custom or semi-custom devices such as Tensor Processing Units and Qualcomm's Snapdragon DSP core.
This project will involve benchmarking and performance analysis of various ML workloads, with an emphasis on Deep Learning, on a selection of processors, including standard x86-64 processors, GPUs and custom devices. The goals are to identify the dominant functions in the workloads and their characteristics (e.g. memory vs compute intensity), and to evaluate how effectively the different classes of architecture process them.
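As an illustration of the memory vs compute intensity distinction, the following minimal Python sketch estimates the arithmetic intensity (floating-point operations per byte moved) of a dense matrix multiply and compares it with a machine's balance point, in the style of a roofline analysis. The function names, the assumption that each matrix passes through memory exactly once, and the peak figures in the usage lines are all hypothetical and chosen only for illustration; they do not describe any particular processor or the method the project will use.

```python
def matmul_arithmetic_intensity(m, n, k, bytes_per_element=4):
    """Estimate FLOPs per byte for C = A @ B with A (m x k) and B (k x n).

    Assumption: A and B are each read once and C is written once,
    i.e. perfect cache reuse. Real kernels usually move more data.
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def classify(intensity, peak_flops, peak_bandwidth):
    """Label a kernel by comparing its intensity with the machine balance.

    peak_flops is in FLOP/s and peak_bandwidth in bytes/s; both are
    values supplied by the caller, not measured here.
    """
    machine_balance = peak_flops / peak_bandwidth  # FLOPs per byte
    return "compute-bound" if intensity >= machine_balance else "memory-bound"


# Hypothetical device: 10 TFLOP/s peak compute, 500 GB/s memory bandwidth.
PEAK_FLOPS = 10e12
PEAK_BANDWIDTH = 500e9

square = matmul_arithmetic_intensity(1024, 1024, 1024)  # matrix-matrix product
matvec = matmul_arithmetic_intensity(1024, 1, 1024)     # matrix-vector product

print(classify(square, PEAK_FLOPS, PEAK_BANDWIDTH))
print(classify(matvec, PEAK_FLOPS, PEAK_BANDWIDTH))
```

Under these assumptions the large matrix-matrix product lands on the compute-bound side and the matrix-vector product on the memory-bound side, which is the kind of contrast the workload characterisation is meant to expose.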
Fluency in the Linux environment and experience working with complex software packages are needed. Knowledge of basic computer organization and program execution is required. Some knowledge of ML techniques and advanced processor architecture is desirable.
Brandon Reagen et al., Deep Learning for Computer Architects, Morgan & Claypool, 2017, http://www.morganclaypoolpublishers.com/catalog_Or...
Deep Learning Benchmark, https://github.com/u39kun/deep-learning-benchmark
AI accelerator, https://en.wikipedia.org/wiki/AI_accelerator
An improved knowledge and working experience of Deep Learning methods, algorithms and software, plus that of contemporary processors. Skills in performance evaluation, including the use of tools, will also be acquired.
Machine Learning, Deep Learning, Accelerators, Performance Evaluation