Analyzing Machine Learning Workloads on Contemporary Processors


Machine learning workloads are becoming increasingly prevalent and compute-intensive. They run on standard multicore processors and accelerators such as GPUs, as well as custom or semi-custom devices such as Google's Tensor Processing Units and Qualcomm's Snapdragon DSP cores.

This project will involve the benchmarking and performance analysis of various machine learning workloads, with an emphasis on deep learning, on a selection of processors, including standard x86-64 processors, GPUs and custom devices. The goals will be to identify the dominant functions in the workloads and their characteristics (e.g. memory versus compute intensity), and to evaluate how effectively the different classes of architecture process them.
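One such characteristic, arithmetic intensity (floating-point operations per byte of data moved), distinguishes compute-bound from memory-bound kernels. A minimal sketch of the idea for a dense matrix multiply follows; the function name and matrix sizes are illustrative, not taken from any specific benchmark in the project:

```python
# Illustrative sketch: arithmetic intensity (FLOPs per byte moved) for a
# dense matrix multiply C = A @ B, with A of shape (m, k) and B of shape
# (k, n), stored in float32. High intensity suggests a compute-bound
# kernel; low intensity suggests a memory-bound one.

def arithmetic_intensity(m: int, k: int, n: int, bytes_per_elem: int = 4) -> float:
    flops = 2 * m * k * n  # one multiply + one add per multiply-accumulate
    # Lower bound on traffic: read A and B once, write C once.
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved

# A large square matmul has high intensity (~171 FLOPs/byte here),
# while a matrix-vector product stays below 1 FLOP/byte.
print(arithmetic_intensity(1024, 1024, 1024))
print(arithmetic_intensity(1, 1024, 1024))
```

This contrast is one reason fully connected and recurrent layers tend to stress memory bandwidth, while large convolutions can keep compute units busy.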


Fluency in the Linux environment and experience working with complex software packages are needed. Knowledge of basic computer organization and program execution is required. Some knowledge of ML techniques and advanced processor architecture is desirable.

Background Literature

Brandon Reagen et al., Deep Learning for Computer Architects, Morgan & Claypool, 2017

Deep Learning Benchmark

AI accelerator


Improved knowledge and working experience of deep learning methods, algorithms and software, plus that of contemporary processors. Skills in the use of performance evaluation techniques and tools will also be acquired.


Machine Learning, Deep Learning, Accelerators, Performance Evaluation

Updated:  10 February 2019/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing