Removing Unwanted Variation in Machine Learning for Personalized Medicine

Terry Speed
Machine Learning for Personalized Medicine will inevitably build on large omics datasets. These are often collected over months or years, and sometimes involve multiple labs. Unwanted variation (UV) can arise from technical elements such as batches, different platforms or laboratories, or from biological signals such as heterogeneity in age, ethnicity or cellular composition, which are unrelated to the factor of interest in the study. Similar issues arise when the goal is to combine several smaller studies. A very important task is to remove these UV factors without losing the factors of interest. Some years ago we proposed a general framework (called RUV) for removing UV in microarray data using negative control genes. It showed very good behavior for differential expression analysis (i.e., with a known factor of interest) when applied to several datasets. Our objective in this talk is to describe our recent results in extending this approach to a machine learning context, specifically when carrying out classification.


Terry Speed completed a BSc (Hons) in mathematics and statistics at the University of Melbourne and a PhD in mathematics and Dip Ed at Monash University. He has held appointments at the University of Sheffield, U.K., the University of Western Australia in Perth, and the University of California at Berkeley, and with the CSIRO in Canberra. In 1997 he took up an appointment with the Walter & Eliza Hall Institute of Medical Research, where he is now an Honorary Fellow and lab head in the Bioinformatics Division. His research interests lie in the application of statistics and bioinformatics to genetics and genomics, and related fields such as proteomics, metabolomics and epigenomics, with a focus on cancer and epigenetics.

Date & time

5.30–7pm 6 Jun 2016


Room: Engineering Design Studio


Professor Terry Speed


Updated:  1 June 2019/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing