Machine Learning Hyper-parameter Optimisation using Semi-random Sequences

Description

The accuracy and generalisability of a machine learning model depend heavily on the choice of hyper-parameters. All models have hyper-parameters (even non-parametric models), and a poor choice will result in a poor fit to the data and useless predictions. Finding a good set of hyper-parameters is often the most time-consuming part of machine learning, and is usually done by extensive (and sometimes exhaustive) grid searches, or by highly variable random selection. An alternative is to use low-discrepancy sequences: deterministic, quasi-random sequences that are "less random" than a pseudorandom number sequence. These sequences can be more useful for the approximation of models in higher dimensions, and in global optimisation, because they tend to sample space "more uniformly" than random numbers. In this project you will implement some semi-random sequences for hyper-parameter optimisation of machine learning models in Python, test the performance of different sequences, and compare the results (accuracy and computational efficiency) to random and grid search methods. Data sets will be provided.
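As a rough illustration of the idea, the sketch below generates a Halton sequence (one well-known low-discrepancy sequence) and uses it to propose candidate values for two hyper-parameters on a logarithmic scale, alongside a pseudorandom baseline. It is a minimal sketch under stated assumptions, not the project's prescribed method: the function names, the hyper-parameter ranges, and `toy_score` (a made-up stand-in for a real cross-validation score) are all hypothetical.

```python
import math
import random


def van_der_corput(index, base):
    """Return element `index` (1-based) of the van der Corput sequence in `base`."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom  # reflect the base-`base` digits about the radix point
    return value


def halton(n_points, bases=(2, 3)):
    """Return n_points Halton points in the unit hypercube (one prime base per axis)."""
    return [[van_der_corput(i, b) for b in bases] for i in range(1, n_points + 1)]


def scale_log(u, low, high):
    """Map u in [0, 1) onto [low, high) on a logarithmic scale."""
    lo, hi = math.log10(low), math.log10(high)
    return 10 ** (lo + u * (hi - lo))


def toy_score(c, gamma):
    """Hypothetical objective standing in for a cross-validated model score."""
    return -((math.log10(c) - 1.0) ** 2 + (math.log10(gamma) + 2.0) ** 2)


def search(unit_points):
    """Scale unit-square points to the assumed ranges and return the best pair."""
    candidates = [
        (scale_log(u, 1e-2, 1e3), scale_log(v, 1e-4, 1e1)) for u, v in unit_points
    ]
    return max(candidates, key=lambda pair: toy_score(*pair))


n_trials = 32
halton_points = halton(n_trials)

rng = random.Random(0)  # fixed seed so the pseudorandom baseline is repeatable
random_points = [[rng.random(), rng.random()] for _ in range(n_trials)]

best_halton = search(halton_points)
best_random = search(random_points)
print("Halton best (C, gamma):", best_halton)
print("Random best (C, gamma):", best_random)
```

In a real experiment, `toy_score` would be replaced by a model's validation score on the provided data sets, and the comparison would be repeated over several budgets, seeds, and sequence types before drawing any conclusion.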

Goals

To produce a Python module for a library intended for general use by machine learning researchers, and a scientific publication.

Requirements

Python programming skills and an interest in data science and machine learning are essential.

Please double-check my email address when enquiring, to ensure your message reaches the right professor.

Keywords

machine learning, optimisation, data science, python

Updated: 1 June 2019 / Responsible Officer: Dean, CECS / Page Contact: CECS Marketing