Hyper-parameter optimisation on HPC – a comparative study

Abstract

Hyper-parameter optimization is a crucial task in numerous applications of numerical modelling techniques. Methods as diverse as classical particle simulations, continuum simulations and neural networks require an appropriate choice of their hyper-parameters. While calibration to measured data by numerical optimization has a long tradition for classical simulations, the hyper-parameters of neural networks are often chosen by a mixture of grid search, random search and manual tuning. In this study we aim to create the expert tool “OmniOpt”, which optimizes the hyper-parameters of a wide range of problems, from classical simulations to different kinds of neural networks. The emphasis is on generality and flexibility for the user, both in the applications supported and in the choice of hyper-parameters to be optimized. Moreover, the optimization procedure, which is usually very time-consuming, should run in a highly parallel way on the HPC system Taurus, available at TU Dresden. To this end, a Bayesian stochastic optimization algorithm (tree-structured Parzen estimator) has been implemented on the Taurus system and connected to a user-friendly GUI. Users supply their application as a black box that returns the value of a user-defined quality function, together with an appropriate amount of training data. Most standard programming languages can be used; C, Python and R are among the most popular. Around 40,000 CPUs as well as several hundred GPUs are available on the HPC cluster. In addition to the automatic optimization service, a variety of tools is available for analyzing and graphically displaying the optimization results. The system has been tested on an application from materials science that uses a convolutional neural network. Moreover, OmniOpt has been compared with standard automated hyper-parameter optimization packages such as AutoKeras and AutoML in terms of performance, accuracy and flexibility for the user.
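
To make the idea behind a tree-structured Parzen estimator concrete, the following is a deliberately simplified, one-dimensional sketch in plain Python. It is not OmniOpt's implementation: the function names (`quality`, `kde_pdf`, `tpe_minimize`), the fixed kernel bandwidth, the split fraction `gamma` and the toy quality function are all assumptions chosen for illustration. The sketch splits past trials into a "good" and a "bad" group, models each with a Gaussian Parzen estimator, and evaluates next the candidate that maximizes the density ratio between the two.

```python
import math
import random


def quality(x):
    # Hypothetical black-box quality function (lower is better).
    # It stands in for a user application such as a training run.
    return (x - 2.0) ** 2


def kde_pdf(x, points, bandwidth):
    # Gaussian Parzen (kernel density) estimate at x; 0.0 if no points.
    if not points:
        return 0.0
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
    )


def tpe_minimize(f, low, high, n_trials=60, n_startup=10,
                 gamma=0.25, n_candidates=24, seed=0):
    # Simplified TPE loop for a single continuous hyper-parameter.
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # assumption: fixed kernel width
    history = []                     # list of (x, loss) pairs

    for trial in range(n_trials):
        if trial < n_startup:
            # Start-up phase: plain random search.
            x = rng.uniform(low, high)
        else:
            # Split the trials into the best gamma fraction and the rest.
            ranked = sorted(history, key=lambda h: h[1])
            n_good = max(1, int(gamma * len(ranked)))
            good = [h[0] for h in ranked[:n_good]]
            bad = [h[0] for h in ranked[n_good:]]

            # Draw candidates from the "good" density l(x), clipped to bounds.
            candidates = []
            for _ in range(n_candidates):
                c = rng.gauss(rng.choice(good), bandwidth)
                candidates.append(min(high, max(low, c)))

            # Evaluate next the candidate with the largest ratio l(x) / g(x).
            x = max(
                candidates,
                key=lambda c: kde_pdf(c, good, bandwidth)
                / (kde_pdf(c, bad, bandwidth) + 1e-12),
            )
        history.append((x, f(x)))

    # Return the best (x, loss) pair seen over all trials.
    return min(history, key=lambda h: h[1])


best_x, best_loss = tpe_minimize(quality, -10.0, 10.0)
print(f"best x = {best_x:.3f}, loss = {best_loss:.5f}")
```

Production implementations differ in several respects, for example adaptive bandwidths, prior terms and mixed or conditional search spaces. In a parallel setting such as the one described above, several candidates would typically be evaluated concurrently rather than one at a time.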

Speaker

Peter Winkler, TU Dresden, Germany