Workshop: Machine Learning on HPC Systems (MLHPCS)

Machine learning for generic energy models of high performance computing resources

Abstract

Reducing energy consumption is crucial in today's high performance computing facilities, because energy represents a large fraction of operating costs and because of the environmental consequences of its carbon footprint. In addition, the load limitations of the power grid impose further constraints on the available and required power. This has led most supercomputing facilities to enforce energy-aware policies for load scheduling and execution. Energy-aware policies take energy models as input, which usually estimate the energy consumption entailed by a given workload on a specific computing resource. Related works have shown that accurate energy predictions can be computed by considering workload and resource characteristics. However, models trained for specific benchmarks executed on specific resources are usually inaccurate when estimating power consumption for applications and resources that differ from the training scenario. This article presents an approach to studying how well the models produced by learning methods fit workloads and resources for which no traces exist in the training data.

Speaker

Jonathan Muraña, Universidad de la República