Architecture-Aware Bayesian Optimization for Neural Network Tuning
Paper in proceedings, 2019

Hyperparameter optimization of a neural network is a nontrivial task. Evaluating a hyperparameter setting is time-consuming, no analytical expression for the impact of the hyperparameters is available, and the evaluations are noisy in the sense that the result depends on the training process and the weight initialization. Bayesian optimization is a powerful tool for handling these problems. However, hyperparameter optimization of neural networks poses additional challenges, since the hyperparameters can be integer-valued, categorical, and/or conditional, whereas Bayesian optimization often assumes variables to be real-valued. In this paper we present an architecture-aware transformation of neural network hyperparameters, applied in the kernel of a Gaussian process, to boost the performance of hyperparameter optimization. The empirical experiments in this paper demonstrate that introducing an architecture-aware transformation in the kernel yields a clear improvement over a naive implementation and that the results are comparable to other state-of-the-art methods.
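As a hypothetical illustration of the general idea (not the paper's exact transformation), the sketch below maps a raw hyperparameter vector, assumed here to encode the number of layers followed by per-layer widths, to an architecture-aware representation in which inactive (conditional) dimensions are neutralized before a standard squared-exponential kernel is applied. The function names transform and rbf_kernel, the encoding, and the choice of neutral value are assumptions made for illustration only.

import numpy as np

def transform(x, max_layers=3):
    # x is assumed to encode [n_layers, units_1, ..., units_max_layers].
    # Widths of inactive layers (index >= n_layers) are replaced by a fixed
    # value, so configurations differing only in inactive layers coincide.
    n_layers = int(round(x[0]))                   # integer-valued hyperparameter
    units = np.array(x[1:1 + max_layers], dtype=float)
    units[n_layers:] = 0.0                        # neutralize conditional dimensions
    return np.concatenate(([float(n_layers)], units))

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel evaluated on the transformed inputs.
    z1, z2 = transform(x1), transform(x2)
    return variance * np.exp(-0.5 * np.sum((z1 - z2) ** 2) / lengthscale ** 2)

# Two settings differing only in an inactive layer's width become identical
# to the Gaussian process after the transformation:
a = np.array([2, 64, 32, 128])    # 2 active layers; the third width is inactive
b = np.array([2, 64, 32, 512])
print(rbf_kernel(a, b))           # prints 1.0

In a full Bayesian optimization loop, a kernel of this kind would replace the standard one in the Gaussian process surrogate, so that the acquisition function never distinguishes between architecturally equivalent configurations.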

Keywords

Hyperparameter optimization

Transformation

Neural networks

Gaussian process

Authors

Anders Sjöberg

Fraunhofer-Chalmers Centre

Magnus Önnheim

Chalmers, Mathematical Sciences, Algebra and geometry

Emil Gustavsson

Chalmers, Mathematical Sciences

Mats Jirstrand

Chalmers, Electrical Engineering, Systems and control

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN), 1611-3349 (eISSN)

Vol. 11728, pp. 220-231
978-3-030-30484-3 (ISBN)

28th International Conference on Artificial Neural Networks (ICANN)
Munich, Germany

Subject Categories

Computer Engineering

Computer Science

Computer Systems

DOI

10.1007/978-3-030-30484-3_19

Latest update

8/28/2024