Architecture-Aware Bayesian Optimization for Neural Network Tuning
Paper in proceedings, 2019

Hyperparameter optimization of a neural network is a nontrivial task: evaluating a hyperparameter setting is time-consuming, no analytical expression for the impact of the hyperparameters is available, and the evaluations are noisy in the sense that the result depends on the training process and the weight initialization. Bayesian optimization is a powerful tool for handling these problems. However, hyperparameter optimization of neural networks poses additional challenges, since the hyperparameters can be integer-valued, categorical, and/or conditional, whereas Bayesian optimization often assumes the variables to be real-valued. In this paper we present an architecture-aware transformation of neural networks, applied in the kernel of a Gaussian process, to boost the performance of hyperparameter optimization. The empirical experiment in this paper demonstrates that introducing an architecture-aware transformation in the kernel gives the Bayesian optimizer a clear improvement over a naive implementation, with results comparable to other state-of-the-art methods.
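The abstract does not spell out the transformation itself; as a rough, hypothetical sketch of the general idea, the snippet below encodes a mixed-type, conditional hyperparameter configuration (learning rate, activation, number of layers, per-layer widths) into a fixed-length real vector and feeds it to a standard RBF kernel, as one would inside a Gaussian process surrogate. The encoding choices (log-scaled learning rate, one-hot activation, zeroed widths for inactive layers) and all names are illustrative assumptions, not the transformation proposed in the paper.

```python
import numpy as np

def encode_config(cfg, max_layers=3):
    """Map a mixed-type hyperparameter configuration to a real vector.

    cfg is assumed to look like
        {"lr": 1e-3, "activation": "relu", "n_layers": 2, "units": [64, 32]}
    The encoding (log learning rate, one-hot activation, zeroed entries
    for inactive layers) is an illustrative choice only.
    """
    x = [np.log10(cfg["lr"])]                      # continuous, log scale
    for act in ("relu", "tanh", "sigmoid"):        # categorical -> one-hot
        x.append(1.0 if cfg["activation"] == act else 0.0)
    x.append(cfg["n_layers"] / max_layers)         # integer, rescaled to [0, 1]
    for i in range(max_layers):                    # conditional layer widths
        active = i < cfg["n_layers"]
        x.append(np.log2(cfg["units"][i]) if active else 0.0)
    return np.array(x)

def rbf_kernel(X, Y, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel evaluated on the transformed inputs."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

# Example: kernel matrix over two encoded configurations.
configs = [
    {"lr": 1e-3, "activation": "relu", "n_layers": 2, "units": [64, 32]},
    {"lr": 1e-2, "activation": "tanh", "n_layers": 1, "units": [128]},
]
X = np.stack([encode_config(c) for c in configs])
print(rbf_kernel(X, X))
```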

Hyperparameter optimization

Transformation

Neural networks

Gaussian process

Authors

Anders Sjöberg

Fraunhofer-Chalmers Research Centre for Industrial Mathematics

Magnus Önnheim

Chalmers, Mathematical Sciences, Algebra and Geometry

Emil Gustavsson

Chalmers, Mathematical Sciences

Mats Jirstrand

Chalmers, Electrical Engineering, Systems and Control

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN), 1611-3349 (eISSN)

Vol. 11728, pp. 220-231
978-3-030-30484-3 (ISBN)

28th International Conference on Artificial Neural Networks (ICANN)
Munich, Germany

Subject categories

Computer Engineering

Computer Science

Computer Systems

DOI

10.1007/978-3-030-30484-3_19

More information

Last updated

2024-08-28