Efficient learning rate adaptation based on hierarchical optimization approach.

Journal: Neural Networks: The Official Journal of the International Neural Network Society

Abstract

This paper proposes a new hierarchical approach to learning rate adaptation in gradient methods, called learning rate optimization (LRO). LRO formulates learning rate adaptation as a hierarchical optimization problem that minimizes the loss function with respect to the learning rate for the current model parameters and gradients. LRO then optimizes the learning rate using the alternating direction method of multipliers (ADMM). Because this learning rate optimization requires neither second-order information nor a probabilistic model, it is highly efficient. Furthermore, LRO introduces no additional hyperparameters compared to the vanilla gradient method with simple exponential learning rate decay. In the experiments, we integrated LRO with vanilla SGD and Adam and compared their optimization performance with state-of-the-art learning rate adaptation methods as well as the most commonly used adaptive gradient methods. SGD and Adam with LRO outperformed all competitors on benchmark datasets in image classification tasks.
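The sketch below is only meant to illustrate the hierarchical formulation described in the abstract: at each outer gradient step, an inner problem chooses the learning rate that minimizes the loss along the current gradient direction. It is not the paper's method; the ADMM-based inner solver is replaced here by a simple golden-section search on a toy quadratic loss, and all names (loss, best_lr, sgd_with_lr_opt) and settings are illustrative assumptions.

```python
# Hedged sketch of "learning rate as an inner optimization problem".
# NOT the paper's ADMM-based LRO: the inner problem is solved with a
# dependency-free golden-section search on a toy quadratic objective.

import numpy as np


def loss(w, A, b):
    """Toy quadratic loss 0.5 * ||A w - b||^2, standing in for a network loss."""
    r = A @ w - b
    return 0.5 * float(r @ r)


def grad(w, A, b):
    """Gradient of the toy loss with respect to w."""
    return A.T @ (A @ w - b)


def best_lr(w, g, A, b, lo=0.0, hi=1.0, iters=30):
    """Inner problem: minimize loss(w - eta * g) over eta in [lo, hi].
    Golden-section search is used here purely as a simple substitute for
    the ADMM update described in the abstract."""
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, c = lo, hi
    for _ in range(iters):
        x1 = c - phi * (c - a)
        x2 = a + phi * (c - a)
        if loss(w - x1 * g, A, b) < loss(w - x2 * g, A, b):
            c = x2
        else:
            a = x1
    return 0.5 * (a + c)


def sgd_with_lr_opt(A, b, steps=50, seed=0):
    """Outer loop: vanilla gradient descent whose step size is chosen by
    solving the inner learning-rate problem at every iteration."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=A.shape[1])
    for t in range(steps):
        g = grad(w, A, b)
        eta = best_lr(w, g, A, b)
        w = w - eta * g
        if t % 10 == 0:
            print(f"step {t:3d}  lr={eta:.4f}  loss={loss(w, A, b):.6f}")
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    A = rng.normal(size=(100, 20))
    b = rng.normal(size=100)
    sgd_with_lr_opt(A, b)
```

Note that, as in the abstract's description, the inner search adds no tunable hyperparameters beyond the bracketing interval, whereas the actual efficiency claims of LRO rest on the ADMM formulation not reproduced here.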

Authors

  • Gyoung S Na
    Chemical Data-Driven Research Center, Korea Research Institute of Chemical Technology (KRICT), Daejeon 34114, Korea.