Magnitude and angle dynamics in training single ReLU neurons.

Journal: Neural networks : the official journal of the International Neural Network Society

Abstract

Understanding the training dynamics of deep ReLU networks is a significant area of interest in deep learning. However, the weight vector dynamics are not yet fully understood, even for single ReLU neurons. To bridge this gap, our study analyzes the training dynamics of the gradient flow w(t) for single ReLU neurons under the square loss, decomposing it into its magnitude ‖w(t)‖ and angle φ(t) components. Through this decomposition, we establish upper and lower bounds on these components to elucidate the convergence dynamics. Furthermore, we empirically show that our findings extend to general two-layer multi-neuron networks. All theoretical results are generalized to the gradient descent method and rigorously verified through experiments.
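The magnitude–angle decomposition described in the abstract can be illustrated with a minimal numerical sketch. The setup below is a hypothetical teacher–student instance (not the paper's exact experiments): a single ReLU neuron f(x) = ReLU(w·x) is trained by gradient descent on the square loss toward labels from a teacher direction `w_star`, and the final weights are split into their magnitude ‖w‖ and angle φ to the teacher.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, steps = 5, 200, 0.1, 500

# Hypothetical teacher-student setup: labels come from a unit-norm teacher.
w_star = np.zeros(d)
w_star[0] = 1.0
X = rng.standard_normal((n, d))
y = np.maximum(X @ w_star, 0.0)

# Small random initialization of the student neuron.
w = 0.1 * rng.standard_normal(d)
for t in range(steps):
    pre = X @ w
    pred = np.maximum(pre, 0.0)
    # Gradient of (1/2n) * sum_i (ReLU(w.x_i) - y_i)^2 with respect to w;
    # (pre > 0) is the ReLU derivative on the active samples.
    grad = ((pred - y) * (pre > 0)) @ X / n
    w -= lr * grad

# Decompose the trained weight vector into magnitude and angle components.
mag = np.linalg.norm(w)
cos_phi = w @ w_star / (mag * np.linalg.norm(w_star))
phi = np.arccos(np.clip(cos_phi, -1.0, 1.0))
print(f"magnitude = {mag:.3f}, angle = {phi:.3f} rad")
```

In this realizable setting one expects ‖w(t)‖ to approach ‖w_star‖ = 1 and φ(t) to shrink toward zero, which is the kind of coupled magnitude/angle convergence the paper bounds analytically.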

Authors

  • Sangmin Lee
    Department of Electronic Engineering, Inha University, Incheon 22212, Korea.
  • Byeongsu Sim
    Department of Mathematical Sciences, KAIST, Daejeon, Republic of Korea.
  • Jong Chul Ye