Non-differentiable saddle points and sub-optimal local minima exist for deep ReLU networks.

Journal: Neural Networks: The Official Journal of the International Neural Network Society

Abstract

Whether sub-optimal local minima and saddle points exist in the highly non-convex loss landscape of deep neural networks strongly affects the performance of optimization algorithms. In this paper, we theoretically study the existence of non-differentiable sub-optimal local minima and saddle points for deep ReLU networks of arbitrary depth. We prove that, under reasonable assumptions, non-differentiable saddle points always exist in the loss surface of deep ReLU networks with squared loss or cross-entropy loss. We also prove that deep ReLU networks with cross-entropy loss have non-differentiable sub-optimal local minima if some outermost samples do not belong to a certain class. Experimental results on real and synthetic datasets verify our theoretical findings.
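The abstract's claims rest on the kinks that ReLU introduces into the loss surface: wherever a sample's pre-activation crosses zero, the loss can be non-differentiable in the weights. The following minimal Python sketch (our illustration, not from the paper; the one-neuron model and toy data are assumptions) probes such a non-differentiable point numerically.

```python
# Minimal sketch (not from the paper): the squared loss of a one-neuron ReLU
# model f(x) = max(w * x, 0) is non-differentiable in w wherever w * x = 0
# for some sample x. Here we probe the kink at w = 0 numerically.
import numpy as np

x = np.array([1.0, 2.0])  # toy inputs (assumed)
y = np.array([0.5, 1.0])  # toy targets (assumed)

def loss(w):
    # Mean squared loss of a single-input, single-ReLU model.
    return np.mean((np.maximum(w * x, 0.0) - y) ** 2)

eps = 1e-6
right = (loss(eps) - loss(0.0)) / eps    # one-sided derivative from the right
left = (loss(0.0) - loss(-eps)) / eps    # one-sided derivative from the left
print(f"right derivative ~ {right:.4f}, left derivative ~ {left:.4f}")
# The two one-sided derivatives differ (about -2.5 vs 0.0), so w = 0 is a
# non-differentiable point of the loss surface: for w < 0 the ReLU is
# inactive and the loss is flat, while for w > 0 the loss decreases.
```

This only demonstrates non-differentiability at a single kink; the paper's contribution is proving that such points can be genuine saddle points or sub-optimal local minima for networks of arbitrary depth.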

Authors

  • Bo Liu
    Wuhan United Imaging Healthcare Surgical Technology Co., Ltd., Wuhan, China.
  • Zhaoying Liu
    College of Computer Science, Faculty of Information Technology, Beijing University of Technology, Beijing, China. Electronic address: zhaoying.liu@bjut.edu.cn.
  • Ting Zhang
    Beijing Municipal Key Laboratory of Child Development and Nutriomics, Capital Institute of Pediatrics, Beijing 100020, China.
  • Tongtong Yuan
    College of Computer Science, Faculty of Information Technology, Beijing University of Technology, Beijing, China. Electronic address: yuantt@bjut.edu.cn.