Rethinking cell-based neural architecture search: A theoretical perspective.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

In this paper, we explore several fundamental theoretical issues in cell-based neural architecture search, including whether different architectures in the search space are equally important in terms of the minimal training loss they can achieve, and whether an optimal cell found by architecture search remains optimal when used in a cell stack. We model the directed acyclic graph (DAG) of a cell as a stem path plus multiple bypass edges. For single cells, we prove that the training loss always decreases or is at least maintained if skip connections are added to bypass edges in a greedy order and learnable operations are added to bypass edges in any order. We also prove that for some architectures with special weights, a learnable operation is worse than a skip connection in the sense that, when each is added separately to the same bypass edge, the learnable operation leads to an architecture with higher or equal training loss. Then, when stacking multiple identical cells during architecture evaluation, we prove that an optimal cell structure formed by adding bypass edges greedily yields an optimal cell stack. These theoretical results are verified by experiments on the NAS-Bench-201 dataset. Finally, when additional metrics such as network size are taken into account, we design examples to demonstrate that an optimal cell obtained by architecture search may no longer be optimal in a cell stack, and that allowing non-tied cell structures in a cell stack may produce better results. These theoretical results increase our understanding of the search space and partly justify the cell-based architecture search paradigm.
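
To make the "stem path plus bypass edges" view concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not define the greedy order precisely, so the sketch assumes it means: at each step, add the candidate bypass edge that gives the lowest loss, and stop once every remaining candidate would raise the loss. The names `Cell`, `greedy_add_skips`, `toy_loss`, and the `GAINS` table are hypothetical; `toy_loss` is a made-up stand-in for the minimal training loss, not a measured result.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Edge = Tuple[int, int]


@dataclass(frozen=True)
class Cell:
    """A cell DAG over nodes 0..num_nodes-1: a stem path plus bypass edges."""
    num_nodes: int
    bypasses: FrozenSet[Edge] = frozenset()

    def stem(self) -> List[Edge]:
        # The stem path links consecutive nodes: 0 -> 1 -> ... -> num_nodes-1.
        return [(i, i + 1) for i in range(self.num_nodes - 1)]

    def candidate_bypasses(self) -> List[Edge]:
        # A bypass edge (i, j) skips at least one stem node and is not yet present.
        return [(i, j)
                for i in range(self.num_nodes)
                for j in range(i + 2, self.num_nodes)
                if (i, j) not in self.bypasses]

    def with_bypass(self, edge: Edge) -> "Cell":
        return Cell(self.num_nodes, self.bypasses | {edge})


def greedy_add_skips(cell: Cell,
                     loss_fn: Callable[[Cell], float]) -> Tuple[Cell, List[float]]:
    """Assumed greedy order: repeatedly add the bypass edge with the lowest loss,
    stopping when every remaining candidate would increase the loss."""
    history = [loss_fn(cell)]
    while True:
        options = [cell.with_bypass(e) for e in cell.candidate_bypasses()]
        if not options:
            break
        best = min(options, key=loss_fn)
        best_loss = loss_fn(best)
        if best_loss > history[-1]:
            break  # no candidate maintains or reduces the loss
        cell = best
        history.append(best_loss)
    return cell, history


# Hypothetical per-edge loss reductions, invented purely for illustration.
GAINS = {(0, 2): 3, (0, 3): 1, (1, 3): -2}


def toy_loss(cell: Cell) -> float:
    # Stand-in for "minimal training loss": a base value minus the edge gains.
    return 10 - sum(GAINS[e] for e in cell.bypasses)


final_cell, history = greedy_add_skips(Cell(num_nodes=4), toy_loss)
```

Under this toy loss, `history` is non-increasing by construction of the stopping rule; the paper's contribution is proving that such monotonicity holds for the actual training loss, which this sketch does not attempt to show.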

Authors

  • Bo Liu
    Wuhan United Imaging Healthcare Surgical Technology Co., Ltd., Wuhan, China.
  • Huiwen Zhao
    College of Computer Science, Beijing University of Technology, Beijing, China. Electronic address: ZhaoHW@emails.bjut.edu.cn.
  • Tongtong Yuan
    College of Computer Science, Faculty of Information Technology, Beijing University of Technology, Beijing, China. Electronic address: yuantt@bjut.edu.cn.
  • Ting Zhang
    Beijing Municipal Key Laboratory of Child Development and Nutriomics, Capital Institute of Pediatrics, Beijing 100020, China.
  • Zhaoying Liu
    College of Computer Science, Faculty of Information Technology, Beijing University of Technology, Beijing, China. Electronic address: zhaoying.liu@bjut.edu.cn.