Learning policy scheduling for text augmentation.

Journal: Neural Networks: The Official Journal of the International Neural Network Society
Published Date:

Abstract

When training deep learning models, data augmentation is an important technique for improving performance and alleviating overfitting. In natural language processing (NLP), existing augmentation methods often use fixed strategies. However, it may be preferable to use different augmentation policies at different stages of training, and different datasets may require different augmentation policies. In this paper, we take dynamic policy scheduling into consideration. We design a search space over augmentation policies by integrating several common augmentation operations. Then, we adopt a population-based training method to search for the best augmentation schedule. We conduct extensive experiments on five text classification tasks and two machine translation tasks. The results show that the optimized dynamic augmentation schedules achieve significant improvements over previous methods.
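To make the idea of searching for an augmentation schedule with population-based training more concrete, the following is a minimal sketch and not the authors' implementation. The operation names in `OPS`, the policy representation (one application probability per operation), and the `toy_score` function are all assumptions introduced for illustration; the abstract does not specify them. In a real setting, each population member would be a model in training, the score would be validation performance, and the exploit step would also copy model weights.

```python
import random
from copy import deepcopy

# Hypothetical operation names; the abstract only mentions
# "several common augmentation operations".
OPS = ["synonym_replace", "random_swap", "random_delete"]


def random_policy(rng):
    # Assumed policy form: one application probability in [0, 1] per operation.
    return {op: rng.random() for op in OPS}


def perturb(policy, rng, scale=0.2):
    # Explore step: jitter each probability and clip it back to [0, 1].
    return {
        op: min(1.0, max(0.0, p + rng.uniform(-scale, scale)))
        for op, p in policy.items()
    }


def toy_score(policy, epoch, num_epochs):
    # Stand-in for validation performance (an assumption, not a real model).
    # It rewards strong augmentation early and weak augmentation late,
    # so that the best policy changes over the course of training.
    target = 1.0 - epoch / max(1, num_epochs - 1)
    return -sum((p - target) ** 2 for p in policy.values())


def pbt_search(population_size=8, num_epochs=5, seed=0):
    rng = random.Random(seed)
    population = [random_policy(rng) for _ in range(population_size)]
    schedule = []  # best policy found at each epoch
    for epoch in range(num_epochs):
        ranked = sorted(
            population,
            key=lambda p: toy_score(p, epoch, num_epochs),
            reverse=True,
        )
        schedule.append(deepcopy(ranked[0]))
        # Exploit: drop the worst quarter and replace it with
        # perturbed copies (explore) of policies from the best quarter.
        cut = max(1, population_size // 4)
        top = ranked[:cut]
        survivors = ranked[: population_size - cut]
        clones = [perturb(rng.choice(top), rng) for _ in range(cut)]
        population = survivors + clones
    return schedule


if __name__ == "__main__":
    for epoch, policy in enumerate(pbt_search()):
        print(epoch, {op: round(p, 2) for op, p in policy.items()})
```

The returned list is a schedule in the sense used above: one policy per training stage rather than a single fixed policy for the whole run.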

Authors

  • Shuokai Li
    Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Institute of Intelligent Computing Technology, Suzhou, China. Electronic address: lishuokai18z@ict.ac.cn.
  • Xiang Ao
    Techlex Food Co., Ltd., Chengdu, China.
  • Feiyang Pan
    Central Clinical School, The University of Sydney, Sydney, NSW, 2050, Australia.
  • Qing He