MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning.

Journal: Computational intelligence and neuroscience
Published Date:

Abstract

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow to compute, especially when the data set is large. Big data has recently gained momentum in both industry and academia. To realize the potential of ANNs for big data applications, their computation must be sped up. To this end, this paper parallelizes neural networks using MapReduce, which has become a major computing model for data-intensive applications. The parallelization addresses three data-intensive scenarios: a large volume of classification data, a large training data set, and a large number of neurons in the network. The parallelized neural networks are evaluated on an experimental MapReduce computer cluster in terms of classification accuracy and computational efficiency.

Authors

  • Yang Liu
    Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Jie Yang
    Key Laboratory of Development and Maternal and Child Diseases of Sichuan Province, Department of Pediatrics, Sichuan University, Chengdu, China.
  • Yuan Huang
    School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China.
  • Lixiong Xu
    School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China.
  • Siguang Li
    The Key Laboratory of Embedded Systems and Service Computing, Tongji University, Shanghai 200092, China.
  • Man Qi
    Department of Computing, Canterbury Christ Church University, Canterbury, Kent CT1 1QU, UK.