Spoken Language Identification Using Deep Learning.

Journal: Computational intelligence and neuroscience
Published Date:

Abstract

Spoken language identification (SLID) is the process of detecting the language of an audio clip from an unknown speaker, regardless of the speaker's gender, age, or manner of speaking. The central challenge is to identify features that distinguish between languages clearly and efficiently. The model converts audio files into spectrogram images and applies a convolutional neural network (CNN) to extract the main features used to predict the language. The main objective is to identify the language from among English, French, Spanish, German, Estonian, Tamil, Mandarin, Turkish, Chinese, Arabic, Hindi, Indonesian, Portuguese, Japanese, Latin, Dutch, Pushto, Romanian, Korean, Russian, Swedish, Thai, and Urdu. An experiment was conducted on audio files from the Kaggle dataset named spoken language identification. These audio files consist of utterances, each with a fixed duration of 10 seconds. The whole dataset is split into training and test sets. Preliminary results give an overall accuracy of 98%. Extensive and accurate testing shows an overall accuracy of 88%.
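
The audio-to-spectrogram step mentioned in the abstract can be illustrated with a minimal sketch in pure Python. It is not the authors' implementation: the function names, the frame length, the hop size, the Hann window, the log scaling, and the synthetic test tone (used in place of a real 10-second utterance) are all assumptions chosen to keep the example small and self-contained.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Log-magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1 values,
    one per non-negative frequency bin. Parameter values are illustrative.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Taper the frame to reduce spectral leakage at its edges.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            # Small offset avoids log(0) for silent bins.
            row.append(math.log(abs(acc) + 1e-10))
        frames.append(row)
    return frames


# Assumption: a synthetic 125 Hz tone sampled at 1 kHz stands in for speech.
sample_rate = 1000
tone_hz = 125
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(sample_rate)]

spec = spectrogram(signal)
# Each bin spans sample_rate / frame_len = 15.625 Hz, so the tone falls in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The resulting two-dimensional array of frames by frequency bins is what would be rendered as an image and passed to a CNN. In practice an optimized FFT routine would replace the naive DFT loop used here.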

Authors

  • Gundeep Singh
    Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India.
  • Sahil Sharma
    Department of Biotechnology, Indian Institute of Technology Roorkee, Roorkee 247667, India.
  • Vijay Kumar
    Computer Science and Engineering Department, National Institute of Technology, Hamirpur, Himachal Pradesh, India.
  • Manjit Kaur
    Computer and Communication Engineering Department, School of Computing and Information Technology, Manipal University Jaipur, Jaipur, India. Manjit.kr@yahoo.com.
  • Mohammed Baz
    Department of Computer Engineering, College of Computer and Information Technology, Taif University, P.O. Box 11099, Taif 21994, Saudi Arabia.
  • Mehedi Masud
    Department of Computer Science, Taif University, Taif 21944, Saudi Arabia.