A dataset for recognition of Arabic accents from spoken L2 English speech (ArL2Eng).

Journal: Scientific Data
Published Date:

Abstract

This paper introduces the ArL2Eng dataset, a speech corpus of L2 English produced by native speakers of Arabic, and highlights its potential to support research into automated language assessment. ArL2Eng comprises audio sequences of speakers from various Arabic backgrounds uttering English sentences. The samples are labelled by native Arabic speakers, which facilitates research in accent recognition and speech processing applications. A large part of the spoken samples in ArL2Eng (471 out of 640 records) is annotated with fluency metrics by human expert raters. Features extracted from the dataset, such as Mel Frequency Cepstral Coefficients, are used for phonetic and acoustic analysis. The dataset is used to predict English fluency among speakers and learners with Arabic accents, using advanced deep learning techniques supported by dimensionality reduction. ArL2Eng is designed to support a range of applications, from multilingual speech recognition and accent classification to speaker identification. It provides a unique resource for both educators and researchers to design scalable and objective fluency evaluation models. The dataset is publicly available to encourage further research in this field.

Authors

  • Manssour Habbash
    Applied College, University of Tabuk, Tabuk, 47512, Saudi Arabia. m_habbash@ut.edu.sa.
  • Sami Mnasri
    Applied College, University of Tabuk, Tabuk, 47512, Saudi Arabia. smnasri@ut.edu.sa.
  • Mansoor Alghamdi
    Department of Computer Science, Applied College, University of Tabuk, Tabuk, Saudi Arabia.
  • Malek Alrashidi
    Computer Science Department, Applied College, University of Tabuk, Tabuk, Saudi Arabia.
  • Ahmad S Tarawneh
    Department of Information Technology, Mutah University, Al-Karak, Jordan.
  • Abdullah Gumair
    College of Computer Science, University of Tabuk, Tabuk, 47512, Saudi Arabia.
  • Ahmad B Hassanat
    Department of Information Technology, Mutah University, Al-Karak, Jordan.