Artificial intelligence enabled smart mask for speech recognition for future hearing devices.

Journal: Scientific reports
PMID:

Abstract

In recent years, lip-reading has emerged as a significant research challenge. The aim is to recognise speech by analysing lip movements. Most lip-reading technologies are based on cameras or wearable devices. However, these technologies have well-known limitations: occlusion and sensitivity to ambient lighting, privacy concerns, and, for wearable devices, discomfort for subjects and disruption of their daily routines. Furthermore, in the era of coronavirus (COVID-19), where face masks are the norm, vision-based and wearable-based technologies for hearing aids are ineffective. To address the fundamental limitations of camera-based and wearable-based systems, this paper proposes a Radio Frequency Identification (RFID)-based smart mask for a lip-reading framework capable of reading lips under face masks, enabling effective speech recognition and fostering conversational accessibility for individuals with hearing impairment. The system uses RFID technology to make Radio Frequency (RF) sensing-based lip-reading possible. A smart RFID face mask is used to collect a dataset containing three classes: vowels (A, E, I, O, U), consonants (F, G, M, S), and words (Fish, Goat, Meal, Moon, Snake). The collected data are fed into well-known machine-learning models for classification. High classification accuracy is achieved on the individual classes and on the combined dataset. On the combined RFID dataset, the Random Forest model achieves a classification accuracy of 80%.
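
The classification step described in the abstract can be illustrated with a minimal, standard-library-only random forest (bootstrap sampling, random feature subsets at each split, majority vote). This is a sketch under stated assumptions, not the authors' pipeline: the feature values are synthetic stand-ins for RFID readings, treating the combined dataset as all 14 labels is an assumption, and the function names and hyperparameters are hypothetical.

```python
import random
from collections import Counter

# Labels named in the abstract. Treating the "combined dataset" as all 14
# labels together is an assumption made for this sketch.
LABELS = ["A", "E", "I", "O", "U", "F", "G", "M", "S",
          "Fish", "Goat", "Meal", "Moon", "Snake"]

def make_synthetic(n_per_class, rng):
    """Synthetic stand-in data: one noisy cluster per label (not RFID data)."""
    X, y = [], []
    for k, label in enumerate(LABELS):
        for _ in range(n_per_class):
            X.append([((k * m) % 14) * 2.0 + rng.gauss(0, 0.3)
                      for m in (1, 3, 5, 9)])
            y.append(label)
    return X, y

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return (impurity, feature, threshold) with the lowest weighted Gini."""
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, depth, max_depth, n_try, rng):
    """Leaves are label strings; internal nodes are (feature, threshold, left, right)."""
    if depth == max_depth or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]
    split = best_split(X, y, rng.sample(range(len(X[0])), n_try))
    if split is None:
        return Counter(y).most_common(1)[0][0]
    _, f, t = split
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li],
                       depth + 1, max_depth, n_try, rng),
            build_tree([X[i] for i in ri], [y[i] for i in ri],
                       depth + 1, max_depth, n_try, rng))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=15, max_depth=20, n_try=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: sample training rows with replacement for each tree.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 0, max_depth, n_try, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over the individual tree predictions."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

rng = random.Random(42)
X, y = make_synthetic(12, rng)
order = list(range(len(X)))
rng.shuffle(order)
cut = int(0.75 * len(order))            # 75/25 train/test split (assumed)
train_idx, test_idx = order[:cut], order[cut:]
forest = fit_forest([X[i] for i in train_idx], [y[i] for i in train_idx])
hits = sum(predict_forest(forest, X[i]) == y[i] for i in test_idx)
accuracy = hits / len(test_idx)
print(f"accuracy on synthetic data: {accuracy:.2f} "
      f"(chance level for {len(LABELS)} labels: {1 / len(LABELS):.2f})")
```

The accuracy printed here reflects only the well-separated synthetic clusters and says nothing about the reported 80% figure. In practice a library implementation such as scikit-learn's `RandomForestClassifier` would normally replace the hand-written forest.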

Authors

  • Hira Hameed
  • Lubna
    James Watt School of Engineering, University of Glasgow, Glasgow, G12 8QQ, UK.
  • Muhammad Usman
    Shaheed Zulfikar Ali Bhutto Institute of Science and Technology, Islamabad, Pakistan.
  • Jalil Ur Rehman Kazim
    James Watt School of Engineering, University of Glasgow, Glasgow, G12 8QQ, UK.
  • Khaled Assaleh
    Department of Electrical and Computer Engineering, College of Engineering and Information Technology, Ajman University, Ajman, UAE.
  • Kamran Arshad
    College of Engineering and IT, Ajman University, Ajman 20550, United Arab Emirates.
  • Amir Hussain
    Cognitive Signal-Image and Control Processing Research Laboratory, School of Natural Sciences, University of Stirling, Stirling, FK9 4LA, United Kingdom.
  • Muhammad Imran
    Institute of Biochemistry and Biotechnology, University of Veterinary and Animal Sciences, 54000 Lahore, Pakistan.
  • Qammer H Abbasi
    James Watt School of Engineering, University of Glasgow, Glasgow, G12 8QQ, UK.