Enhancing speaker identification through reverberation modeling and cancelable techniques using ANNs.

Journal: PloS one
PMID:

Abstract

This paper introduces a method for improving speaker identification in challenging acoustic environments characterized by noise and reverberation. The method uses several feature extraction techniques: Mel-Frequency Cepstral Coefficients (MFCCs) and discrete transforms, namely the Discrete Cosine Transform (DCT), Discrete Sine Transform (DST), and Discrete Wavelet Transform (DWT). An Artificial Neural Network (ANN) serves as the classifier. Reverberation is modeled using comb filters of varying length, and its impact on pitch frequency estimation is explored via the Autocorrelation Function (ACF). The paper also contributes to cancelable speaker identification in both open and reverberant environments. The proposed method applies comb filtering at the feature level to deliberately distort the MFCCs. This distortion, incorporated within a cancelable framework, obscures speaker identities and makes the system resilient to potential intruders. Three systems are presented: a reverberation-affected speaker identification system, a system based on cancelable features obtained through comb filtering, and a novel cancelable speaker identification system for reverberant environments. The findings show that, both with and without reverberation, DWT-based features perform best in the speaker identification system. In the cancelable speaker identification system, by contrast, DCT-based features perform best.
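
The two signal-processing ideas named in the abstract, comb-filter reverberation and ACF-based pitch estimation, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the comb filter structure, so a single-echo feedforward form is assumed here. The function names (`comb_filter`, `acf_pitch`), the gain, the delays, the search range, and the synthetic test tone are all hypothetical choices made for illustration. The last lines apply the same filter along a placeholder feature vector to suggest how a feature-level distortion keyed by a user-specific delay might look.

```python
import numpy as np

def comb_filter(x, delay, gain=0.5):
    """Feedforward comb filter (assumed form): y[n] = x[n] + gain * x[n - delay]."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    if 0 < delay < len(x):
        y[delay:] += gain * x[:-delay]  # add one delayed, attenuated echo
    return y

def acf_pitch(x, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch (Hz) from the autocorrelation peak within a lag window."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Keep non-negative lags only: acf[k] is the correlation at lag k.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo = int(fs / fmax)                      # shortest period considered
    hi = min(int(fs / fmin), len(acf) - 1)   # longest period considered
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return fs / lag

# Synthetic voiced-like tone: 200 Hz fundamental plus one harmonic (not real speech).
fs = 8000
t = np.arange(int(0.05 * fs)) / fs
clean = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

# Simulated reverberation: a 10 ms echo; varying `delay` varies the filter length.
reverberant = comb_filter(clean, delay=int(0.010 * fs), gain=0.6)

print("pitch (clean):      ", acf_pitch(clean, fs))
print("pitch (reverberant):", acf_pitch(reverberant, fs))

# Feature-level distortion: placeholder values standing in for one frame of 13 MFCCs.
mfcc_frame = np.linspace(1.0, -1.0, 13)
user_delay = 3  # hypothetical user-specific key; changing it yields a new template
cancelable_frame = comb_filter(mfcc_frame, delay=user_delay, gain=0.6)
```

In this sketch the cancelable template is reissued simply by choosing a different `user_delay`. Whether the paper's scheme works this way is an assumption, since the abstract states only that comb filtering is applied at the feature level.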

Authors

  • Emad S Hassan
    Department of Sports Health Science, College of Physical Education, Assiut University, Assiut, Egypt - e.hassan@aun.edu.eg.
  • Badawi Neyazi
    Productivity and Vocational Training Department, Ministry of Industry, Cairo, Egypt.
  • H S Seddeq
    Acoustic Laboratory, Housing and Building National Research Center, Giza, Egypt.
  • Adel Zaghloul Mahmoud
    Electronics and Communications Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt.
  • Ahmed S Oshaba
    Department of Electrical Engineering, College of Engineering, Jazan University, Jizan, Saudi Arabia.
  • Atef El-Emary
    Department of Electrical Engineering, College of Engineering, Jazan University, Jizan, Saudi Arabia.
  • Fathi E Abd El-Samie
    Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt.