Recognizing Emotional States Using Speech Information.

Journal: Advances in Experimental Medicine and Biology
Published Date:

Abstract

Emotion recognition plays an important role in several applications, such as human-computer interaction and understanding the affective state of users in certain tasks, e.g., within a learning process, monitoring of the elderly, interactive entertainment, etc. It may be based on several modalities, e.g., the analysis of facial expressions and/or speech, electroencephalograms, electrocardiograms, etc. In certain applications, the only available modality is the user's (speaker's) voice. In this paper, we aim to analyze speakers' emotions based solely on paralinguistic information, i.e., without depending on the linguistic aspect of speech. We compare two machine learning approaches, namely a Convolutional Neural Network and a Support Vector Machine. The former is trained on raw speech information, while the latter is trained on a set of extracted low-level features. Aiming to provide a multilingual approach, the training and testing datasets contain speech from different languages.
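The abstract does not specify the exact feature set or classifier configuration, so the following is only a minimal sketch of the second approach it describes (an SVM over extracted low-level features), assuming MFCC statistics as a stand-in for the paper's features and using librosa and scikit-learn; the file names, labels, and parameters are hypothetical.

```python
# Sketch: summarize each utterance by mean/std of its MFCCs, then classify
# with an RBF-kernel SVM. Not the authors' exact pipeline; an illustration
# of the "low-level features + SVM" branch contrasted with the CNN on raw speech.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def utterance_features(path, sr=16000, n_mfcc=13):
    """Load one speech clip and return a fixed-size feature vector."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical training data: paths to speech clips and their emotion labels.
train_files = ["clip_001.wav", "clip_002.wav"]
train_labels = ["angry", "neutral"]

X = np.vstack([utterance_features(f) for f in train_files])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, train_labels)

# Predict the emotion of a new (hypothetical) utterance.
print(clf.predict(utterance_features("clip_new.wav").reshape(1, -1)))
```

The CNN approach mentioned in the abstract would instead consume the raw (or minimally processed) speech signal directly, learning its own representations rather than relying on hand-crafted features such as the ones computed above.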

Authors

  • Michalis Papakostas
    Computer Science and Engineering Department, University of Texas at Arlington, Arlington, TX, USA.
  • Giorgos Siantikos
    Institute of Informatics and Telecommunications, National Center for Scientific Research-"Demokritos", Athens, Greece.
  • Theodoros Giannakopoulos
    Institute of Informatics and Telecommunications, National Center for Scientific Research-"Demokritos", Athens, Greece.
  • Evaggelos Spyrou
    Institute of Informatics and Telecommunications, National Center for Scientific Research-"Demokritos", Athens, Greece. espyrou@iit.demokritos.gr.
  • Dimitris Sgouropoulos
    Institute of Informatics and Telecommunications, National Center for Scientific Research-"Demokritos", Athens, Greece.