
Costantini, G., Parada-Cabaleiro, E., Casali, D., Cesarini, V. (2022). The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning. SENSORS, 22(7).

The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning

Giovanni Costantini; Emilia Parada-Cabaleiro; Daniele Casali; Valerio Cesarini
2022-03-23

Abstract

Machine Learning (ML) algorithms within a human–computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpora aspects of SER; this work aims to explore the feasibility and characteristics of cross-linguistic, cross-gender SER. Three ML classifiers (SVM, Naïve Bayes and MLP) are applied to acoustic features obtained through a procedure based on Kononenko’s discretization and correlation-based feature selection. The system encompasses five emotions (disgust, fear, happiness, anger and sadness), using the EmoFilm database, comprising short clips from English-language films and their Italian and Spanish dubbed versions, for a total of 1115 annotated utterances. MLP proves the most effective classifier, with accuracies above 90% for single-language approaches, while the cross-language classifier still yields accuracies above 80%. The results show cross-gender tasks to be more difficult than those involving two languages, suggesting greater differences between emotions expressed by male versus female subjects than between different languages. Four feature domains, namely RASTA, F0, MFCC and spectral energy, are algorithmically assessed as the most effective, refining existing literature and approaches based on standard feature sets. To our knowledge, this is one of the first studies encompassing both cross-gender and cross-linguistic assessments of SER.
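To make the described processing chain concrete, below is a minimal Python sketch of a comparable pipeline; it is not the authors' original code. scikit-learn provides neither Kononenko's MDL-based discretization nor Weka-style correlation-based feature selection (CFS), so KBinsDiscretizer and a mutual-information filter stand in for them, and the feature matrix and labels are synthetic placeholders for the 1115 EmoFilm utterances.

```python
# Minimal sketch of a pipeline analogous to the one in the abstract.
# NOT the authors' code: KBinsDiscretizer substitutes for Kononenko's
# MDL discretization, and a mutual-information filter substitutes for
# correlation-based feature selection (CFS). Data are synthetic.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, KBinsDiscretizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder acoustic features (e.g., MFCC, F0, RASTA, spectral energy)
# per utterance: 1115 utterances, 5 emotion classes, as in EmoFilm.
X = rng.normal(size=(1115, 384))    # synthetic feature matrix
y = rng.integers(0, 5, size=1115)   # disgust/fear/happiness/anger/sadness

pipeline = Pipeline([
    ("scale", StandardScaler()),
    # Stand-in for Kononenko's MDL-based discretization:
    ("discretize", KBinsDiscretizer(n_bins=10, encode="ordinal",
                                    strategy="quantile")),
    # Stand-in for correlation-based feature selection (CFS):
    ("select", SelectKBest(mutual_info_classif, k=60)),
    # MLP, the classifier the paper found most effective:
    ("clf", MLPClassifier(hidden_layer_sizes=(100,), max_iter=500,
                          random_state=0)),
])

scores = cross_val_score(pipeline, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Note that the paper's evaluation crosses corpora and speaker groups (e.g., training on English utterances and testing on the Italian or Spanish dubbed versions, or training on female and testing on male speakers); the plain cross-validation above is only a placeholder for that protocol.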
Published: 23 March 2022
Relevance: International
Type: Journal article
Refereeing: Anonymous expert reviewers
Academic field: ING-IND/31 - Electrical Engineering
Language: English
With ISI Impact Factor
Keywords: speech; emotion recognition; artificial intelligence; English; cross-linguistic; cross-gender; SVM; machine learning; SER
Authors: Costantini, G.; Parada-Cabaleiro, E.; Casali, D.; Cesarini, V.
Files in this item:
2022-The Emotion Probe. On the Universality of Cross-Linguistic and Cross-Gender.pdf
Access: open access
Type: Editorial Version (PDF)
License: Creative Commons
Size: 814.82 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/2108/294642
Citations
  • PMC: not available
  • Scopus: 1
  • Web of Science (ISI): 19