Cesarini, V., Robotti, C., Costantini, G. (2024). Multi-class machine learning detection of Edema, Vocal Paralysis and Vocal Nodules through voice. In 2024 IEEE International Workshop on Metrology for Industry 4.0 and IoT: proceedings (pp. 430-434). New York: IEEE [10.1109/MetroInd4.0IoT61288.2024.10584208].

Multi-class machine learning detection of Edema, Vocal Paralysis and Vocal Nodules through voice

Cesarini V.; Costantini G.
2024-01-01

Abstract

This paper aims to differentiate three causes of dysphonia (Reinke's Edema, Vocal Cord Paralysis, and Vocal Nodules) from one another and from healthy voices. Acoustic features were extracted and selected from a proprietary dataset of 245 subjects, and four classifiers were trained for multi-class classification. Loudness- and energy-related features were among the most effective, in line with the fact that the three diseases impair voice volume in different ways. The cepstral domain is also confirmed as effective. The four classifiers performed comparably, with Random Forest reaching the highest accuracy (78.4%) and Naïve Bayes offering the best compromise in terms of recall. Healthy subjects always yielded a higher recall, in line with the fact that identifying dysphonia is an easier task than differentiating among its causes.
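
The distinction the abstract draws between overall accuracy and per-class recall can be illustrated with a minimal sketch. The class names come from the abstract, but the label lists are toy data invented here, not results from the paper. Macro-averaged recall is only one assumed way of reading "best compromise in terms of recall"; the paper may use a different criterion.

```python
from collections import Counter

# Class names taken from the abstract; the label lists below are toy data
# invented for illustration, not results from the paper.
CLASSES = ["healthy", "edema", "paralysis", "nodules"]


def per_class_recall(y_true, y_pred, classes=CLASSES):
    """Recall per class: correctly predicted samples / true samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in classes if support[c] > 0}


def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy test set in which the healthy class is the largest and the easiest.
y_true = ["healthy"] * 6 + ["edema"] * 2 + ["paralysis"] * 2 + ["nodules"] * 2
y_pred = (["healthy"] * 6
          + ["edema", "healthy"]
          + ["paralysis", "nodules"]
          + ["nodules", "healthy"])

recalls = per_class_recall(y_true, y_pred)
# Macro recall weights every class equally, regardless of its size.
macro_recall = sum(recalls.values()) / len(recalls)

print(f"accuracy     = {accuracy(y_true, y_pred):.3f}")
print(f"macro recall = {macro_recall:.3f}")
for name, value in recalls.items():
    print(f"recall[{name}] = {value:.2f}")
```

In this toy case, accuracy is pulled up by the well-recognized healthy class while the pathological classes lag behind. This is why a classifier with the highest accuracy need not be the one with the most balanced recall.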
7th IEEE International Workshop on Metrology for Industry 4.0 and IoT (MetroInd4.0 and IoT 2024)
Firenze, Italy
2024
7
IEEE Italy Section Affinity Group of Women in Engineering
International relevance
2024
Sector IIET-01/A - Electrical Engineering (Elettrotecnica)
English
Edema
Machine learning
Speech
Voice
Conference contribution
Cesarini, V; Robotti, C; Costantini, G
Use this identifier to cite or link to this document: https://hdl.handle.net/2108/404586