Cesarini, V., Robotti, C., Costantini, G. (2024). Multi-class machine learning detection of Edema, Vocal Paralysis and Vocal Nodules through voice. In 2024 IEEE International Workshop on Metrology for Industry 4.0 and IoT: proceedings (pp. 430-434). New York: IEEE [10.1109/MetroInd4.0IoT61288.2024.10584208].
Multi-class machine learning detection of Edema, Vocal Paralysis and Vocal Nodules through voice
Cesarini V.; Costantini G.
2024-01-01
Abstract
This paper aims to differentiate among three causes of dysphonia, namely Reinke's Edema, Vocal Cord Paralysis, and Vocal Nodules, with healthy subjects included as a fourth class. Acoustic features were extracted and selected from a proprietary dataset of 245 subjects, and four classifiers were trained for multi-class classification. Loudness/Energy-related features were among the most effective, which is consistent with the fact that the three diseases impair voice volume in different ways. The cepstral domain is also confirmed as effective. The four classifiers performed comparably, with Random Forest achieving the highest accuracy at 78.4% and Naïve Bayes offering the best compromise in terms of recall. Healthy subjects always yielded a higher recall than the pathological classes, which is consistent with the fact that identifying dysphonia is an easier task than differentiating among its causes.
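The abstract reports both overall accuracy and per-class recall, and the two can diverge when one class is easier to recognize than the others. The following is a minimal Python sketch of how these metrics are computed in a four-class setting. The class labels, the helper functions (`accuracy`, `per_class_recall`, `macro_recall`), and the toy predictions are all illustrative assumptions; they are not the paper's data, code, or results.

```python
from collections import Counter

# Hypothetical class labels, chosen only for this illustration.
CLASSES = ["healthy", "edema", "paralysis", "nodules"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred, classes):
    """Recall per class: correctly predicted samples / true samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Classes with no true samples are skipped to avoid division by zero.
    return {c: hits[c] / support[c] for c in classes if support[c] > 0}


def macro_recall(y_true, y_pred, classes):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred, classes)
    return sum(recalls.values()) / len(recalls)


# Toy labels (invented): the healthy class is recognized perfectly,
# while each pathological class is confused half of the time.
y_true = ["healthy"] * 4 + ["edema"] * 2 + ["paralysis"] * 2 + ["nodules"] * 2
y_pred = [
    "healthy", "healthy", "healthy", "healthy",
    "edema", "nodules",
    "paralysis", "healthy",
    "nodules", "edema",
]

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_pred))
    print("per-class recall:", per_class_recall(y_true, y_pred, CLASSES))
    print("macro recall:", macro_recall(y_true, y_pred, CLASSES))
```

In this toy case the overall accuracy is higher than the macro-averaged recall because the well-recognized healthy class has the most samples, which is one reason a classifier with the best accuracy need not offer the best compromise in terms of recall.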