This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Automatic Voice Pathology Detection Using Deep Learning Techniques
Corresponding Author(s): Robert-Valentin Bencze
Student Thinkers and Advanced Research, Vol. 1 No. 1 (2022): Proceedings of the 5th International Conference XGEN
Abstract
Pathologies that affect the vocal tract are known to alter the quality of a patient's speech, sometimes in clearly distinguishable ways and sometimes in more subtle ones. This paper presents a non-invasive, multi-class automatic vocal pathology detector that outperforms the mean accuracy of medical professionals by analyzing the less distinguishable features of pathological voices. The proposed system uses a deep learning algorithm capable of detecting multiple pathologies from speech signals that consist of simple vowel utterances, recorded in either audio or electroglottographic (EGG) format. The classifier reached an accuracy of 86% on the audio signals and 75% on the EGG signals when each was analyzed separately, while analyzing both simultaneously yielded 95% accuracy.
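The abstract does not say how the audio and EGG signals are analyzed together. As a rough illustration only, one common way to combine two modalities is late fusion: each modality's classifier scores the same set of classes, and the per-class probabilities are averaged before the final decision. The sketch below assumes that scheme; the function names `softmax` and `fuse_predictions`, the `audio_weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
from math import exp


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(audio_scores, egg_scores, audio_weight=0.5):
    """Late fusion (an assumption, not the paper's stated method).

    Takes raw per-class scores from a hypothetical audio classifier and a
    hypothetical EGG classifier, converts each to probabilities, and returns
    the index of the most probable class after a weighted average, together
    with the fused probabilities.
    """
    if len(audio_scores) != len(egg_scores):
        raise ValueError("both modalities must score the same set of classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")

    audio_probs = softmax(audio_scores)
    egg_probs = softmax(egg_scores)
    fused = [
        audio_weight * a + (1.0 - audio_weight) * e
        for a, e in zip(audio_probs, egg_probs)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return predicted, fused


if __name__ == "__main__":
    # Made-up scores for three classes; they are not results from the paper.
    label, probs = fuse_predictions([5.0, 0.0, 0.0], [0.0, 0.0, 1.0])
    print(label, [round(p, 3) for p in probs])
```

With equal weights, a confident prediction from one modality can outvote an uncertain one from the other, which is one plausible reason a combined analysis could beat either signal alone. A network that merges both signals at the input or feature level would be an equally valid reading of "simultaneous analysis".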