Automatic Voice pathology detection using Deep Learning Techniques

Robert-Valentin Bencze

Issue

Vol. 1 No. 1 (2022): Proceedings of the 5th International Conference XGEN

Issue Published : October 11, 2022

Date Log

Submitted : November 19, 2022

Published : October 11, 2023

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Universitatea Politehnica București, Facultatea de Electronică, Telecomunicații și Tehnologia Informației

Corresponding Author(s) : Robert-Valentin Bencze

benczejrobert@gmail.com

Student Thinkers and Advanced Research, Vol. 1 No. 1 (2022): Proceedings of the 5th International Conference XGEN
Article Published : October 11, 2023

Abstract

Pathologies that affect the vocal tract are known to alter the quality of patients' speech, in ways that range from clearly distinguishable to subtle. This paper presents a non-invasive, multi-class automatic vocal pathology detector that outperforms the mean accuracy of medical professionals by analyzing the less distinguishable features of pathological voices. The proposed system uses a deep learning algorithm capable of detecting multiple pathologies from speech signals that consist of simple vowel utterances recorded in either audio or electroglottographic (EGG) format. The classifier obtained accuracies of 86% on the audio recordings and 75% on the EGG recordings when each was analyzed separately, while their simultaneous analysis yielded 95% accuracy.
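The abstract does not say how the audio and EGG channels are analyzed simultaneously. As an illustration only, the sketch below shows one common option, late fusion, in which each channel's classifier produces per-class scores and the resulting probabilities are averaged before a class is chosen. The function names, the equal weighting, the three classes, and all score values are assumptions made up for this example; none of them come from the paper.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_late(audio_scores, egg_scores, audio_weight=0.5):
    """Weighted average of the per-class probabilities of the two channels.

    Assumption: late fusion with a fixed weight; the paper may combine the
    channels differently (for example inside a single network).
    """
    p_audio = softmax(audio_scores)
    p_egg = softmax(egg_scores)
    return [audio_weight * a + (1.0 - audio_weight) * e
            for a, e in zip(p_audio, p_egg)]


def predict(probabilities):
    """Index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def accuracy(predicted, expected):
    """Fraction of recordings whose predicted class matches the label."""
    hits = sum(p == y for p, y in zip(predicted, expected))
    return hits / len(expected)


# Made-up scores for three recordings and three hypothetical classes.
audio_scores = [[2.0, 0.5, 0.1], [0.2, 0.3, 0.25], [0.1, 0.2, 1.5]]
egg_scores = [[1.0, 0.8, 0.2], [0.1, 2.0, 0.3], [0.3, 0.1, 0.9]]
labels = [0, 1, 2]

fused_predictions = [predict(fuse_late(a, e))
                     for a, e in zip(audio_scores, egg_scores)]
print(accuracy(fused_predictions, labels))
```

In the second made-up recording the audio scores are nearly flat while the EGG scores clearly favor one class, so the fused decision leans on the more confident channel. That is one way a combined analysis could outperform either channel alone.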

Keywords

Machine Learning, Deep Learning, Automatic Voice Pathology Detection, Multi-class Voice Pathology Detection

Bencze, R.-V. (2023). Automatic Voice pathology detection using Deep Learning Techniques. Student Thinkers and Advanced Research, 1(1), 10. Retrieved from https://opacj.org/star/article/view/85

