Stuttering, also known as stammering, is a speech disorder characterized by involuntary disruptions or disfluencies in a person's flow of speech. These disfluencies may include repetitions of sounds, syllables, or words; prolongations of sounds; and interruptions in speech known as blocks. This paper introduces the Unified Neural Network for Integrated Gait and Speech Analysis (UNNIGSA), a methodology that combines stutter detection (SD) and gait recognition in a unified neural network architecture. UNNIGSA is engineered to address two distinct yet interrelated challenges: the accurate detection of stuttering to support more effective clinical interventions, and the precise identification of individuals through gait analysis. The system integrates a global attention mechanism that highlights salient features within speech patterns, improving the accuracy of stutter classification and offering a potential step forward in speech therapy practice. Additionally, UNNIGSA incorporates data processing techniques to manage the class imbalance prevalent in stuttering speech datasets, yielding significantly better performance than existing models. The methodology also extends the functionality of automatic speech recognition (ASR) systems, fostering greater inclusivity for individuals with speech disorders and enabling more seamless interaction with virtual assistant technologies. Overall, UNNIGSA advances both speech disorder assessment and biometric identification, offering solutions to long-standing challenges and paving the way for more inclusive and secure applications.
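The paper's architecture is not reproduced here, but the two ideas the abstract highlights, global attention pooling over speech-frame features and inverse-frequency class weighting to offset dataset imbalance, can be sketched as follows. All names, shapes, and the scoring function are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention_pool(frames, w):
    """Collapse a (T, D) sequence of frame features into one (D,) vector,
    weighting each frame by a relevance score derived from w (D,)."""
    scores = frames @ w          # (T,) unnormalized relevance per frame
    alpha = softmax(scores)      # attention weights, sum to 1 across frames
    return alpha @ frames        # weighted average emphasizes salient frames

def class_weights(labels):
    """Inverse-frequency weights so rare disfluency classes count more
    in a weighted loss (same heuristic as 'balanced' class weighting)."""
    classes, counts = np.unique(labels, return_counts=True)
    w = counts.sum() / (len(classes) * counts)
    return dict(zip(classes.tolist(), w.tolist()))

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 16))   # 50 frames of 16-dim acoustic features
pooled = global_attention_pool(frames, rng.normal(size=16))
print(pooled.shape)                  # one fixed-size vector per utterance
print(class_weights(np.array([0] * 90 + [1] * 10)))  # minority class weighted ~9x more
```

In a trained model the scoring vector would be learned jointly with the classifier, and the per-class weights would scale the loss terms; here they are computed in isolation only to make the two mechanisms concrete.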
Published in | Journal of Electrical and Electronic Engineering (Volume 12, Issue 4) |
DOI | 10.11648/j.jeee.20241204.12 |
Page(s) | 71-83 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2024. Published by Science Publishing Group |
Abbreviations | UNNIGSA - Unified Neural Network for Integrated Gait and Speech Analysis; ASR - Automatic Speech Recognition; SD - Stutter Detection; PWS - People Who Stutter; SLP - Speech-Language Pathologists; ST - Speech Therapists |
APA Style
Reddy, R. R., & Gangadharaih, S. K. (2024). UNNIGSA: A Unified Neural Network Approach for Enhanced Stutter Detection and Gait Recognition Analysis. Journal of Electrical and Electronic Engineering, 12(4), 71-83. https://doi.org/10.11648/j.jeee.20241204.12