Sign Language to Speech Conversion Using Machine Learning
Authors:
M. Bharath1, Ch. Deviprasad2, S. Abhishek Goud3, G. Mayank4, K. Likhith5, Mr. E. Kiran Kumar6
1-5 BTech (CSE) Students, Sphoorthy Engineering College, Hyderabad.
6 Assistant Professor, Department of CSE, Sphoorthy Engineering College, Hyderabad.
ABSTRACT
Sign language constitutes the principal mode of communication for millions of hearing-impaired individuals worldwide; however, its limited comprehension among the general population perpetuates a significant communication divide. This study introduces a computationally efficient, real-time, vision-based sign language recognition framework that translates hand gestures into both textual and auditory outputs.
The proposed methodology leverages MediaPipe-based 3D hand landmark extraction, which is subsequently transformed into a background-invariant skeletal representation, thereby eliminating environmental dependencies such as illumination variability and visual clutter. A two-tier hierarchical classification paradigm is employed, wherein a Convolutional Neural Network (CNN) performs coarse-grained classification across structural gesture groups, followed by deterministic geometric inference for fine-grained character discrimination.
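The landmark-normalization step described above can be illustrated with a minimal sketch. The function below assumes a 21 x 3 array of 3D hand landmarks of the kind produced by a detector such as MediaPipe Hands; the specific normalization shown (wrist-centering plus scale normalization) is one plausible way to obtain a background-invariant skeletal representation, not necessarily the exact transform used in this work.

```python
import numpy as np

def normalize_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """Convert raw 3D hand landmarks (21 x 3) into a position- and
    scale-invariant skeletal representation.

    The result no longer depends on where the hand appears in the
    frame or how large it is, which removes environmental factors
    such as camera distance and framing.
    """
    # Translate so the wrist (landmark 0 in the MediaPipe convention)
    # sits at the origin.
    pts = landmarks - landmarks[0]
    # Scale so the farthest landmark from the wrist has unit distance.
    scale = np.linalg.norm(pts, axis=1).max()
    if scale > 0:
        pts = pts / scale
    return pts
```

Because the classifier then operates on these normalized coordinates rather than raw pixels, illumination and background clutter have no direct influence on its input.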
Empirical evaluations demonstrate 99.1% accuracy in controlled settings and 96.9% in unconstrained environments, significantly outperforming traditional RGB-based models. The system executes in real time at approximately 50 FPS on standard CPU hardware and integrates multilingual translation with text-to-speech synthesis.
This framework presents a scalable, cost-effective, and highly accessible solution for assistive communication technologies, with substantial implications for inclusive human-computer interaction systems.
Keywords
Sign Language Recognition, Deep Learning, MediaPipe, CNN, Human-Computer Interaction, Assistive Technology, Real-Time Systems