Image-to-Text-Speech Converter
- Create Date 6 June 2025
- Last Updated 6 June 2025
Prathamesh Bothe, Dipali Bhusari
PG Student, Department of MCA, Trinity Academy of Engineering, Pune, India
Guide, Department of MCA, Trinity Academy of Engineering, Pune, India
ABSTRACT
This paper presents the development of an Image-to-Text-Speech Converter, a Python-based assistive system designed to extract and vocalize text from images. The primary goal of the project is to support visually impaired individuals and those with reading difficulties by converting printed or digital text into clear, audible speech. The system integrates Optical Character Recognition (OCR) using Tesseract with Text-to-Speech (TTS) synthesis using tools like pyttsx3 or gTTS. Preprocessing techniques such as grayscale conversion, noise reduction, and thresholding are applied through OpenCV to enhance OCR accuracy across various image conditions.
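The pipeline described above (OpenCV preprocessing, Tesseract OCR, then speech synthesis) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the median-blur kernel size, the use of Otsu thresholding, and the whitespace clean-up step are all assumptions made for the example. It assumes the `opencv-python`, `pytesseract`, and `pyttsx3` packages and the Tesseract binary are installed.

```python
import re


def preprocess_for_ocr(image_path):
    """Apply the three steps named in the abstract: grayscale, denoise, threshold."""
    import cv2  # third-party: opencv-python

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Assumption: a 3x3 median blur as the noise-reduction step.
    denoised = cv2.medianBlur(gray, 3)
    # Assumption: Otsu's method chooses the binarization threshold automatically.
    _, binary = cv2.threshold(
        denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )
    return binary


def normalize_text(raw):
    """Collapse whitespace runs so OCR line breaks do not interrupt the speech."""
    return re.sub(r"\s+", " ", raw).strip()


def extract_text(image_path, lang="eng"):
    """Run Tesseract on the preprocessed image and return cleaned text."""
    import pytesseract  # third-party wrapper around the Tesseract binary

    binary = preprocess_for_ocr(image_path)
    return normalize_text(pytesseract.image_to_string(binary, lang=lang))


def speak(text):
    """Read the text aloud with the offline pyttsx3 engine."""
    import pyttsx3  # third-party offline TTS

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def image_to_speech(image_path):
    """Hypothetical end-to-end helper: image file in, spoken text out."""
    text = extract_text(image_path)
    if text:
        speak(text)
    return text
```

Calling `image_to_speech("page.png")` (a hypothetical file name) would return the recognized text and read it aloud. The third-party imports sit inside the functions so that the text clean-up helper can be used and tested without OpenCV or Tesseract present.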
The converter provides a simple, user-friendly interface, supports multiple image formats, and works both online and offline. It bridges the gap between visual content and auditory accessibility, making it suitable for reading printed documents, signboards, and labels. This paper describes the system’s architecture, implementation details, and practical applications, and discusses the challenges encountered and potential areas for future enhancement, including real-time camera integration and multilingual support.
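One way the online and offline modes could coexist is to pick the speech backend at run time: gTTS needs an internet connection, while pyttsx3 runs locally. The sketch below illustrates that idea under stated assumptions. The fallback policy, the function names, the connectivity probe (host, port, timeout), and the output file name are all hypothetical and are not taken from the paper.

```python
import socket


def is_online(host="8.8.8.8", port=53, timeout=2.0):
    """Best-effort connectivity probe; host and port are illustrative choices."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_backend(online):
    """Assumed policy: prefer gTTS when online, fall back to pyttsx3 offline."""
    return "gtts" if online else "pyttsx3"


def synthesize(text, out_path="speech.mp3", online=None):
    """Speak or save the text with whichever backend is available."""
    if online is None:
        online = is_online()
    backend = choose_backend(online)
    if backend == "gtts":
        from gtts import gTTS  # third-party; requires internet access

        # gTTS writes an MP3 file rather than playing audio directly.
        gTTS(text=text, lang="en").save(out_path)
    else:
        import pyttsx3  # third-party offline TTS

        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
    return backend
```

Passing `online=False` forces the offline path, which is useful on machines where the probe is blocked by a firewall even though other traffic works.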