International Scientific Journal of Engineering and Management

An International Scholarly || Multidisciplinary || Open Access || Indexing in all major Database & Metadata
The journal follows the UGC Guidelines and is evaluated for inclusion in the Web of Science
ISSN: 2583-6129

Impact Factor: 8.072

IMAGE TO SPEECH CONVERTER SOFTWARE USING DEEP LEARNING

Published 7 April 2026
Updated 7 April 2026

Authors: 

Chelimila Manasa, M Parushuram, K AYYAPPA

Department of Computer Engineering, Methodist College of Engineering and Technology, Abids, Hyderabad, Telangana, 500001, India.

Dr. Syed Azahad

Department of Computer Engineering, Methodist College of Engineering and Technology, Abids, Hyderabad, Telangana, 500001, India.

ABSTRACT

This project presents a real-time image-to-speech system designed to enhance accessibility for individuals with visual impairments by transforming visual content into meaningful spoken descriptions. The proposed framework leverages deep learning models to interpret images and generate natural language captions that accurately describe the visual scene. Convolutional Neural Networks (CNNs) are utilised to extract detailed and discriminative visual features, forming the foundation for understanding objects and their spatial relationships within an image. These extracted features are then processed by a Long Short-Term Memory (LSTM) network equipped with an attention mechanism, enabling the model to focus on relevant regions of the image while producing contextually rich and coherent textual descriptions.
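The attention step described above can be illustrated with a minimal sketch. The abstract does not state which scoring function the model uses, so the dot-product scoring here is an assumption (an additive score is equally plausible), and the names `attend`, `region_features`, and `hidden_state` are hypothetical. The sketch only shows how a decoder state can weight CNN region features into a single context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """Soft attention with dot-product scoring (an assumed choice).

    region_features: one feature vector per image region (from the CNN).
    hidden_state: the current decoder (LSTM) state, same dimensionality.
    Returns the attention weights and the weighted-sum context vector.
    """
    # Score each region by its similarity to the decoder state.
    scores = [
        sum(f * h for f, h in zip(region, hidden_state))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(region_features[0])
    # The context vector is the attention-weighted average of the regions.
    context = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three 2-dimensional region features and one decoder state.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
```

In a full captioning model the context vector would be fed to the LSTM at each decoding step, so the weights change as each word is generated.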

To convert the generated captions into audible speech, the system incorporates a Text-to-Speech (TTS) engine, completing a seamless pipeline from image acquisition to spoken output. The model is trained and evaluated using the MS-COCO dataset, which provides diverse and complex image-caption pairs. Performance is assessed through widely recognised captioning metrics, including BLEU and METEOR scores, ensuring a reliable evaluation of linguistic accuracy and descriptive quality.
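To make the BLEU metric concrete, the following is a simplified sentence-level sketch, not the evaluation code used in the paper. It assumes whitespace-tokenised captions, uses n-grams up to length 2 rather than the usual 4, and applies no smoothing; the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, references, n):
    """Modified n-gram precision: candidate counts are clipped by the
    maximum count of each n-gram in any single reference."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def bleu(candidate, references, max_n=2):
    """Sentence-level BLEU with uniform weights and a brevity penalty."""
    precisions = [
        clipped_precision(candidate, references, n)
        for n in range(1, max_n + 1)
    ]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length.
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "a dog is running on the grass".split()
candidate = "a dog runs on the grass".split()
score = bleu(candidate, [reference])
```

A generated caption that shares more n-grams with the human-written references scores closer to 1, while the brevity penalty discourages captions that are shorter than the references.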

By integrating advanced techniques from computer vision and natural language processing, this system demonstrates the impactful role of artificial intelligence in assistive technology. The architecture is optimised for real-time execution, making it suitable for deployment on low-cost devices and edge-based platforms. Overall, the proposed solution offers a practical, efficient, and scalable tool aimed at improving independence and enhancing the quality of life for visually impaired users.

Keywords: Image Captioning, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Attention Mechanism, Deep Learning, Computer Vision, Natural Language Processing (NLP), Text-to-Speech (TTS), Assistive Technology, MS-COCO Dataset
