Enhancing Image Captioning Through Augmented Visual Comprehension with CNN
- Create Date 30 January 2026
- Last Updated 30 January 2026
R. L. Pavan Kumar 1, T. V. D. S. Sreyanth 2, P. Nithin Sai 3 and G. Surendra 5
1B.Tech Student1, Koneru Lakshmaiah Education Foundation, Vaddeswaram, A.P., 522302, India.
1*2100080168@kluniversity.in
1B.Tech Student2, Koneru Lakshmaiah Education Foundation, Vaddeswaram, A.P., 522302, India.
1*2100080197@kluniversity.in
1B.Tech Student3, Koneru Lakshmaiah Education Foundation, Vaddeswaram, A.P., 522302, India.
1*2100080203@kluniversity.in
2Assistant Professor2, Department of AI & DS, Koneru Lakshmaiah Education Foundation, Vaddeswaram, A.P., 522302, India.
2guntisurendra@kluniversity.in
ABSTRACT:
Deep learning and computer vision are advancing quickly, and the task of automatically generating informative captions for photographs has received considerable attention. As new results continue to reshape the artificial intelligence landscape, demand is growing for intelligent systems that can describe visual content in context. Image captioning is a research area at the intersection of computer vision and deep learning. This paper explores the application of deep learning to the task of generating descriptive captions for images. The proposed model is extended with YOLO-based object detection, which is incorporated into the feature extraction process to increase the robustness of the image representation. The architecture combines Convolutional Neural Networks (CNNs) for feature extraction from images with RNNs for language modeling. The CNN extracts meaningful visual features from images. Attention methods are used to address the problem of aligning linguistic and visual information, which enables the model to concentrate on distinct areas of the image while generating captions.
Keywords: CNN, LSTM, YOLO, BLEU
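The attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which attention variant is used, so dot-product scoring is an assumption here, and the names `attend`, `region_features`, and `decoder_state` are hypothetical. The sketch shows how one decoding step could weight per-region image features (for example, CNN or YOLO region features) by their relevance to the current language-model state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, decoder_state):
    """Return (weights, context) for one caption-decoding step.

    region_features: list of feature vectors, one per image region
                     (hypothetical stand-in for CNN/YOLO features).
    decoder_state:   current hidden state of the language model
                     (hypothetical stand-in for an RNN/LSTM state).
    Dot-product scoring is an assumption, not taken from the paper.
    """
    # Relevance score of each region to the current decoder state.
    scores = [
        sum(f * h for f, h in zip(region, decoder_state))
        for region in region_features
    ]
    # Normalize scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted average of the region features.
    dim = len(region_features[0])
    context = [
        sum(w * region[i] for w, region in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

In this toy input the first region aligns best with the decoder state, so it receives the largest weight and dominates the context vector that would be passed to the language model when it predicts the next word.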
INTRODUCTION: