Audio/Video Transcriber Using NMT in Different Languages
- Create Date 2 July 2024
- Last Updated 2 July 2024
Dr Sayyada Fahmeeda Sultana1, Arusa Konain2, Bhavana Reddy3, Neelambika Kolar4
#1Department of Computer Science, Visvesvaraya Technological University, PDA Kalaburagi, Karnataka, India. Email: arusakonain09@gmail.com
*2Department of Computer Science, Visvesvaraya Technological University, PDA Kalaburagi, Karnataka, India. Email: bhavanareddy250@gmail.com
#3Department of Computer Science, Visvesvaraya Technological University, PDA Kalaburagi, Karnataka, India. Email: neelambikaskolar5167@gmail.com
Abstract— Multimedia data is represented as electronic signals that can be recorded, processed, and reproduced. Detecting and extracting scene and caption text from unconstrained educational video is an important research problem in content-based retrieval. This project presents a reliable system for detecting, localizing, extracting, tracking, and binarizing text from unconstrained educational video. The features of speech differ from language to language, and even within a single language the pace and dialect vary from speaker to speaker. Speech recognition, an interdisciplinary field within computational linguistics, supports technologies that recognize speech and convert it into text. The project proposes to transcribe educational audio and video and to translate the result into different regional languages. Audio is transcribed into text using speech recognition, a Natural Language Processing (NLP) technique; the application itself is built with Python tooling: Flask (web framework), Pytest (testing), and Gunicorn (WSGI server). Translation is performed using Neural Machine Translation (NMT), in which a neural network is trained on vast amounts of multilingual text data. Because educational videos often contain presentation slides that carry the important points, those points are extracted using Optical Character Recognition (OCR). The project focuses on NPTEL videos and on educational and news videos. The objective is time-efficient and accurate transcription.
Keywords— NLP (Natural Language Processing), OCR (Optical Character Recognition), Flask, Speech Recognition, NMT (Neural Machine Translation).
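The three stages named in the abstract (speech-to-text, OCR on slide frames, and NMT) can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: `process_lecture`, `LectureOutput`, and the three `fake_*` backends are hypothetical names introduced here for illustration, and the stand-in backends return fixed strings so that the sketch runs without any speech, OCR, or translation model. In a real system each callable would wrap an actual recognizer, OCR engine, or NMT model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class LectureOutput:
    transcript: str              # text recognized from the audio track
    slide_points: List[str]      # de-duplicated text read from video frames
    translations: Dict[str, str] = field(default_factory=dict)  # language -> text


def process_lecture(
    audio_path: str,
    frame_paths: Iterable[str],
    target_langs: Iterable[str],
    transcribe: Callable[[str], str],
    read_slide: Callable[[str], str],
    translate: Callable[[str, str], str],
) -> LectureOutput:
    """Run transcription, slide-text extraction, and translation in sequence."""
    transcript = transcribe(audio_path).strip()

    # Consecutive frames usually show the same slide, so keep each line once,
    # in first-seen order.
    seen = set()
    slide_points: List[str] = []
    for path in frame_paths:
        for line in read_slide(path).splitlines():
            point = line.strip()
            if point and point not in seen:
                seen.add(point)
                slide_points.append(point)

    # Translate the transcript together with the slide points for each language.
    source_text = "\n".join([transcript, *slide_points])
    translations = {lang: translate(source_text, lang) for lang in target_langs}
    return LectureOutput(transcript, slide_points, translations)


# Hypothetical stand-in backends (assumptions, not real models).
def fake_transcribe(path: str) -> str:
    return "welcome to the lecture on sorting"


def fake_read_slide(path: str) -> str:
    if path == "f3.png":
        return "Merge sort\nQuick sort"
    return "Bubble sort\nMerge sort"


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


result = process_lecture(
    "lecture.wav", ["f1.png", "f2.png", "f3.png"], ["hi", "kn"],
    fake_transcribe, fake_read_slide, fake_translate,
)
```

Passing the three stages in as callables keeps the pipeline independent of any particular speech, OCR, or translation library, so each backend can be swapped or tested on its own.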