Bolingo: A Real-Time AI-Powered Multilingual Communication Platform for Indian Languages
Prof. Bhavana A. Khivsara, Kajal Rajendra Kadam∗, Roshni Pravin Ahire∗, Aryan Nitin Jain∗, Rohak Rahul Bora∗
∗Department of Computer Engineering, SNJB’s Late Sau. K. B. Jain College of Engineering, Chandwad, India
Email: {bhavana.khivsara, kajalkadam737, roshniahire1702, aryanj1084, borarohak}@gmail.com
Abstract—Effective communication is essential for mutual understanding and collaboration in today’s globally interconnected world, yet language differences remain a significant barrier to meaningful interaction. For Indian regional languages, the problem is compounded by a scarcity of robust speech technology infrastructure, including automatic speech recognition (ASR), neural machine translation (NMT), and text-to-speech (TTS) systems. To address this, we present Bolingo, a real-time, AI-powered multilingual communication platform that enables cross-lingual video conferencing. Bolingo integrates dual-provider streaming speech-to-text (Sarvam AI for 23 Indian languages, Soniox for 60+ global languages), Azure Cognitive Services for neural machine translation and text-to-speech synthesis, and LiveKit WebRTC for sub-second audio/video streaming, all orchestrated within a Next.js 14 full-stack architecture deployed on Vercel Cloud with a Supabase PostgreSQL backend. The system achieves an end-to-end speech-to-translated-speech pipeline latency of approximately 1.8–3.2 seconds, supports 71 languages with bidirectional translation, and employs intelligent provider routing, queue-based TTS synthesis, and data-channel caption broadcasting to deliver a natural conversational experience. Experimental evaluation across latency, word error rate, translation accuracy, and concurrent-user scalability demonstrates Bolingo’s viability as a production-grade multilingual communication solution for India’s linguistically diverse population.
Index Terms—Multilingual Communication, Speech-to-Text, Text-to-Speech, Neural Machine Translation, Indian Languages, Real-Time Translation, WebRTC, LiveKit, Streaming ASR.