A Review of Retrieval-Augmented Generation for University-Specific Chatbot Systems
1st Syed Irfan Ali
Artificial Intelligence and Data Science
Anjuman College of Engineering and Technology
Nagpur, India 0000-0002-0280-0138
2nd Hasan Laheri
Artificial Intelligence and Data Science
Anjuman College of Engineering and Technology
Nagpur, India 0009-0002-9063-4762
3rd Sanchit Bhajikhaye
Artificial Intelligence and Data Science
Anjuman College of Engineering and Technology
Nagpur, India 0009-0007-3162-5694
4th M. Huzaifa Ansari
Artificial Intelligence and Data Science
Anjuman College of Engineering and Technology
Nagpur, India 0009-0006-7170-3880
5th M. Huzaif Ansari
Artificial Intelligence and Data Science
Anjuman College of Engineering and Technology
Nagpur, India 0009-0005-6325-1856
6th M. Bilal Khan
Artificial Intelligence and Data Science
Anjuman College of Engineering and Technology
Nagpur, India 0009-0001-2249-4975
Abstract— Rapid advances in Artificial Intelligence (AI) and Natural Language Processing (NLP) have made Large Language Models (LLMs) pivotal in educational question-answering systems, particularly university admission chatbots [1]. However, LLMs face critical challenges: they generate hallucinations, rely on outdated knowledge, and reason in ways that are not transparent [10]. To address these problems, Retrieval-Augmented Generation (RAG) has emerged as a promising solution that incorporates knowledge from external databases to improve the accuracy and credibility of generated responses [10]. This paper reviews the architecture and application of RAG-powered chatbots (RAGBots) designed for specific university domains [1]. A key finding is that although RAG systems such as URAG, SAMCares, and Infersity v1 demonstrate utility in providing intelligent access to university resources [1, 3, 4], datasets for such closed domains remain difficult to obtain and curate [2]. Furthermore, complex RAG implementations often involve high operational costs and specialized modules [1]. The review highlights enhancements such as Multi-Query and Ensemble Retrieval [6], discusses critical challenges such as Document-Level Retrieval Mismatch (DRM) [8], and concludes with a vision for reliable, domain-specific RAGBots in higher education.
Keywords— Retrieval-Augmented Generation (RAG), Chatbots, University Systems, Large Language Models (LLMs) [3, 6], Educational Technology, AI in Higher Education.