Dublin Core
Title
Retrieval-Augmented Generation System for Efficient Access to University Information
Abstract
Accessing university-related information, such as course syllabi, class schedules, and announcements, can be inefficient and fragmented, especially when data is distributed across multiple platforms. This often leads to time-consuming searches and a suboptimal user experience for both students and staff. This project presents a solution in the form of an intelligent, centralized platform capable of understanding natural language queries and delivering accurate, contextually relevant responses.
The project implements a Retrieval-Augmented Generation (RAG) system designed to streamline information access at International Burch University. The system features a multi-layered architecture combining semantic search and generative AI. Vespa.ai serves as the vector database, enabling high-performance similarity search, while OpenAI’s GPT models handle natural language understanding and generation. MongoDB is used for session tracking and user state management. Documents in PDF, TXT, and web-based formats are ingested through a pipeline that performs scraping, text extraction, and chunking using LangChain’s RecursiveCharacterTextSplitter. The system is accessible via a modern web interface built with React.js, Vite, and Tailwind CSS, and also exposes a FastAPI-based REST API for backend interaction. Secure access is enforced using JWT-based authentication and authorization. The chatbot component retrieves relevant context through semantic search and maintains coherent multi-turn conversations using conversation history tracking.
The developed system significantly improves the accessibility and efficiency of retrieving university information. It provides fast, context-aware responses to user queries by combining robust retrieval techniques with generative language models.
The web-based interface ensures ease of use for both technical and non-technical users, while the modular backend supports scalability and maintainability. Intelligent query rephrasing and optimized chunk retrieval contribute to improved precision and user satisfaction. Overall, this solution demonstrates how RAG-based systems can transform information access in academic environments by offering a centralized, intelligent platform tailored to users' needs.
Keywords
Retrieval-Augmented Generation, Semantic Search, Vespa.ai, OpenAI GPT, FastAPI, React.js, MongoDB, University Information System, Natural Language Processing