TechNova RAG Chatbot
AI customer support that actually cites its sources.
Summary
An AI-powered customer support assistant for a fictional tech company. It uses RAG to retrieve information from FAQ and policy documents stored in a vector database, shows source references in its replies, and politely declines questions outside its domain. Keeps a session memory of the last 10 messages. Built as an AI course final project, awarded the top grade.
Stack
- Supabase pgvector (vector store for document embeddings)
- Ollama (local LLM for answer generation)
Highlights
- Modular architecture (chains / prompts / retrievers / memory / utils)
- Smart self-adjusting retriever (top-grade extra)
- Shows source references in answers - no black box
Background
Final project for an AI course, completed at the top grade. The brief: build a customer support chatbot for a fictional tech company (TechNova) that can answer questions based on the company's own documentation, not just its pre-trained knowledge.
That's where RAG (Retrieval-Augmented Generation) comes in.
How it works
- FAQ and policy documents are pre-processed and vectorized into Supabase pgvector
- When a user asks a question, the question is embedded and we retrieve the most relevant document chunks
- The retrieved context is sent along with the question to Ollama (local LLM)
- The reply is rendered with source references so users can see what the bot based its answer on
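The steps above can be sketched roughly like this. This is a toy in-memory stand-in for pgvector with hand-rolled cosine similarity and a hypothetical prompt template; in the real project the embedding model, the Supabase query, and the Ollama call replace these stubs:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str            # e.g. "faq.md" or "returns-policy.md"
    embedding: list[float]

def cosine(a: list[float], b: list[float]) -> float:
    # Similarity measure pgvector computes server-side with an index.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb: list[float], store: list[Chunk], k: int = 3) -> list[Chunk]:
    # Rank chunks by similarity to the embedded question; linear scan
    # here purely for illustration.
    return sorted(store, key=lambda c: cosine(query_emb, c.embedding), reverse=True)[:k]

def build_prompt(question: str, chunks: list[Chunk]) -> str:
    # Retrieved context goes to the LLM together with the question;
    # tagging each chunk with its source enables the citations in the reply.
    context = "\n".join(f"[{c.source}] {c.text}" for c in chunks)
    return f"Answer using ONLY this context, citing sources:\n{context}\n\nQuestion: {question}"
```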
Technical choices
Modular architecture. The code is split into chains/, prompts/, retrievers/, memory/, utils/. That makes it trivial to swap components - like retriever strategy or prompt template - without touching the rest.
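As an illustration of that swap-friendly layout (the interface and class names here are hypothetical, not the project's actual code), each component can be typed against a small interface so the chain never depends on a concrete strategy:

```python
from typing import Protocol

class Retriever(Protocol):
    def get_relevant(self, question: str, k: int) -> list[str]: ...

class KeywordRetriever:
    # A drop-in alternative to a vector retriever: naive keyword overlap.
    def __init__(self, docs: list[str]):
        self.docs = docs

    def get_relevant(self, question: str, k: int) -> list[str]:
        words = set(question.lower().split())
        scored = sorted(
            self.docs,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return scored[:k]

def answer(question: str, retriever: Retriever) -> str:
    # The chain only sees the Retriever interface, so swapping
    # strategies never touches this function.
    context = " | ".join(retriever.get_relevant(question, k=2))
    return f"context: {context}"
```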
Session memory of 10 messages. Not the whole history, not none. Ten messages is enough for natural conversation without blowing up prompt tokens.
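A minimal sketch of that bounded memory (the project's real version lives in memory/; this just shows the idea with a deque, whose maxlen drops the oldest entry automatically):

```python
from collections import deque

class SessionMemory:
    """Keeps only the most recent messages; older ones fall off automatically."""

    def __init__(self, max_messages: int = 10):
        self.messages: deque[tuple[str, str]] = deque(maxlen=max_messages)

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def as_prompt(self) -> str:
        # Render the current window for inclusion in the LLM prompt.
        return "\n".join(f"{role}: {text}" for role, text in self.messages)
```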
Polite refusal. The bot is instructed not to guess outside its domain. If a question doesn't have strong enough retrieval matches, it explicitly says "this is outside what I can help with" instead of hallucinating.
Top-grade extra: Smart Self-Adjusting Retriever
Standard RAG retrieves a fixed k=N documents for every query. My retriever adjusts k dynamically based on question complexity and match confidence: more documents for complex questions, fewer when the top match is high-confidence and clear. Result: less noise in the context and faster answers.
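One way to realise that heuristic. The word-count proxy and score cutoffs below are illustrative assumptions; the write-up doesn't specify the project's actual tuning:

```python
def dynamic_k(question: str, top_score: float, k_min: int = 2, k_max: int = 8) -> int:
    """Pick how many chunks to retrieve: longer questions get more context,
    while a very confident top match trims the context back down."""
    # Rough complexity proxy: one extra chunk per ~5 words.
    k = k_min + len(question.split()) // 5
    if top_score > 0.9:  # clear, high-confidence match: keep the context lean
        k = k_min
    return max(k_min, min(k, k_max))
```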
What I took away
RAG looks simple on the diagram: embed, retrieve, generate. In practice, the subtle choices (chunk size, overlap, retrieval strategy, prompt format) are what determine whether the bot is useful or just mediocre.
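For instance, chunk size and overlap interact like this. A simple character-based splitter for illustration; the project's actual chunker isn't described above:

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous one so sentence boundaries keep some context."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

A larger overlap reduces the chance a relevant sentence is split across two chunks, at the cost of more redundant tokens in the context window.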