Manish Goud requested to merge Manishhh/ip1-icfai:fix/manish into main May 30, 2026

RAG-Based Streamlit Application

Overview

This project is a Retrieval-Augmented Generation (RAG) chatbot built using Streamlit, Hugging Face embeddings, ChromaDB, and Ollama (Phi-3).

The chatbot answers questions from passages retrieved from the provided document rather than from the language model's built-in knowledge alone. Grounding answers in retrieved text helps reduce hallucinations and improves accuracy on questions the document covers.

What document did you use and why?

I used a PDF of study notes on the chosen topic. The document was chosen because it presents educational content in a structured form, which makes it easier to retrieve the relevant passages and answer user queries accurately.

Using a domain-specific document ensures that the chatbot remains focused on the selected topic and reduces the chances of generating unrelated responses.

How does your chunking work?

The document is split using LangChain's RecursiveCharacterTextSplitter.

Configuration:

Chunk Size: 500 characters
Chunk Overlap: 50 characters

This approach:

Breaks large documents into smaller manageable sections.
Preserves context between chunks through overlap.
Improves retrieval accuracy by keeping each chunk focused enough that its embedding can match a specific query.
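
To make the chunk size and overlap concrete, here is a minimal sketch of fixed-size character windows with overlap. It is a simplification and not the project's code: RecursiveCharacterTextSplitter also tries to break on separators such as paragraph and line breaks before falling back to raw characters, so its chunks are at most 500 characters rather than exactly 500. The function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into character windows where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # With the README's settings, each window starts 450 characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the default settings, the last 50 characters of one chunk are repeated as the first 50 characters of the next, so a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.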

Which embedding model did you use?

Embedding Model:

sentence-transformers/all-MiniLM-L6-v2

Reason for selection:

Lightweight and efficient.
Produces high-quality semantic embeddings.
Widely used in RAG applications.
Works well on local systems without requiring high-end hardware.

The embedding model converts each document chunk into a vector representation, which is stored in ChromaDB and later compared against the embedded user query.
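
The retrieval idea can be illustrated without any model at all. The sketch below is an assumption-laden stand-in: toy_embed uses word counts instead of the 384-dimensional dense vectors that all-MiniLM-L6-v2 produces, and top_k does a brute-force ranking instead of a ChromaDB query. All function names are hypothetical; only the ranking principle (compare the query vector to each chunk vector and keep the closest) carries over.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a sparse vector of word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    query_vec = toy_embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, toy_embed(chunk)),
        reverse=True,
    )
    return ranked[:k]
```

A real embedding model matters because word counts only match identical words, while dense semantic embeddings can place "plants" and "chlorophyll" near each other even when the wording differs.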

Technologies Used

Streamlit
LangChain
Hugging Face Embeddings
ChromaDB
Ollama
Phi-3
PyPDF
Python

Architecture

Document PDF
↓
Document Loading
↓
Text Chunking
↓
Embedding Generation
↓
ChromaDB Vector Store
↓
Retriever
↓
Phi-3 (Ollama)
↓
Answer Generation
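
Between the Retriever and Phi-3 stages, the retrieved chunks have to be combined with the user's question into a single prompt. The project's actual prompt is not shown here, so the wording and the function name build_prompt below are assumptions; the sketch only illustrates one common way to instruct the model to stay within the retrieved context.

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a grounded prompt from the question and the retrieved chunks."""
    # Number each chunk so the model (and a reader) can tell the passages apart.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say that you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

Telling the model what to do when the context is insufficient is a design choice: without it, the model tends to fall back on its own knowledge, which is what the retrieval step is meant to avoid.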

How to Run Locally

1. Clone the repository

git clone <repository-url>
cd <repository-folder>

2. Create a virtual environment

python -m venv venv

3. Activate the virtual environment

Windows:

venv\Scripts\activate

Linux/Mac:

source venv/bin/activate

4. Install dependencies

pip install -r requirements.txt

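5. Start Ollama with the Phi-3 model (in a separate terminal, because this command opens an interactive session; it downloads the model on first use)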

ollama run phi3

6. Run the Streamlit application

streamlit run app.py

7. Open the app in a browser

http://localhost:8501
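
If the app starts but returns no answers, it can help to check that the Ollama server from step 5 responds on its own. The sketch below is an assumption about how to do that and not part of the project: it calls Ollama's local REST endpoint directly with the standard library, whereas the app itself may reach Ollama differently (for example through LangChain). The helper names build_generate_request and ask_ollama are hypothetical, and the URL assumes Ollama's default port 11434.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumes Ollama's default local port


def build_generate_request(prompt: str, model: str = "phi3") -> urllib.request.Request:
    """Build a non-streaming generation request for a locally running Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt: str, model: str = "phi3", timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text; raises URLError if the server is down."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

Calling ask_ollama("Say hello in one sentence.") from a Python shell should return a short reply when the server is up and the phi3 model has been downloaded; a connection error points to Ollama rather than to the Streamlit app.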

Screenshot

Add a screenshot of:

Running Streamlit application
User query
Generated response

Example:

screenshots/app-demo.png

What would you improve with more time?

If given more time, I would:

Support multiple PDF uploads.
Add conversation memory.
Display source citations for retrieved answers.
Improve the UI using Streamlit components.
Add document upload through the web interface.
Optimize retrieval using hybrid search techniques.
Deploy the application on Hugging Face Spaces or a cloud platform.
Add authentication and user management.

Conclusion

This project demonstrates a basic RAG pipeline using Streamlit, Hugging Face embeddings, ChromaDB, and Ollama. By retrieving relevant information from the document before generating responses, the chatbot provides more accurate and context-aware answers while reducing hallucinations.

Edited May 30, 2026 by Manish Goud

[submission] ManishGoud - RAG App