Rohit Fogla requested to merge rohit-submission into main May 29, 2026

📄 RAG Chatbot — Chat with Your Documents

A Retrieval-Augmented Generation (RAG) chatbot built with Streamlit and Groq.
Upload a PDF, DOCX, or TXT file and ask questions about it — the app finds the most relevant sections and answers using an LLM.

What document did you use and why?

I used a PDF of my college notes / a technical report as the test document.
I chose it because it's text-heavy and has specific facts that are easy to verify — making it a great way to test whether the chatbot is actually retrieving the right information rather than just guessing.

How does your chunking work?

The document text is split into overlapping fixed-size chunks using a sliding window approach:

Chunk size: 1000 characters per chunk
Overlap: 100 characters shared between consecutive chunks

The overlap ensures that sentences or ideas that fall on a chunk boundary are not lost — both neighboring chunks will contain that context.

[ chunk 1: 0 → 1000 ]
          [ chunk 2: 900 → 1900 ]
                    [ chunk 3: 1800 → 2800 ]

This is handled in utils/loader.py → split_into_chunks().

Which embedding model did you use?

I used all-MiniLM-L6-v2 from the sentence-transformers library.

Property	Detail
Model	`all-MiniLM-L6-v2`
Library	`sentence-transformers`
Embedding size	384 dimensions
Why this model?	Fast, lightweight, free, and works well for semantic search tasks

Chunks are embedded once when the document is uploaded and stored in session state. At query time, the question is embedded and compared against all chunk embeddings using cosine similarity to find the most relevant chunks.

How to run locally

1. Clone the repository

git clone https://github.com/YOUR_USERNAME/rohit-rag-app.git
cd rohit-rag-app

2. Install dependencies

pip install -r requirements.txt

3. Add your API key

Create a .env file in the root folder:

GROQ_API_KEY=your_groq_api_key_here

Get your free key at console.groq.com

4. Run the app

streamlit run app.py

Then open http://localhost:8501 in your browser.

Screenshot

What would you improve with more time?

Better chunking — use sentence-aware splitting (e.g. split on . or \n) instead of fixed character count, so chunks don't cut mid-sentence
Vector database — replace in-memory numpy arrays with a proper vector store like FAISS or ChromaDB for faster search on large documents
Multi-document support — let users upload and query across multiple files at once
Chat memory — make the LLM aware of the full conversation history for follow-up questions
Source highlighting — show the user exactly which part of the document the answer came from
Better UI — add a progress bar during embedding, and display chunk previews in an expandable section

[Submission] Rohit Fogla — RAG App