Explore projects
Developed a PDF-based RAG (Retrieval-Augmented Generation) chatbot that answers college-related queries using semantic search over documents. Used college.pdf as the primary knowledge source, covering attendance policies, semester examination details, placement eligibility, academic guidelines, and extracurricular club information. Selected PDF as the input format because educational institutions commonly publish notices, rules, and handbooks as PDFs, which makes the chatbot practical for real-world use.

Loaded the PDF with PyPDFLoader, which extracts the text and returns document objects for further processing. Split the extracted text with RecursiveCharacterTextSplitter, configured with chunk_size=500 and chunk_overlap=50. The overlap keeps context continuous between neighboring chunks, so information that straddles a chunk boundary is not lost.

Generated vector embeddings with the Hugging Face model sentence-transformers/all-MiniLM-L6-v2, chosen because it is lightweight, fast, beginner-friendly, and well suited to semantic search. Because chunks and queries are compared as vectors, the chatbot matches on meaning rather than exact keywords, so differently worded questions can still retrieve the relevant passages from the document.
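The effect of chunk_size=500 and chunk_overlap=50 can be illustrated with a minimal pure-Python sketch. This is a simplified fixed-width character splitter, not the separator-aware logic of RecursiveCharacterTextSplitter; the function name split_with_overlap and the sample text are hypothetical and only serve the illustration.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-width character windows that share an overlap.

    Simplified stand-in for illustration only: it cuts at fixed offsets and
    does not look for paragraph or sentence separators.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Made-up 1200-character sample standing in for extracted PDF text.
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)

for number, chunk in enumerate(chunks, start=1):
    print(number, len(chunk))
```

With these settings each chunk repeats the last 50 characters of the previous one, which is what lets a sentence that crosses a boundary appear whole in at least one chunk.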
Smart Hospital Maintenance AI Assistant is a Streamlit-based Retrieval-Augmented Generation (RAG) application designed to assist hospital maintenance teams in analyzing documents and retrieving relevant information efficiently.
The application supports PDF document analysis, OCR-based text extraction from images, semantic search using a FAISS vector database, AI-powered question answering, translation into three languages, and voice output generation. It uses sentence-transformers embeddings for document retrieval and integrates Large Language Models to provide context-aware responses based on uploaded hospital maintenance documents.
Key Features:
• PDF document upload and analysis
• OCR support for image-based extraction
• RAG-powered question answering
• Semantic search using FAISS
• 3-language translation
• Voice response generation
• Streamlit-based interactive user interface
Tech Stack:
• Python
• Streamlit
• FAISS
• Sentence Transformers (all-MiniLM-L6-v2)
• PDFPlumber
• PyTesseract OCR
• Groq API
• Deep Translator
• gTTS
This project was developed as part of the IP1 ICFAI Internship Streamlit RAG Application assignment.
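The semantic search step can be sketched as a brute-force cosine-similarity ranking. This is a minimal stand-in written in plain Python, not the project's FAISS code; the three-dimensional vectors, the sample chunk texts, and the helper names cosine_similarity and top_k are all made up for illustration. In the real application the vectors would come from the all-MiniLM-L6-v2 embedding model and the search would be delegated to the FAISS index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), index)
              for index, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Hypothetical chunks and hand-made toy embeddings (assumptions, not real model output).
chunk_texts = [
    "Replace the HVAC filter every three months.",
    "Visitor parking is on level two.",
    "Inspect backup generators weekly.",
]
chunk_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.7, 0.6, 0.1],
]
query_vec = [1.0, 0.2, 0.0]  # toy embedding of a maintenance-related question

best = top_k(query_vec, chunk_vecs, k=2)
for index in best:
    print(chunk_texts[index])
```

The retrieved chunks are what would then be passed to the language model as context for the answer.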
Overview
This app ingests your uploaded PDF or text notes, builds a FAISS index from their embeddings, retrieves the passages relevant to each question, and uses a Hugging Face chat model to generate answers grounded in the uploaded sources.
Response Modes
Research Mode: Structured, evidence-grounded answers with source-aware wording
Comparison Mode: Careful comparison of ancient ideas vs modern theories
Simple Mode: Plain-language explanations for faster understanding
Architecture
User Query → Retrieve from FAISS → HuggingFace Inference API → Answer Output
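The step between retrieval and the model call can be sketched as prompt assembly: the retrieved passages and a mode-specific instruction are combined into one grounded prompt. This is only a sketch under assumptions; the instruction wording, the MODE_INSTRUCTIONS table, and the build_prompt helper are hypothetical, and the actual call to the inference API is left out.

```python
# Hypothetical instruction text for each response mode (wording is an assumption).
MODE_INSTRUCTIONS = {
    "research": "Give a structured answer and cite the numbered sources you rely on.",
    "comparison": "Compare the ancient ideas and the modern theories found in the sources.",
    "simple": "Explain the answer in plain language for a newcomer.",
}


def build_prompt(question, passages, mode="research"):
    """Combine the mode instruction, retrieved passages, and question into one prompt."""
    if mode not in MODE_INSTRUCTIONS:
        raise ValueError(f"unknown mode: {mode!r}")
    # Number the passages so the model can refer to them as [1], [2], ...
    sources = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        f"{MODE_INSTRUCTIONS[mode]}\n"
        "Answer only from the sources below. If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\n"
    )


# Placeholder passages standing in for the chunks retrieved from the index.
prompt = build_prompt("What is X?", ["passage A", "passage B"], mode="simple")
print(prompt)
```

Keeping the mode instruction separate from the sources means the same retrieved passages can serve all three modes, with only the first line of the prompt changing.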