
Local RAG Setup

Minimal RAG implementation with LangChain and FAISS, supporting either Ollama or OpenAI (API key required) as the LLM backend.

Dependencies

  • langchain - Core framework
  • langchain-community - Loaders, vectorstores
  • langchain-ollama - Ollama integration
  • langchain-openai - OpenAI integration
  • langchain-text-splitters - Text splitting
  • langchain-huggingface - HuggingFace embeddings
  • faiss-cpu - Vector search
  • sentence-transformers - Embeddings
  • pypdf - PDF loading
  • fastapi - Web server
  • uvicorn - ASGI server

Installation

conda create -n local_rag python=3.10 -y
conda activate local_rag
pip install -r requirements.txt

Setup

Ollama (optional)

ollama serve
ollama pull mistral

OpenAI (optional)

Set the API key when using OpenAI:

export OPENAI_API_KEY="your-key"

Add Documents

Option 1: Add PDFs from a folder via script. Edit DATA_ROOT in add_pdfs.py to point at your folder, then run:

python add_pdfs.py

The script clears the existing vector store and recursively indexes all supported documents (.pdf, .txt, .md) under the folder.
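The recursive collection step can be sketched as below. This is an illustrative stand-in for add_pdfs.py, not its actual contents, and collect_documents is a hypothetical helper name:

```python
from pathlib import Path

# Extensions the indexing script accepts (per the note above: .pdf, .txt, .md).
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md"}

def collect_documents(data_root: str) -> list[str]:
    """Recursively gather paths of all supported documents under data_root."""
    root = Path(data_root)
    return sorted(
        str(p)
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )

# Hypothetical usage with the programmatic API shown in Option 2:
#   from local_rag import LocalRAG
#   rag = LocalRAG()
#   rag.add_documents(collect_documents("path/to/your/folder"))
```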

Option 2: Use local_rag.py programmatically:

from local_rag import LocalRAG
rag = LocalRAG()
rag.add_documents(["path/to/doc1.pdf", "path/to/doc2.txt"])

Chat GUI

Start the server:

uvicorn server:app --reload

Open http://localhost:8000. The chat UI provides:

  • Provider switch - toggle between Ollama and OpenAI without restarting the server (OpenAI requires OPENAI_API_KEY)
  • Conversation history - multi-turn chat with context
  • Markdown - assistant replies rendered as markdown (headings, code, lists, links)

Ensure the vector store is populated and at least one provider (Ollama or OpenAI) is configured.

API

  • POST /api/chat - { "message": "...", "history": [...], "llm_provider": "ollama"|"openai" }
  • GET /api/providers - { "ollama": true, "openai": true|false }
  • GET /api/health - health and vector store status
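As a sketch, the chat endpoint can be called from Python like this. The field names follow the request shape listed above; the exact response schema is not documented here, so the parsed JSON is returned as-is:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default uvicorn address

def build_chat_payload(message: str, history: list, provider: str = "ollama") -> dict:
    """Assemble the JSON body for POST /api/chat."""
    if provider not in ("ollama", "openai"):
        raise ValueError(f"unknown provider: {provider}")
    return {"message": message, "history": history, "llm_provider": provider}

def post_chat(payload: dict) -> dict:
    """POST the payload to /api/chat and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running server with a populated vector store):
#   reply = post_chat(build_chat_payload("Summarize my documents", [], "openai"))
```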

How it works

  1. Load documents - PDFs or text via PyPDFLoader / TextLoader
  2. Chunk - RecursiveCharacterTextSplitter (2000 chars, 400 overlap)
  3. Embed - sentence-transformers/all-MiniLM-L6-v2
  4. Store - FAISS vector store (similarity search with scores)
  5. Query - retrieve chunks, optionally rephrase with conversation history, generate answer with the selected LLM
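The chunking parameters (2000 characters with 400 overlap) can be illustrated with a simplified fixed-window splitter. The actual pipeline uses LangChain's RecursiveCharacterTextSplitter, which additionally prefers to break on separators such as paragraph and sentence boundaries rather than at fixed offsets:

```python
def split_text(text: str, chunk_size: int = 2000, chunk_overlap: int = 400) -> list[str]:
    """Naive character-window splitter illustrating the size/overlap semantics."""
    step = chunk_size - chunk_overlap  # each new chunk starts 1600 chars after the previous
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Consecutive chunks share 400 characters, so text near a chunk boundary is indexed in both neighboring chunks and is less likely to be split away from its context.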