
Local RAG Setup

Minimal RAG implementation with LangChain and FAISS, supporting either Ollama or OpenAI (API key required) as the LLM backend.

Dependencies

  • langchain - Core framework
  • langchain-community - Loaders, vectorstores
  • langchain-ollama - Ollama integration
  • langchain-openai - OpenAI integration
  • langchain-text-splitters - Text splitting
  • langchain-huggingface - HuggingFace embeddings
  • faiss-cpu - Vector search
  • sentence-transformers - Embeddings
  • pypdf - PDF loading
  • fastapi - Web server
  • uvicorn - ASGI server

Installation

conda create -n local_rag python=3.10 -y
conda activate local_rag
pip install -r requirements.txt

Setup

Ollama (optional)

ollama serve
ollama pull mistral

OpenAI (optional)

Set the API key when using OpenAI:

export OPENAI_API_KEY="your-key"

Add Documents

Option 1: Add PDFs from a folder via script. Edit DATA_ROOT in add_pdfs.py to point at your folder, then run:

python add_pdfs.py

The script clears the existing vector store and recursively indexes all supported documents (.pdf, .txt, .md) under the folder.
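The recursive collection step can be sketched as below. This is an illustrative stand-in for add_pdfs.py, not its actual contents, and collect_documents is a hypothetical helper name:

```python
from pathlib import Path

# Extensions the indexing script accepts (per the note above: .pdf, .txt, .md).
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md"}

def collect_documents(data_root: str) -> list[str]:
    """Recursively gather paths of all supported documents under data_root."""
    root = Path(data_root)
    return sorted(
        str(p)
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )

# Hypothetical usage with the programmatic API shown in Option 2:
#   from local_rag import LocalRAG
#   rag = LocalRAG()
#   rag.add_documents(collect_documents("path/to/your/folder"))
```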

Option 2: Use local_rag.py programmatically:

from local_rag import LocalRAG
rag = LocalRAG()
rag.add_documents(["path/to/doc1.pdf", "path/to/doc2.txt"])

Chat GUI

Start the server:

uvicorn server:app --reload

Open http://localhost:8000. The chat UI provides:

  • Provider switch - toggle between Ollama and OpenAI without restarting the server (OpenAI requires OPENAI_API_KEY)
  • Conversation history - multi-turn chat with context
  • Markdown - assistant replies rendered as markdown (headings, code, lists, links)

Ensure the vector store is populated and at least one provider (Ollama or OpenAI) is configured.

API

  • POST /api/chat - { "message": "...", "history": [...], "llm_provider": "ollama"|"openai" }
  • GET /api/providers - { "ollama": true, "openai": true|false }
  • GET /api/health - health and vector store status
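As a sketch, the chat endpoint can be called from Python like this. The field names follow the request shape listed above; the exact response schema is not documented here, so the parsed JSON is returned as-is:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default uvicorn address

def build_chat_payload(message: str, history: list, provider: str = "ollama") -> dict:
    """Assemble the JSON body for POST /api/chat."""
    if provider not in ("ollama", "openai"):
        raise ValueError(f"unknown provider: {provider}")
    return {"message": message, "history": history, "llm_provider": provider}

def post_chat(payload: dict) -> dict:
    """POST the payload to /api/chat and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running server with a populated vector store):
#   reply = post_chat(build_chat_payload("Summarize my documents", [], "openai"))
```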

How it works

  1. Load documents - PDFs or text via PyPDFLoader / TextLoader
  2. Chunk - RecursiveCharacterTextSplitter (2000 chars, 400 overlap)
  3. Embed - sentence-transformers/all-MiniLM-L6-v2
  4. Store - FAISS vector store (similarity search with scores)
  5. Query - retrieve chunks, optionally rephrase with conversation history, generate answer with the selected LLM
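The chunking parameters (2000 characters with 400 overlap) can be illustrated with a simplified fixed-window splitter. The actual pipeline uses LangChain's RecursiveCharacterTextSplitter, which additionally prefers to break on separators such as paragraph and sentence boundaries rather than at fixed offsets:

```python
def split_text(text: str, chunk_size: int = 2000, chunk_overlap: int = 400) -> list[str]:
    """Naive character-window splitter illustrating the size/overlap semantics."""
    step = chunk_size - chunk_overlap  # each new chunk starts 1600 chars after the previous
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Consecutive chunks share 400 characters, so text near a chunk boundary is indexed in both neighboring chunks and is less likely to be split away from its context.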