# Local RAG Setup

Minimal RAG implementation with LangChain, FAISS, and support for either Ollama or OpenAI (API key required).

## Dependencies

- `langchain` - Core framework
- `langchain-community` - Loaders, vector stores
- `langchain-ollama` - Ollama integration
- `langchain-openai` - OpenAI integration
- `langchain-text-splitters` - Text splitting
- `langchain-huggingface` - HuggingFace embeddings
- `faiss-cpu` - Vector search
- `sentence-transformers` - Embeddings
- `pypdf` - PDF loading
- `fastapi` - Web server
- `uvicorn` - ASGI server

## Installation

```bash
conda create -n local_rag python=3.10 -y
conda activate local_rag
pip install -r requirements.txt
```

## Setup

### Ollama (optional)

```bash
ollama serve
ollama pull mistral
```

### OpenAI (optional)

Set the API key when using OpenAI:

```bash
export OPENAI_API_KEY="your-key"
```

## Add Documents

**Option 1:** Add documents from a folder via script. Edit `DATA_ROOT` in [add_pdfs.py](add_pdfs.py) to point at your folder, then run:

```bash
python add_pdfs.py
```

The script clears the existing vector store and indexes all supported files recursively (`.pdf`, `.txt`, `.md`).

**Option 2:** Use `local_rag.py` programmatically:

```python
from local_rag import LocalRAG

rag = LocalRAG()
rag.add_documents(["path/to/doc1.pdf", "path/to/doc2.txt"])
```

## Chat GUI

Start the server:

```bash
uvicorn server:app --reload
```

Open [http://localhost:8000](http://localhost:8000). The chat UI provides:

- **Provider switch** – Toggle between Ollama and OpenAI without restart (OpenAI requires `OPENAI_API_KEY`)
- **Conversation history** – Multi-turn chat with context
- **Markdown** – Assistant replies rendered as markdown (headings, code, lists, links)

Ensure the vector store is populated and at least one provider (Ollama or OpenAI) is configured.
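Besides the GUI, the running server can be queried directly. A minimal sketch of a chat request using only the standard library (the question text and payload values below are placeholders, not part of the project):

```python
import json
from urllib import request

# Build a request for the chat endpoint; assumes the server started with
# `uvicorn server:app --reload` is listening on port 8000.
payload = {
    "message": "Summarize the indexed documents.",  # placeholder question
    "history": [],               # prior turns, if any
    "llm_provider": "ollama",    # or "openai" (requires OPENAI_API_KEY)
}

req = request.Request(
    "http://localhost:8000/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# With the server running, send the request and print the JSON reply:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

Switching `llm_provider` per request mirrors the GUI's provider toggle; no server restart is needed.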
## API

- `POST /api/chat` – `{ "message": "...", "history": [...], "llm_provider": "ollama"|"openai" }`
- `GET /api/providers` – `{ "ollama": true, "openai": true|false }`
- `GET /api/health` – Health and vector store status

## How it works

1. **Load documents** – PDFs or text via PyPDFLoader / TextLoader
2. **Chunk** – RecursiveCharacterTextSplitter (2000 chars, 400 overlap)
3. **Embed** – sentence-transformers/all-MiniLM-L6-v2
4. **Store** – FAISS vector store (similarity search with scores)
5. **Query** – Retrieve chunks, optionally rephrase with conversation history, generate answer with selected LLM
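The chunking step above can be illustrated in plain Python with the same parameters (2000-character windows, 400-character overlap). This is a simplified sketch: the actual RecursiveCharacterTextSplitter additionally prefers to break on paragraph and sentence boundaries rather than at fixed offsets.

```python
def chunk(text: str, size: int = 2000, overlap: int = 400) -> list[str]:
    """Split text into fixed-size windows where each window repeats the
    last `overlap` characters of the previous one, so no sentence is
    lost at a chunk boundary."""
    step = size - overlap  # each window advances by 1600 characters
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

For example, a 5000-character document yields three chunks (0–2000, 1600–3600, 3200–5000); the 400-character overlap means a retrieved chunk still carries the tail of its predecessor as context.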