# Local RAG Setup

Minimal RAG implementation with LangChain and FAISS, supporting either Ollama or OpenAI (API key required) as the LLM provider.

## Dependencies

- `langchain` - Core framework
- `langchain-community` - Loaders, vectorstores
- `langchain-ollama` - Ollama integration
- `langchain-openai` - OpenAI integration
- `langchain-text-splitters` - Text splitting
- `langchain-huggingface` - HuggingFace embeddings
- `faiss-cpu` - Vector search
- `sentence-transformers` - Embeddings
- `pypdf` - PDF loading
- `fastapi` - Web server
- `uvicorn` - ASGI server

## Installation

```bash
conda create -n local_rag python=3.10 -y
conda activate local_rag
pip install -r requirements.txt
```

## Setup

### Ollama (optional)

```bash
ollama serve
ollama pull mistral
```
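
To check whether the Ollama server is actually reachable before chatting, a generic probe like the following works (this helper is illustrative, not part of the repo; 11434 is Ollama's default port):

```python
import urllib.request


def ollama_running(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an HTTP server answers at base_url.

    Ollama listens on port 11434 by default; adjust if yours differs.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False
```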

### OpenAI (optional)

Set the API key when using OpenAI:

```bash
export OPENAI_API_KEY="your-key"
```
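
The server can only offer the OpenAI provider when this variable is visible to the process. A quick sanity check (generic Python, not code from the repo):

```python
import os


def openai_available() -> bool:
    # True only when OPENAI_API_KEY is set and non-empty.
    return bool(os.environ.get("OPENAI_API_KEY"))
```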

## Add Documents

**Option 1:** Add PDFs from a folder via script. Edit `DATA_ROOT` in [add_pdfs.py](add_pdfs.py) to point at your folder, then run:

```bash
python add_pdfs.py
```

The script clears the existing vector store and recursively indexes every supported file (`.pdf`, `.txt`, `.md`).

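The recursive collection step can be sketched like this (generic Python; the function names are illustrative, not the script's actual code):

```python
from pathlib import Path

# Formats listed above as supported by the indexer.
SUPPORTED = {".pdf", ".txt", ".md"}


def is_supported(path: str) -> bool:
    # Case-insensitive extension check.
    return Path(path).suffix.lower() in SUPPORTED


def collect_files(root: str) -> list[str]:
    # Walk the tree recursively and keep only indexable files,
    # mirroring what add_pdfs.py does before handing paths to the indexer.
    return sorted(str(p) for p in Path(root).rglob("*") if is_supported(str(p)))
```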
**Option 2:** Use `local_rag.py` programmatically:

```python
from local_rag import LocalRAG

rag = LocalRAG()
rag.add_documents(["path/to/doc1.pdf", "path/to/doc2.txt"])
```

## Chat GUI

Start the server:

```bash
uvicorn server:app --reload
```

Open [http://localhost:8000](http://localhost:8000). The chat UI provides:

- **Provider switch** – Toggle between Ollama and OpenAI without restarting (OpenAI requires `OPENAI_API_KEY`)
- **Conversation history** – Multi-turn chat with context
- **Markdown** – Assistant replies rendered as Markdown (headings, code, lists, links)

Ensure the vector store is populated and at least one provider (Ollama or OpenAI) is configured.

## API

- `POST /api/chat` – `{ "message": "...", "history": [...], "llm_provider": "ollama"|"openai" }`
- `GET /api/providers` – `{ "ollama": true, "openai": true|false }`
- `GET /api/health` – Health and vectorstore status
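
For example, a minimal `/api/chat` request body might look like the following (the message text is a placeholder, and an empty `history` is used to avoid assuming the exact shape of history items):

```json
{
  "message": "What do the indexed documents say about FAISS?",
  "history": [],
  "llm_provider": "ollama"
}
```

Saved as `body.json`, this can be posted with `curl -X POST http://localhost:8000/api/chat -H "Content-Type: application/json" -d @body.json`.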
## How it works

1. **Load documents** – PDFs or text via PyPDFLoader / TextLoader
2. **Chunk** – RecursiveCharacterTextSplitter (2000 chars, 400 overlap)
3. **Embed** – sentence-transformers/all-MiniLM-L6-v2
4. **Store** – FAISS vector store (similarity search with scores)
5. **Query** – Retrieve chunks, optionally rephrase with conversation history, generate answer with selected LLM