# Local RAG Setup
Minimal RAG implementation with LangChain, FAISS, and either **Ollama** (local) or **OpenAI** (API key). A web chat UI is included.

---
## What you need (before you start)
- **Python 3.10 or newer** ([python.org](https://www.python.org/downloads/))
- **Git** (optional, only if you clone the project)
- Either:
  - **Ollama** installed and running ([ollama.com](https://ollama.com)), with at least one model pulled, **or**
  - an **OpenAI API key** (if you use OpenAI in the chat)
---
## Install dependencies (step by step)
Install [Miniconda](https://docs.conda.io/en/latest/miniconda.html) or [Anaconda](https://www.anaconda.com/) if you do not have Conda yet.
All commands below assume your terminal is open **in the project folder** (the folder that contains `requirements.txt`).
```bash
conda create -n local_rag python=3.10 -y
conda activate local_rag
pip install --upgrade pip
pip install -r requirements.txt
```
Use `conda activate local_rag` in every new terminal session before running `python` or `uvicorn` for this project.
### OpenAI (only if you use the OpenAI provider in the chat)
In the same terminal **before** starting the server:
```bash
export OPENAI_API_KEY="your-key-here"
```
On Windows (Command Prompt): `set OPENAI_API_KEY=your-key-here`
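A quick sanity check that the key is visible to the current shell (a standalone snippet, not part of the project; the server reads `OPENAI_API_KEY` from the environment at startup):

```python
import os

def openai_available(env=os.environ):
    """True when a server started from this shell would see an OPENAI_API_KEY."""
    return bool(env.get("OPENAI_API_KEY"))

print("OpenAI provider available:", openai_available())
```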

---
## Run Ollama (only if you use Ollama)
In a **separate** terminal:
```bash
ollama serve
```
In another terminal, pull a model once (example):
```bash
ollama pull gpt-oss:20b
```
The model name must match what you configure in `server.py` (see [Configuration reference](#configuration-reference)).
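To confirm Ollama is reachable and see which models you have pulled, you can query its REST API (`/api/tags` lists local models). This standalone snippet is just a convenience check:

```python
import json
import urllib.error
import urllib.request

def list_ollama_models(base_url="http://localhost:11434"):
    """Return the names of pulled models, or None if Ollama is not reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

models = list_ollama_models()
print(models if models is not None else "Ollama not reachable on localhost:11434")
```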

---
## Build the vector store from a folder of PDFs
The project includes [add_pdfs.py](add_pdfs.py). It finds every **`.pdf`** file under a folder you choose (including subfolders), then chunks, embeds, and saves to FAISS.

**Two modes** (set in the script):

| Setting | Behavior |
|---------|----------|
| `CLEAR_VECTORSTORE_FIRST = True` | Deletes the existing vector store folder, then builds a **new** index from the PDFs under `DATA_ROOT`. Use this for a full rebuild. |
| `CLEAR_VECTORSTORE_FIRST = False` | Keeps the current index (if it exists) and **merges** chunks from the PDFs under `DATA_ROOT` into it. Use this to add another batch of PDFs without wiping what you already indexed. |

**Steps:**
1. Open [add_pdfs.py](add_pdfs.py) in a text editor.
2. Set **`DATA_ROOT`** to the folder that contains your PDFs (absolute path or path relative to how you run the script).
3. Set **`CLEAR_VECTORSTORE_FIRST`** to `True` (fresh index) or `False` (append to existing store).
4. Optionally set **`VECTORSTORE_PATH`** (default: `./vectorstore`). It must match **`VECTORSTORE_PATH`** in [server.py](server.py) so the chat loads the same index.
5. From the project folder, with `conda activate local_rag` (or your chosen env name):
   ```bash
   python add_pdfs.py
   ```
Indexing can take a long time for many large PDFs. When it finishes, you should see `Vector store saved to ...`.

**Note:** This script only indexes **PDF** files. To add `.txt` or `.md` files, use the Python snippet below or call `add_documents` yourself.
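The discovery step boils down to a recursive scan for `.pdf` files. A minimal sketch of that logic (illustrative only; the function name is not from the script):

```python
from pathlib import Path

def find_pdfs(root):
    """Recursively collect every .pdf under root, matching the extension
    case-insensitively so files like REPORT.PDF are not skipped."""
    return sorted(p for p in Path(root).rglob("*") if p.suffix.lower() == ".pdf")
```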

---
## Add more documents later (alternative to add_pdfs)
You can also merge files by hand with a short script (any mix of supported types):
```python
from local_rag import LocalRAG

rag = LocalRAG(vectorstore_path="./vectorstore")  # same path as server.py
rag.add_documents([
    "path/to/new1.pdf",
    "path/to/notes.txt",
])
```
`add_documents` merges new chunks into the existing FAISS store and saves it again; this is the same behavior as [add_pdfs.py](add_pdfs.py) with `CLEAR_VECTORSTORE_FIRST = False`.

---
## Swap or experiment with different vector stores
The vector index is stored on disk under the folder given by **`VECTORSTORE_PATH`** (default `./vectorstore`). That folder contains files such as `index.faiss` and `index.pkl`.

**To use a different index:**
1. Set **`VECTORSTORE_PATH`** in both [server.py](server.py) and any script you use to build the index (e.g. [add_pdfs.py](add_pdfs.py)) to the **same** path, e.g. `./vectorstore_experiment`.
2. Rebuild the index (run `add_pdfs.py` or `add_documents`) so that folder is created.
3. **Restart** the web server so it loads the new path at startup.

**Tips:**
- Keep multiple copies of the folder (e.g. `vectorstore_backup`, `vectorstore_papers_only`) and swap `VECTORSTORE_PATH` to switch between them.
- If you change **chunk size**, **embedding model**, or **FAISS** usage in code, treat the old index as incompatible: use a new `VECTORSTORE_PATH` or delete the old folder and rebuild.
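For example, to snapshot the current index before an experiment (a sketch using the default folder name; run it from the project folder):

```python
import shutil
from pathlib import Path

# Copy the on-disk index (index.faiss, index.pkl, ...) to a backup folder.
src, dst = Path("vectorstore"), Path("vectorstore_backup")
if src.is_dir():
    shutil.copytree(src, dst, dirs_exist_ok=True)
    print("Backed up to", dst)
else:
    print("No vector store found at", src.resolve())
```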

---
## Run the chat web app
With the Conda environment activated (`conda activate local_rag`) and (if needed) `OPENAI_API_KEY` set:
```bash
uvicorn server:app --reload
```
Open [http://127.0.0.1:8000](http://127.0.0.1:8000) or [http://localhost:8000](http://localhost:8000).
- Use the **LLM provider** dropdown: **Ollama** or **OpenAI** (OpenAI only works if the server was started with a valid `OPENAI_API_KEY`).
- You need a **non-empty vector store** (see above) for answers to work.
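Once the server is up, `GET /api/health` gives a quick machine-readable check (a standalone snippet; the exact response fields are defined in `server.py`):

```python
import json
import urllib.error
import urllib.request

def server_health(base_url="http://127.0.0.1:8000"):
    """Return the /api/health JSON, or None if the server is not running."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=2) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError):
        return None

print(server_health() or "server not running on 127.0.0.1:8000")
```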

---
## API (short reference)
| Endpoint | Purpose |
|----------|---------|
| `POST /api/chat` | Body: `message`, optional `history`, optional `llm_provider` (`ollama` or `openai`) |
| `GET /api/providers` | Which providers are available (`openai` false if no API key at startup) |
| `GET /api/health` | Server status and whether a vector store is loaded |
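As a sketch, `POST /api/chat` can be called with the body fields from the table above (the reply is returned as parsed JSON, since the response schema is whatever `server.py` defines):

```python
import json
import urllib.error
import urllib.request

def ask(message, history=None, llm_provider="ollama",
        base_url="http://127.0.0.1:8000"):
    """Send one chat turn; returns the server's JSON reply, or None if the
    server is not running."""
    body = json.dumps({
        "message": message,
        "history": history or [],      # shape of history entries: see server.py
        "llm_provider": llm_provider,  # "ollama" or "openai"
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError):
        return None
```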

---
## How it works (high level)
1. **Load documents:** PDFs via `PyPDFLoader`, text via `TextLoader`.
2. **Chunk:** `RecursiveCharacterTextSplitter` (defaults in [local_rag.py](local_rag.py)).
3. **Embed:** Hugging Face `sentence-transformers/all-MiniLM-L6-v2`.
4. **Store:** FAISS; retrieval uses `similarity_search_with_score`.
5. **Query:** optional rephrase using chat history, then retrieval, then an answer from the LLM.
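What chunk size and overlap mean can be shown with a deliberately simplified splitter (the real `RecursiveCharacterTextSplitter` also prefers paragraph and sentence boundaries; the 500/50 values here are examples, not the project's defaults):

```python
def chunk(text, size=500, overlap=50):
    """Naive fixed-width chunking: each chunk starts size - overlap characters
    after the previous one, so neighbouring chunks share overlap characters."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

print([len(p) for p in chunk("x" * 1200)])  # → [500, 500, 300]
```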

---
## Configuration reference (what to edit)
These are the main places to change behavior without restructuring the app.
### [server.py](server.py)
| What | Where |
|------|--------|
| Ollama model name | `OLLAMA_MODEL = "..."` |
| OpenAI model name | `OPENAI_MODEL = "..."` |
| Where the FAISS index is loaded from | `VECTORSTORE_PATH = "./vectorstore"` (must match your indexing script) |
### [local_rag.py](local_rag.py) `LocalRAG.__init__`
| What | Where (approx.) |
|------|------------------|
| Default vector store folder | Parameter `vectorstore_path="./vectorstore"` |
| Embedding model | `HuggingFaceEmbeddings(model_name="sentence-transformers/...")` |
| Chunk size and overlap | Module-level `CHUNK_SIZE` and `CHUNK_OVERLAP` (used by `RecursiveCharacterTextSplitter` when adding documents) |
| Default Ollama / OpenAI model strings | Parameters `ollama_model`, `openai_model`, `ollama_base_url` |
Changing the embedding model or chunk settings requires **rebuilding** the vector store (old index is not compatible).
### [local_rag.py](local_rag.py) `query_with_history`
| What | Where |
|------|--------|
| Default number of chunks retrieved (`k`) | Module-level `RETRIEVAL_K` (overrides: pass `k=` to `query` / `query_with_history`) |
| Extra text appended only to the **FAISS query** (biases retrieval, not the final answer phrasing) | `QUERY_ADDITIONAL_INSTRUCTIONS` (concatenated to the search query before embedding) |
| **Rephrase** prompt (standalone question when there is chat history) | String `rephrase_prompt = f"""..."""` inside `query_with_history` |
| **Answer** prompt opening instructions only | Module-level `ANSWER_PROMPT` (edit the role / style lines). The block from chat history through `Answer:` is built in `query_with_history` |
### [add_pdfs.py](add_pdfs.py)
| What | Where |
|------|--------|
| Folder to scan for PDFs | `DATA_ROOT = Path("...")` |
| Output vector store folder | `VECTORSTORE_PATH = "./vectorstore"` (keep in sync with `server.py`) |
| Wipe index vs merge | `CLEAR_VECTORSTORE_FIRST = True` (delete and rebuild) or `False` (append to existing index) |
---
## Dependencies (for developers)
See [requirements.txt](requirements.txt) for the full list (LangChain, FAISS, sentence-transformers, FastAPI, uvicorn, etc.).