# Local RAG Setup

Minimal RAG implementation with LangChain, Ollama, and FAISS.
## Dependencies

- `langchain` - Core framework
- `langchain-community` - Community integrations (loaders, vectorstores)
- `langchain-ollama` - Ollama integration
- `langchain-text-splitters` - Text splitting utilities
- `langchain-huggingface` - HuggingFace embeddings
- `faiss-cpu` - Vector search
- `sentence-transformers` - Embeddings
- `pypdf` - PDF loading
- `fastapi` - Web server and API
- `uvicorn` - ASGI server
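Since the Installation step below installs from `requirements.txt`, the dependency list above would translate to something like the following (version pins are omitted here; pin them as your project requires):

```text
langchain
langchain-community
langchain-ollama
langchain-text-splitters
langchain-huggingface
faiss-cpu
sentence-transformers
pypdf
fastapi
uvicorn
```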
## Installation

```bash
# Create conda environment
conda create -n local_rag python=3.10 -y
conda activate local_rag

# Install dependencies
pip install -r requirements.txt
```
## Setup Ollama

```bash
# Make sure Ollama is running
ollama serve

# Pull a model (in another terminal)
ollama pull llama2
```
## Usage

Edit `local_rag.py` and uncomment the example code:

```python
# Add documents
rag.add_documents([
    "path/to/document1.pdf",
    "path/to/document2.txt"
])

# Query
question = "What is this document about?"
answer = rag.query(question)
print(f"Answer: {answer}")
```

Run:

```bash
python local_rag.py
```
## Chat GUI (FastAPI)

A simple web chat interface is included. Start the server:

```bash
uvicorn server:app --reload
```
Then open [http://localhost:8000](http://localhost:8000) in your browser. The chat view uses the same RAG system: your messages are answered using the vector store and Ollama. Ensure your vector store is populated (e.g. by running the document-add steps in `local_rag.py` once) and that Ollama is running.
## How it works

1. **Load documents** - PDFs or text files
2. **Split into chunks** - 1000 chars with 200 overlap
3. **Create embeddings** - Using sentence-transformers
4. **Store in FAISS** - Fast similarity search
5. **Query** - Retrieve relevant chunks and generate answer with Ollama
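Step 2 (1000-character chunks with 200-character overlap) can be sketched in plain Python. The project presumably uses a splitter from `langchain-text-splitters` for this; the sketch below only illustrates the sliding-window idea behind those numbers:

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Sliding-window splitter: each chunk starts `chunk_size - overlap`
    characters after the previous one, so adjacent chunks share `overlap`
    characters of context."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_into_chunks("".join(str(i % 10) for i in range(2500)))
print(len(chunks))                          # → 4
print(chunks[0][-200:] == chunks[1][:200])  # → True (overlapping region matches)
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks, which keeps retrieval from returning fragments with no context.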