Implement on-prem RAG with Qdrant and Ollama for a self-hosted KB

This n8n template provides a self-hosted RAG (retrieval-augmented generation) implementation.

How it works
The template provides one workflow to maintain the knowledge base and another to query it. Uploaded documents are embedded and saved into the Qdrant vector store. When a query is made, the most relevant documents are retrieved from the vector store and sent to the LLM as context for generating a response.
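The retrieval step described above can be sketched in a few lines: embed the query, rank the stored document vectors by similarity, and hand the top hits to the LLM as context. The toy snippet below illustrates the ranking with made-up 3-dimensional vectors (real embeddings from nomic-embed-text have 768 dimensions); names like `knowledge_base` are illustrative, not part of the template.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "vector store": document text paired with a pretend embedding.
knowledge_base = [
    ("Qdrant stores vectors",      [0.9, 0.1, 0.0]),
    ("Ollama serves local LLMs",   [0.1, 0.9, 0.0]),
    ("n8n orchestrates workflows", [0.0, 0.2, 0.9]),
]

query_vec = [0.8, 0.2, 0.1]  # pretend embedding of the user's question

# Rank documents by similarity; the best matches become the LLM's context.
ranked = sorted(knowledge_base, key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
context = ranked[0][0]
print(context)
```

In the actual workflow, Qdrant performs this ranking server-side and Ollama supplies both the embeddings and the chat completion.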

How to use
1. Start the workflow by clicking Execute workflow.
2. Use the file upload form to upload a document into the knowledge base (the Qdrant collection).
3. Click Open chat to start asking questions about the uploaded documents.

Setup steps
The steps below show how to set up on Amazon Linux; consult your OS documentation for the equivalent steps.

Install Ollama on-premises:

```shell
mkdir ollama
cd ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
```

Install the required models (on Amazon Linux):

```shell
ollama pull llama3:8b
ollama pull mistral:7b
ollama pull nomic-embed-text:latest
```

Ollama is then reachable at http://localhost:11434.

Fire up Qdrant (e.g. via Docker):

```shell
docker run -p 6333:6333 qdrant/qdrant
```

Access Qdrant via http://localhost:6333/dashboard. Create a Qdrant collection named knowledge-base configured with a vector length of 768. NB: do not forget a persistent Docker volume for Qdrant if you want to keep the data when using Docker. Finally, point the workflow nodes to your on-premises Qdrant and Ollama runtimes.
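The setup above can be sanity-checked and the collection created programmatically. The sketch below assumes the default ports (11434 for Ollama, 6333 for Qdrant) and a running instance of each; the Cosine distance metric is an assumption, since the template only specifies the vector size (768, matching the output dimension of nomic-embed-text). The collection can equally be created from the Qdrant dashboard.

```python
# Sketch: verify the local Ollama endpoint, then create the Qdrant
# collection the template expects. Assumes Ollama on :11434 and
# Qdrant on :6333, started as described above.
import requests

# Request an embedding to confirm nomic-embed-text is available and
# produces 768-dimensional vectors, matching the collection config below.
emb = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello"},
    timeout=30,
).json()
assert len(emb["embedding"]) == 768

# Create the "knowledge-base" collection via Qdrant's REST API.
# "Cosine" is an assumed distance metric; adjust if your workflow differs.
resp = requests.put(
    "http://localhost:6333/collections/knowledge-base",
    json={"vectors": {"size": 768, "distance": "Cosine"}},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```

This is a setup/configuration step rather than part of the workflow itself; once the collection exists, the n8n nodes only need its name and the two base URLs.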

Need Help? Join the Discord or ask in the Forum!

Happy RAGing!

Author: Mabura Ze Guru
Created: 2/21/2026
Updated: 4/12/2026
