Build a local RAG chatbot with Ollama, Qwen, BGE-M3 and Postgres PGVector
Build a fully local RAG chatbot using Ollama that works without tool calling — ideal for smaller open-source models like Qwen that don't support native function calls. This template lets you run a private, self-hosted AI assistant with retrieval-augmented generation using only your own hardware.
How it works
- A Webhook receives the user's chat message.
- A small classifier LLM (Qwen 7B) analyzes the input and decides: is this small talk, or a real question that needs the knowledge base?
- For small talk, a dedicated AI agent responds conversationally with chat memory.
- For real questions, the classifier generates focused sub-queries, which are sent through a loop-based RAG pipeline:
  - Each sub-query is embedded using BGE-M3 and matched against a Postgres PGVector store.
  - Results are filtered by a relevance score threshold (>0.4).
  - Chunks are aggregated and deduplicated across all sub-queries.
- An Answer Generator agent (Qwen 14B) produces a sourced answer using a strict 3-step format: short answer → sources → follow-up question.
- Both paths use Postgres-backed chat memory for multi-turn conversations.
- A post-processing step removes <think> tags that some reasoning models produce.
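The filtering, deduplication, and <think>-stripping steps can be sketched in a few lines of Python. This is only an illustration of the logic, not the template's actual node code: the Chunk type, the function names, and the choice to deduplicate on (source, text) while keeping the highest score are all assumptions. Only the 0.4 threshold and the <think> tag come from the description above.

```python
import re
from dataclasses import dataclass

SCORE_THRESHOLD = 0.4  # relevance cutoff named in the template description


@dataclass(frozen=True)
class Chunk:
    """Hypothetical shape of one retrieved chunk."""
    text: str
    source: str
    score: float


def filter_and_dedupe(results_per_query, threshold=SCORE_THRESHOLD):
    """Merge the results of all sub-queries into one ranked list.

    Chunks at or below the threshold are dropped. If the same chunk comes
    back for several sub-queries, only the highest-scoring copy is kept
    (an assumption; the template may deduplicate differently).
    """
    best = {}
    for results in results_per_query:
        for chunk in results:
            if chunk.score <= threshold:
                continue
            key = (chunk.source, chunk.text)
            if key not in best or chunk.score > best[key].score:
                best[key] = chunk
    # Highest relevance first, so the answer generator sees the best context early.
    return sorted(best.values(), key=lambda c: c.score, reverse=True)


# Non-greedy match so several <think> blocks are removed independently;
# DOTALL lets the reasoning span multiple lines.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think(text):
    """Remove <think>...</think> blocks emitted by some reasoning models."""
    return THINK_BLOCK.sub("", text).strip()
```

Keeping the best score per chunk means a passage that matches several sub-queries appears once in the final context instead of being repeated.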
Set up steps
1. Install Ollama and pull the required models:
   - ollama pull qwen2.5:7b (classifier + small talk)
   - ollama pull qwen3:14b (answer generation)
   - ollama pull bge-m3 (embeddings)
2. Set up PostgreSQL with the pgvector extension enabled.
3. Create your vector store: ingest your documents into the PGVector store using BGE-M3 embeddings (you can use n8n's built-in document loaders for this).
4. Configure credentials in n8n:
   - Ollama connection (default: http://localhost:11434)
   - PostgreSQL connection for both chat memory and vector store
5. Customize the webhook path and connect it to your frontend or API client.
6. Optional: Adjust the relevance score threshold, swap models for larger/smaller ones, or modify the system prompts to match your use case.
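To check the Ollama and pgvector setup outside n8n, a small Python sketch like the following may help. It assumes Ollama's /api/embed endpoint on the default address and uses only the standard library. The table name documents, its column names, and the use of cosine similarity as the score are placeholders for illustration and may not match what the n8n vector store node creates, so adjust them to your own schema.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from the setup steps
EMBED_MODEL = "bge-m3"


def build_embed_request(text, base_url=OLLAMA_URL, model=EMBED_MODEL):
    """Build (but do not send) an embedding request for Ollama's /api/embed."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text):
    """Send the request to a running Ollama server and return one vector."""
    with urllib.request.urlopen(build_embed_request(text)) as resp:
        return json.loads(resp.read())["embeddings"][0]


def to_pgvector_literal(vector):
    """Format a list of numbers as pgvector's text form, e.g. '[0.1,2.0]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


# <=> is pgvector's cosine distance operator, so 1 - distance gives a
# cosine similarity that can be compared with the 0.4 threshold.
# Table and column names below are hypothetical.
SEARCH_SQL = """
SELECT text, metadata, 1 - (embedding <=> %s::vector) AS score
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s
"""
```

With a PostgreSQL driver of your choice, the query would be run with the parameters (literal, literal, k), where literal = to_pgvector_literal(embed("your question")).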