Build an OpenAI RAG system with document upload, semantic search and caching
Overview
This workflow implements a complete Retrieval-Augmented Generation (RAG) system for document ingestion and intelligent querying. It lets users upload documents, convert them into vector embeddings, and query them in natural language. The system retrieves relevant document context and generates accurate AI responses, using caching to improve performance and reduce costs.
This workflow is ideal for building AI knowledge bases, document assistants, and internal search systems.
How It Works
Input & Configuration
- Receives requests via webhook (rag-system)
- Supports two actions: upload → process documents; query → answer questions
- Defines: chunk size & overlap, TopK retrieval count, database table names
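As a sketch, the two webhook payload shapes might look like the following (field names beyond action, user_id, document, and query are assumptions, not the template's exact contract):

```python
import json

# Hypothetical payload for the upload action (the document encoding is an assumption)
upload_payload = {
    "action": "upload",
    "user_id": "user-123",
    "document": "<base64-encoded PDF or raw text>",
}

# Hypothetical payload for the query action
query_payload = {
    "action": "query",
    "user_id": "user-123",
    "query": "What is our refund policy?",
}

# The webhook would receive these as JSON bodies
print(json.dumps(upload_payload))
print(json.dumps(query_payload))
```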
Document Upload Flow
Text Extraction: Extracts text from uploaded PDF documents
Text Chunking: Splits text into overlapping chunks for better retrieval accuracy
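The chunking step can be sketched in plain Python; chunk_size and overlap mirror the workflow's configurable parameters, and the function name is illustrative:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks so that context spanning a
    chunk boundary is still retrievable from at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# 2500 characters with the defaults yields windows at 0, 800, 1600, 2400
parts = chunk_text("a" * 2500, chunk_size=1000, overlap=200)
```

Each consecutive pair of chunks shares 200 characters, which is what makes boundary-spanning sentences retrievable.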
Document Structuring: Converts chunks into structured documents
Embedding Generation: Generates vector embeddings using OpenAI
Vector Storage: Stores embeddings in PGVector (Postgres)
Upload Logging: Logs document metadata (user, filename, timestamp)
Response: Returns a success message via the webhook
Query Flow
Cache Check: Checks whether a result for the query exists in the cache (within the last hour)
Cache Routing:
- If cached → return the cached response
- If not → proceed to retrieval
Cache Hit Flow
Format Cached Response: Standardizes the cached output format
Respond to User: Returns the cached answer with cached: true
Cache Miss Flow
Vector Retrieval: Retrieves the top-K relevant document chunks from PGVector
AI Answer Generation: Uses an LLM with the retrieved context to generate an accurate, context-based answer
Cache Storage: Saves the query and response in the database for reuse
Response: Returns the generated answer with cached: false
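Conceptually, PGVector ranks stored chunk embeddings by similarity to the query embedding and returns the K best. A pure-Python cosine-similarity sketch of that idea (not PGVector's actual implementation, which indexes and ranks inside Postgres):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], docs: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """docs: (chunk_text, embedding) pairs; returns the k most similar chunks."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy 2-dimensional embeddings for illustration
docs = [
    ("about cats", [1.0, 0.0]),
    ("about dogs", [0.0, 1.0]),
    ("cats and dogs", [0.7, 0.7]),
]
hits = top_k([1.0, 0.1], docs, k=2)
```

The retrieved chunks are then concatenated into the LLM prompt as context for answer generation.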
Setup Instructions
Webhook Setup: Configure the endpoint (rag-system) and send a payload with: action (upload / query), user_id, and document or query
OpenAI Setup: Add API credentials for the embeddings and chat models
Postgres + PGVector: Enable the PGVector extension and create the tables: documents, query_cache, upload_log
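One possible shape for the three tables, shown as SQL strings; the column names and the 1536-dimension embedding (matching OpenAI's text-embedding-3-small) are assumptions, not the template's actual schema:

```python
# Hedged sketch of DDL for the three tables the workflow expects.
TABLES = {
    "documents": """
        CREATE EXTENSION IF NOT EXISTS vector;
        CREATE TABLE documents (
            id        BIGSERIAL PRIMARY KEY,
            user_id   TEXT NOT NULL,
            content   TEXT NOT NULL,
            embedding VECTOR(1536)
        );
    """,
    "query_cache": """
        CREATE TABLE query_cache (
            query      TEXT PRIMARY KEY,
            response   TEXT NOT NULL,
            created_at TIMESTAMPTZ DEFAULT now()
        );
    """,
    "upload_log": """
        CREATE TABLE upload_log (
            id          BIGSERIAL PRIMARY KEY,
            user_id     TEXT,
            filename    TEXT,
            uploaded_at TIMESTAMPTZ DEFAULT now()
        );
    """,
}
```

The embedding dimension must match whichever OpenAI embedding model you configure, or inserts will fail.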
Configure Parameters: Adjust chunk size (e.g., 1000), overlap (e.g., 200), and TopK (e.g., 5)
Optional Enhancements: Add an authentication layer and multi-tenant filtering (user_id)
Use Cases
AI document search systems
Internal knowledge base assistants
Customer support knowledge retrieval
Legal or compliance document analysis
SaaS AI chat with custom data
Requirements
OpenAI API key
Postgres database with PGVector
n8n instance (cloud or self-hosted)
Key Features
Full RAG architecture (upload + query)
PDF document ingestion pipeline
Semantic search with vector embeddings
Context-aware AI responses
Query caching for performance optimization
Multi-user support via metadata filtering
Scalable and modular design
Summary
A complete RAG-based AI system that enables document ingestion, semantic search, and intelligent query answering. It combines vector databases, LLMs, and caching to deliver fast, accurate, and scalable AI-powered knowledge retrieval.