Route AI queries cost‑efficiently with GPT‑4o‑mini, GPT‑4o and confidence scoring
This workflow implements a cost-optimized AI routing system using n8n. It intelligently decides whether a request should be handled by a low-cost model or escalated to a higher-quality model based on response confidence.
The goal is to minimize LLM usage costs while maintaining high answer quality.
A query is first processed by a cheaper model. The response is then evaluated by a confidence-scoring AI agent. If the response quality is insufficient, the workflow automatically escalates the request to a more capable model.
This approach is useful for building scalable AI systems where most queries can be answered cheaply, while complex queries still receive high-quality responses.
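The routing idea described above can be sketched in a few lines of Python. This is a minimal sketch, not the workflow's actual implementation: `route_query`, `Evaluation`, and `RoutedAnswer` are hypothetical names, the two models and the evaluator are passed in as plain callables, and the 0-to-1 confidence scale and the default threshold of 0.7 are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Evaluation:
    confidence: float  # assumed to be on a 0.0-1.0 scale
    explanation: str


@dataclass
class RoutedAnswer:
    text: str
    model_used: str
    confidence: float
    escalated: bool


def route_query(
    query: str,
    cheap_model: Callable[[str], str],
    expensive_model: Callable[[str], str],
    evaluate: Callable[[str, str], Evaluation],
    threshold: float = 0.7,  # assumed default; configurable in the workflow
) -> RoutedAnswer:
    """Answer with the cheap model first; escalate only on low confidence."""
    draft = cheap_model(query)
    verdict = evaluate(query, draft)
    if verdict.confidence >= threshold:
        # The cheap answer is good enough, so no second model call is made.
        return RoutedAnswer(draft, "gpt-4o-mini", verdict.confidence, False)
    # Confidence is below the threshold: reprocess with the stronger model.
    better = expensive_model(query)
    return RoutedAnswer(better, "gpt-4o", verdict.confidence, True)
```

Passing the models in as callables keeps the sketch independent of any particular client library; in the workflow itself these roles are played by n8n nodes.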
How It Works
Webhook Trigger: Receives a user query from an external application.
Workflow Configuration: Defines parameters such as the confidence threshold, the cheap model cost, and the expensive model cost.
Cheap Model Response: The query is first processed using GPT-4o-mini to minimize cost.
Confidence Evaluation: An AI agent analyzes the response quality. It evaluates accuracy, completeness, clarity, and relevance.
Structured Output Parsing: The evaluator returns structured data including a confidence score, an explanation, and an escalation recommendation.
Decision Logic: If the confidence score is below the configured threshold, the workflow escalates the request.
Expensive Model Escalation: The query is reprocessed using GPT-4o for a higher-quality answer.
Cost Calculation: Token usage is analyzed to estimate the total cost and the cost difference between models.
Final Response Formatting: The workflow returns the AI response, the model used, the confidence score, the escalation status, and the estimated cost.
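The cost calculation step can be illustrated with a small worked example. This is a sketch under stated assumptions: the function names and output keys are hypothetical, the per-1k-token rates are illustrative placeholders rather than real prices, the evaluator's own token usage is ignored, and "cost difference" is interpreted here as what the cheap call would have cost on the expensive model minus what it actually cost.

```python
def tokens_to_cost(tokens: int, cost_per_1k_tokens: float) -> float:
    """Convert a token count into an estimated cost at a per-1k-token rate."""
    return tokens / 1000 * cost_per_1k_tokens


def estimate_costs(
    cheap_tokens: int,
    expensive_tokens: int,  # 0 when the request was not escalated
    cheap_rate: float,      # plays the role of cheapModelCostPer1kTokens
    expensive_rate: float,  # plays the role of expensiveModelCostPer1kTokens
) -> dict:
    cheap_cost = tokens_to_cost(cheap_tokens, cheap_rate)
    expensive_cost = tokens_to_cost(expensive_tokens, expensive_rate)
    # Hypothetical price of sending the cheap call's tokens to the expensive model.
    same_tokens_on_expensive = tokens_to_cost(cheap_tokens, expensive_rate)
    return {
        "totalCost": cheap_cost + expensive_cost,
        "costDifference": same_tokens_on_expensive - cheap_cost,
    }


# Placeholder rates: 0.001 and 0.01 per 1k tokens (illustrative only).
not_escalated = estimate_costs(2000, 0, cheap_rate=0.001, expensive_rate=0.01)
escalated = estimate_costs(2000, 1000, cheap_rate=0.001, expensive_rate=0.01)
```

With these placeholder numbers, an escalated request pays for both calls, which is why the threshold matters: setting it too high sends most queries through two models instead of one.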
Setup Instructions
Create an OpenAI credential in n8n.
Configure the following nodes: Cheap Model (GPT-4o-mini), Expensive Model (GPT-4o), and the OpenAI Chat Model used by the confidence evaluator agent.
Adjust configuration values in the Workflow Configuration node: confidenceThreshold, cheapModelCostPer1kTokens, and expensiveModelCostPer1kTokens.
Deploy the workflow and send requests to the Webhook URL.
Example webhook payload:
{ "query": "Explain how photosynthesis works." }
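A request like the one above could be sent from Python using only the standard library. This is a minimal sketch: the webhook URL shown in the usage comment is a placeholder to be replaced with the URL of your own deployment, `build_request` and `send_query` are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
import urllib.request


def build_request(webhook_url: str, query: str) -> urllib.request.Request:
    """Build a POST request carrying the query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(webhook_url: str, query: str, timeout: float = 60.0) -> dict:
    """Send the query and parse the reply (assumed to be JSON)."""
    request = build_request(webhook_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (placeholder URL, not a real endpoint):
# result = send_query("https://your-n8n-host/webhook/your-path",
#                     "Explain how photosynthesis works.")
```

A generous timeout is used because an escalated request involves up to three model calls (cheap model, evaluator, expensive model) before the workflow responds.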