Route AI queries cost‑efficiently with GPT‑4o‑mini, GPT‑4o and confidence scoring
This workflow implements a cost-optimized AI routing system using n8n. It intelligently decides whether a request should be handled by a low-cost model or escalated to a higher-quality model based on response confidence.
The goal is to minimize LLM usage costs while maintaining high answer quality.
A query is first processed by a cheaper model. The response is then evaluated by a confidence-scoring AI agent. If the response quality is insufficient, the workflow automatically escalates the request to a more capable model.
This approach is useful for building scalable AI systems where most queries can be answered cheaply, while complex queries still receive high-quality responses.
How It Works
Webhook Trigger Receives a user query from an external application.
Workflow Configuration Defines parameters such as: confidence threshold cheap model cost expensive model cost
Cheap Model Response The query is first processed using GPT-4o-mini to minimize cost.
Confidence Evaluation An AI agent analyzes the response quality. It evaluates accuracy, completeness, clarity, and relevance.
Structured Output Parsing The evaluator returns structured data including: confidence score explanation escalation recommendation.
Decision Logic If the confidence score is below the configured threshold, the workflow escalates the request.
Expensive Model Escalation The query is reprocessed using GPT-4o for a higher-quality answer.
Cost Calculation Token usage is analyzed to estimate: total cost cost difference between models.
Final Response Formatting The workflow returns: AI response model used confidence score escalation status estimated cost.
Setup Instructions
Create an OpenAI credential in n8n.
Configure the following nodes: Cheap Model (GPT-4o-mini) Expensive Model (GPT-4o) OpenAI Chat Model used by the confidence evaluator agent.
Adjust configuration values in the Workflow Configuration node: confidenceThreshold cheapModelCostPer1kTokens expensiveModelCostPer1kTokens
Deploy the workflow and send requests to the Webhook URL.
Example webhook payload:
{ "query": "Explain how photosynthesis works." }
Related Templates
Use OpenRouter in n8n versions <1.78
What it is: In version 1.78, n8n introduced a dedicated node to use the OpenRouter service, which lets you to use a lot...
AI SEO Readability Audit: Check Website Friendliness for LLMs
Who is this for? This workflow is designed for SEO specialists, content creators, marketers, and website developers who ...
Text automations using Apple Shortcuts
Overview This workflow answers user requests sent via Mac Shortcuts Several Shortcuts call the same webhook, with a quer...
🔒 Please log in to import templates to n8n and favorite templates
Workflow Visualization
Loading...
Preparing workflow renderer
Comments (0)
Login to post comments