Generate consensus-based answers using Claude, GPT, Grok and Gemini
The original LLM Council concept was introduced by Andrej Karpathy and published as an open-source repository demonstrating multi-model consensus and ranking. This workflow is my adaptation of that original idea, reimplemented and structured as a production-ready n8n template. Original repository - https://github.com/karpathy/llm-council
This n8n template implements the LLM Council pattern: a single user question is processed in parallel by multiple large language models, independently evaluated by peer models, and then synthesized into one high-quality, consensus-driven final answer. It is designed for use cases where answer quality, balance, and reduced single-model bias are critical.
Section 1: Trigger & Input
When Chat Message Received (Chat Trigger)
Purpose: Receives a user's message and initiates the entire workflow.
How it works:
A user sends a chat message
The message is stored as the Original Question
The same input is forwarded simultaneously to multiple LLM pipelines
Why it matters: Provides a clean, unified entry point for all downstream multi-model logic.
Section 2: Stage 1 - Parallel LLM Responses
Basic LLM Chains (x4)
Models used:
Anthropic Claude
OpenAI GPT
xAI Grok
Google Gemini
Purpose: Each model independently generates its own response to the same question.
Key characteristics:
Identical prompt structure for all models
Independent reasoning paths
No shared context between models
Why it matters: Produces diverse perspectives, reasoning styles, and solution approaches.
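As a rough illustration of this fan-out, the sketch below sends one identical prompt to four model callers concurrently. The names `councilModels`, `buildPrompt`, and `runStageOne` are hypothetical, and the callers are stubs standing in for the Basic LLM Chain nodes; in the actual template the parallelism comes from the n8n canvas wiring, not from code.

```javascript
// Hypothetical stubs: each stands in for one Basic LLM Chain node.
const councilModels = {
  claude: async (prompt) => `[claude] ${prompt}`,
  gpt: async (prompt) => `[gpt] ${prompt}`,
  grok: async (prompt) => `[grok] ${prompt}`,
  gemini: async (prompt) => `[gemini] ${prompt}`,
};

// Every model receives exactly the same prompt text (assumed wording).
function buildPrompt(question) {
  return `Answer the following question as well as you can.\n\nQuestion: ${question}`;
}

// Call all models concurrently; no model sees another model's output.
async function runStageOne(question, modelFns) {
  const prompt = buildPrompt(question);
  const names = Object.keys(modelFns);
  const responses = await Promise.all(names.map((name) => modelFns[name](prompt)));
  return names.map((name, i) => ({ model: name, response: responses[i] }));
}
```

Calling `runStageOne('What is n8n?', councilModels)` resolves to an array with one `{ model, response }` entry per model, in the order the models were declared.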
Section 3: Stage 2 - Response Anonymization
Set Nodes (Response A / B / C / D)
Purpose: Stores model outputs in an anonymized format:
Response A
Response B
Response C
Response D
Why it matters: Prevents evaluator models from knowing which LLM authored which response, reducing bias during evaluation.
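The anonymization step can be pictured as a simple relabeling. The following is a minimal sketch, assuming the model outputs arrive as an array of `{ model, response }` objects; the function name `anonymize` and the `labelToModel` lookup are illustrative and not part of the template, which uses Set nodes for this.

```javascript
// Replace model names with neutral labels: Response A, Response B, ...
function anonymize(modelOutputs) {
  const labelToModel = {}; // kept aside so labels can be mapped back later
  const anonymized = modelOutputs.map(({ model, response }, i) => {
    const label = `Response ${String.fromCharCode(65 + i)}`; // 65 is 'A'
    labelToModel[label] = model;
    return { label, response }; // the model name is deliberately dropped
  });
  return { anonymized, labelToModel };
}
```

Only the `anonymized` array would be passed to the evaluators; the `labelToModel` map stays out of their prompts so the authoring model cannot be inferred from the label.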
Section 4: Stage 3 - Peer Evaluation & Ranking
Evaluation Chains (Claude / GPT / Grok / Gemini)
Purpose: Each model acts as a reviewer and:
Analyzes all four anonymized responses
Describes strengths and weaknesses of each
Produces a strict FINAL RANKING from best to worst
Ranking format (strict):
FINAL RANKING:
Response B
Response A
Response D
Response C
Why it matters: Creates multiple independent quality assessments from different model perspectives.
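Because the ranking format is strict, an evaluator's free-text review can be reduced to an ordered list of labels. The sketch below is one possible parser, not the template's actual code; `parseRanking` is a hypothetical name, and it assumes labels are limited to Response A through Response D and that everything after the last `FINAL RANKING:` marker is the ranking.

```javascript
// Extract the ordered labels that follow the last "FINAL RANKING:" marker.
function parseRanking(evaluationText) {
  const marker = 'FINAL RANKING:';
  const start = evaluationText.lastIndexOf(marker);
  if (start === -1) return []; // evaluator did not follow the format
  const tail = evaluationText.slice(start + marker.length);
  const labels = tail.match(/Response [A-D]/g) || [];
  return [...new Set(labels)]; // drop repeats, keep first-seen order
}
```

The analysis text before the marker is ignored, so mentions of a response in the strengths and weaknesses discussion do not affect the parsed order. An empty array signals an evaluation that could not be parsed and can be skipped during aggregation.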
Section 5: Stage 4 - Ranking Aggregation
Code Node (JavaScript)
Purpose: Aggregates all peer rankings by:
Parsing ranking positions
Calculating average position per response
Counting how many evaluations included each response
Sorting responses by best average score
Output includes:
Aggregated rankings
Best response label
Best average score
Why it matters: Transforms subjective rankings into a structured, quantitative consensus.
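The aggregation described here amounts to averaging each response's position across evaluators. The following is a sketch under assumptions: each ranking is already an array of labels ordered best to worst, and the names `aggregateRankings`, `averagePosition`, and `evaluations` are illustrative. The template's own Code node may structure its input and output differently.

```javascript
// Average each label's 1-based position across all evaluator rankings.
function aggregateRankings(rankings) {
  const totals = {};
  for (const ranking of rankings) {
    ranking.forEach((label, index) => {
      if (!totals[label]) totals[label] = { sum: 0, count: 0 };
      totals[label].sum += index + 1; // position 1 is best
      totals[label].count += 1;       // how many evaluators ranked this label
    });
  }
  const aggregated = Object.entries(totals)
    .map(([label, { sum, count }]) => ({
      label,
      averagePosition: sum / count,
      evaluations: count,
    }))
    .sort((a, b) => a.averagePosition - b.averagePosition); // lowest average first
  const best = aggregated[0] || null;
  return {
    aggregated,
    bestResponse: best ? best.label : null,
    bestAverage: best ? best.averagePosition : null,
  };
}
```

For example, if Response B is ranked first by three evaluators and second by one, its average position is (1 + 1 + 1 + 2) / 4 = 1.25, and it is reported as the best response as long as no other label has a lower average.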
Section 6: Stage 5 - Final Consensus Answer
Chairman LLM Chain
Purpose: One model acts as the Council Chairman and:
Reviews all original responses
Considers peer rankings and aggregated scores
Identifies consensus patterns and disagreements
Produces a single, clear, high-quality final answer
Why it matters: Delivers a refined response that reflects collective model intelligence rather than a simple average.
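To make the chairman's inputs concrete, the sketch below assembles a single prompt from the question, the anonymized responses, and the aggregated ranking. The prompt wording, the function name `buildChairmanPrompt`, and the field names `label`, `response`, `averagePosition`, and `evaluations` are all assumptions for illustration; the template's real chairman prompt is not reproduced here.

```javascript
// Assemble the chairman's prompt from the question, responses, and consensus ranking.
function buildChairmanPrompt(question, anonymized, aggregated) {
  const responsesBlock = anonymized
    .map(({ label, response }) => `${label}:\n${response}`)
    .join('\n\n');
  const rankingBlock = aggregated
    .map((r, i) =>
      `${i + 1}. ${r.label} (average position ${r.averagePosition.toFixed(2)}, ${r.evaluations} evaluations)`)
    .join('\n');
  return [
    'You are the Chairman of an LLM council.',
    `Original question: ${question}`,
    'Council responses:',
    responsesBlock,
    'Aggregated peer ranking (best first):',
    rankingBlock,
    'Identify where the responses agree and disagree, then write one clear final answer.',
  ].join('\n\n');
}
```

Keeping the ranking in the prompt as ordered text lets the chairman weigh higher-ranked responses more heavily while still reading every response in full.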
Workflow Overview
Section | Node / Logic | Purpose
1 | Chat Trigger | Receive user question
2 | LLM Chains | Generate independent responses
3 | Set Nodes | Anonymize outputs
4 | Evaluation Chains | Peer review & ranking
5 | Code Node | Aggregate rankings
6 | Chairman LLM | Final synthesized answer
Key Benefits
Multi-model intelligence: avoids reliance on a single LLM
Reduced bias: anonymized peer evaluation
Quality-driven selection: ranking-based consensus
Modular architecture: easy to add or replace models
Language-flexible: input and output languages configurable
Production-ready logic: clear stages, deterministic ranking
Ideal Use Cases
High-stakes decision support
Complex technical or architectural questions
Strategy and research synthesis
AI assistants requiring higher trust and reliability
Comparing and selecting the best LLM-generated answers