Generate consensus-based answers using Claude, GPT, Grok and Gemini
The original LLM Council concept was introduced by Andrej Karpathy and published as an open-source repository demonstrating multi-model consensus and ranking. This workflow is my adaptation of that original idea, reimplemented and structured as a production-ready n8n template. Original repository - https://github.com/karpathy/llm-council
This n8n template implements the LLM Council pattern: a single user question is processed in parallel by multiple large language models, independently evaluated by peer models, and then synthesized into one high-quality, consensus-driven final answer. It is designed for use cases where answer quality, balance, and reduced single-model bias are critical.
Section 1: Trigger & Input
When Chat Message Received (Chat Trigger)
Purpose: Receives a user's message and starts the workflow.
How it works:
A user sends a chat message
The message is stored as the Original Question
The same input is forwarded simultaneously to multiple LLM pipelines
Why it matters: Provides a clean, unified entry point for all downstream multi-model logic.
Section 2: Stage 1 - Parallel LLM Responses
Basic LLM Chains (x4)
Models used:
Anthropic Claude
OpenAI GPT
xAI Grok
Google Gemini
Purpose: Each model independently generates its own response to the same question.
Key characteristics:
Identical prompt structure for all models
Independent reasoning paths
No shared context between models
Why it matters: Produces diverse perspectives, reasoning styles, and solution approaches.
Section 3: Stage 2 - Response Anonymization
Set Nodes (Response A / B / C / D)
Purpose: Stores model outputs under anonymized labels:
Response A
Response B
Response C
Response D
Why it matters: Prevents evaluator models from knowing which LLM authored which response, reducing bias during evaluation.
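The workflow does this step with Set nodes. The same idea can be sketched in plain JavaScript as below. The function name `anonymize`, the `{ model, text }` input shape, and the `labelToModel` lookup are assumptions made for illustration and are not taken from the template.

```javascript
// Minimal sketch: replace model names with neutral labels (Response A, B, ...).
// `outputs` is assumed to be an array of { model, text } in a fixed order.
function anonymize(outputs) {
  const labelToModel = {}; // kept private so results can be de-anonymized later
  const responses = outputs.map((output, index) => {
    const label = `Response ${String.fromCharCode(65 + index)}`; // 65 is "A"
    labelToModel[label] = output.model;
    // Only the label and the text are passed on to the evaluators.
    return { label, text: output.text };
  });
  return { responses, labelToModel };
}
```

Evaluators would receive only `responses`; `labelToModel` stays out of every evaluation prompt.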
Section 4: Stage 3 - Peer Evaluation & Ranking
Evaluation Chains (Claude / GPT / Grok / Gemini)
Purpose: Each model acts as a reviewer and:
Analyzes all four anonymized responses
Describes strengths and weaknesses of each
Produces a strict FINAL RANKING from best to worst
Ranking format (strict):
FINAL RANKING: Response B Response A Response D Response C
Why it matters: Creates multiple independent quality assessments from different model perspectives.
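A strict ranking format makes the evaluator output easy to parse. The sketch below is one possible parser, not the template's actual code; the name `parseRanking` is hypothetical. It reads only the text after the last `FINAL RANKING:` marker, so labels mentioned in the reviewer's earlier analysis are ignored.

```javascript
// Minimal sketch: extract the ordered labels that follow "FINAL RANKING:".
// Returns an array such as ["Response B", "Response A", ...], best first.
function parseRanking(evaluationText) {
  const marker = 'FINAL RANKING:';
  const start = evaluationText.lastIndexOf(marker);
  if (start === -1) return []; // evaluator did not follow the format
  const tail = evaluationText.slice(start + marker.length);
  const labels = tail.match(/Response [A-Z]/g) || [];
  // Remove repeated labels while keeping first-seen order.
  return [...new Set(labels)];
}
```

Because the parser matches labels rather than a fixed layout, it accepts the ranking on one line or as a numbered list.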
Section 5: Stage 4 - Ranking Aggregation
Code Node (JavaScript)
Purpose: Aggregates all peer rankings by:
Parsing ranking positions
Calculating average position per response
Counting evaluation occurrences
Sorting responses by best average score
Output includes:
Aggregated rankings
Best response label
Best average score
Why it matters: Transforms subjective rankings into a structured, quantitative consensus.
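The aggregation steps above can be sketched as follows. This is an illustration under stated assumptions, not the template's Code node: the function name `aggregateRankings` and the output field names are hypothetical, and the input is assumed to be already parsed into arrays of labels, best first.

```javascript
// Minimal sketch: average each response's position across all evaluators.
// `rankings` is assumed to be an array of label arrays, best first.
function aggregateRankings(rankings) {
  const totals = new Map();
  for (const ranking of rankings) {
    ranking.forEach((label, index) => {
      const entry = totals.get(label) || { sum: 0, count: 0 };
      entry.sum += index + 1; // positions are 1-based
      entry.count += 1;       // how many evaluators ranked this response
      totals.set(label, entry);
    });
  }
  const aggregated = [...totals.entries()]
    .map(([label, { sum, count }]) => ({
      label,
      averagePosition: sum / count,
      evaluations: count,
    }))
    .sort((a, b) => a.averagePosition - b.averagePosition); // lower is better
  const best = aggregated[0] || null;
  return {
    aggregated,
    bestResponse: best ? best.label : null,
    bestAverageScore: best ? best.averagePosition : null,
  };
}
```

A lower average position means a better consensus rank, so the first entry after sorting is the council's preferred response.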
Section 6: Stage 5 - Final Consensus Answer
Chairman LLM Chain
Purpose: One model acts as the Council Chairman and:
Reviews all original responses
Considers peer rankings and aggregated scores
Identifies consensus patterns and disagreements
Produces a single, clear, high-quality final answer
Why it matters: Delivers a refined response that reflects collective model intelligence rather than a simple average.
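To show how the Chairman's inputs might be combined, here is a small prompt builder. The prompt wording, the function name `buildChairmanPrompt`, and the input shapes are all assumptions for illustration; the template's actual Chairman prompt may differ.

```javascript
// Minimal sketch: assemble one prompt from the question, the anonymized
// responses, and the aggregated ranking. All wording here is hypothetical.
function buildChairmanPrompt(question, responses, aggregated) {
  const responseBlock = responses
    .map((r) => `${r.label}:\n${r.text}`)
    .join('\n\n');
  const rankingBlock = aggregated
    .map((r, i) => `${i + 1}. ${r.label} (average position ${r.averagePosition.toFixed(2)})`)
    .join('\n');
  return [
    'You are the Chairman of an LLM Council.',
    `Original Question: ${question}`,
    'Council responses:',
    responseBlock,
    'Aggregated peer ranking (lower average position is better):',
    rankingBlock,
    'Identify where the responses agree and disagree, then write one clear final answer.',
  ].join('\n\n');
}
```

Passing the ranking alongside the full responses lets the Chairman weigh the peer scores without discarding useful points from lower-ranked answers.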
Workflow Overview

| Stage | Node / Logic | Purpose |
| --- | --- | --- |
| 1 | Chat Trigger | Receive user question |
| 2 | LLM Chains | Generate independent responses |
| 3 | Set Nodes | Anonymize outputs |
| 4 | Evaluation Chains | Peer review & ranking |
| 5 | Code Node | Aggregate rankings |
| 6 | Chairman LLM | Final synthesized answer |

Key Benefits
Multi-model intelligence: avoids reliance on a single LLM
Reduced bias: anonymized peer evaluation
Quality-driven selection: ranking-based consensus
Modular architecture: easy to add or replace models
Language-flexible: input and output languages configurable
Production-ready logic: clear stages, deterministic ranking
Ideal Use Cases
High-stakes decision support
Complex technical or architectural questions
Strategy and research synthesis
AI assistants requiring higher trust and reliability
Comparing and selecting the best LLM-generated answers