Generate consensus-based answers using Claude, GPT, Grok and Gemini

The original LLM Council concept was introduced by Andrej Karpathy and published as an open-source repository demonstrating multi-model consensus and ranking. This workflow is my adaptation of that original idea, reimplemented and structured as a production-ready n8n template. Original repository: https://github.com/karpathy/llm-council

This n8n template implements the LLM Council pattern: a single user question is processed in parallel by multiple large language models, independently evaluated by peer models, and then synthesized into one high-quality, consensus-driven final answer. It is designed for use cases where answer quality, balance, and reduced single-model bias are critical.

📌 Section 1: Trigger & Input

⚡ When Chat Message Received (Chat Trigger)

Purpose: Receives a user's message and initiates the entire workflow.

How it works:

A user sends a chat message

The message is stored as the Original Question

The same input is forwarded simultaneously to multiple LLM pipelines

Why it matters: Provides a clean, unified entry point for all downstream multi-model logic.

📌 Section 2: Stage 1 – Parallel LLM Responses

🤖 Basic LLM Chains (x4)

Models used:

Anthropic Claude

OpenAI GPT

xAI Grok

Google Gemini

Purpose: Each model independently generates its own response to the same question.

Key characteristics:

Identical prompt structure for all models

Independent reasoning paths

No shared context between models

Why it matters: Produces diverse perspectives, reasoning styles, and solution approaches.
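The fan-out step can be sketched in JavaScript, the language of the workflow's Code node. The `buildPrompt` helper and the model list here are illustrative assumptions, not the template's actual node configuration:

```javascript
// Illustrative sketch: one identical prompt fanned out to four independent chains.
// buildPrompt and the model identifiers are hypothetical, not the template's config.
function buildPrompt(question) {
  return `Answer the following question as accurately and completely as you can.\n\nQuestion: ${question}`;
}

const models = ["claude", "gpt", "grok", "gemini"]; // four independent LLM chains

function fanOut(question) {
  // Every model receives the same prompt and no shared context.
  return models.map((model) => ({ model, prompt: buildPrompt(question) }));
}
```

Because each chain only sees `{ model, prompt }`, no model can condition on another model's output at this stage.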

📌 Section 3: Stage 2 – Response Anonymization

🧾 Set Nodes (Response A / B / C / D)

Purpose: Stores model outputs in an anonymized format:

Response A

Response B

Response C

Response D

Why it matters: Prevents evaluator models from knowing which LLM authored which response, reducing bias during evaluation.
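The relabeling done by the Set nodes can be sketched as a single mapping function; the object shapes and the `anonymize` name are assumptions for illustration:

```javascript
// Minimal sketch of the anonymization step performed by the Set nodes.
// The input/output shapes and function name are hypothetical.
function anonymize(outputs) {
  // outputs: { claude: "...", gpt: "...", grok: "...", gemini: "..." }
  const labels = ["Response A", "Response B", "Response C", "Response D"];
  const anonymized = {}; // what the evaluator models see
  const mapping = {};    // kept out of the evaluators' context
  Object.keys(outputs).forEach((model, i) => {
    anonymized[labels[i]] = outputs[model];
    mapping[labels[i]] = model;
  });
  return { anonymized, mapping };
}
```

Only `anonymized` is forwarded to the evaluation chains; the `mapping` is retained separately so authorship can be recovered after ranking if needed.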

📌 Section 4: Stage 3 – Peer Evaluation & Ranking

📊 Evaluation Chains (Claude / GPT / Grok / Gemini)

Purpose: Each model acts as a reviewer and:

Analyzes all four anonymized responses

Describes strengths and weaknesses of each

Produces a strict FINAL RANKING from best to worst

Ranking format (strict):

FINAL RANKING:
Response B
Response A
Response D
Response C

Why it matters: Creates multiple independent quality assessments from different model perspectives.
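Because the ranking format is strict, it can be extracted from each evaluator's free-text reply with a simple regular expression. This is a sketch under that assumption; the `parseRanking` name is illustrative:

```javascript
// Sketch: pull the strict "FINAL RANKING" block out of an evaluator's reply.
// Tolerates the labels being on one line or on separate lines.
function parseRanking(text) {
  const match = text.match(/FINAL RANKING:\s*((?:Response [A-D]\s*)+)/);
  if (!match) return null; // evaluator did not follow the strict format
  return match[1].match(/Response [A-D]/g); // ordered best to worst
}
```

Returning `null` on a malformed reply lets the aggregation step skip that evaluator instead of corrupting the consensus.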

📌 Section 5: Stage 4 – Ranking Aggregation

🧮 Code Node (JavaScript)

Purpose: Aggregates all peer rankings by:

Parsing ranking positions

Calculating average position per response

Counting evaluation occurrences

Sorting responses by best average score

Output includes:

Aggregated rankings

Best response label

Best average score

Why it matters: Transforms subjective rankings into a structured, quantitative consensus.
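The aggregation described above can be sketched as follows; the data shape and the `aggregate` name are assumptions, not the template's exact Code-node source:

```javascript
// Sketch of the aggregation logic: average each response's rank position
// across all evaluators, count occurrences, and sort best-first.
// Input shape is hypothetical: an array of ordered label lists.
function aggregate(rankings) {
  const stats = {};
  for (const ranking of rankings) {
    ranking.forEach((label, index) => {
      if (!stats[label]) stats[label] = { positions: 0, count: 0 };
      stats[label].positions += index + 1; // rank positions start at 1
      stats[label].count += 1;             // evaluation occurrences
    });
  }
  const aggregated = Object.entries(stats)
    .map(([label, s]) => ({ label, count: s.count, avg: s.positions / s.count }))
    .sort((a, b) => a.avg - b.avg); // lower average position = better
  return { aggregated, best: aggregated[0].label, bestScore: aggregated[0].avg };
}
```

For example, if two evaluators both rank Response B first, its average position is 1.0 and it becomes the consensus best regardless of how the middle positions shuffle.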

📌 Section 6: Stage 5 – Final Consensus Answer

🧠 Chairman LLM Chain

Purpose: One model acts as the Council Chairman and:

Reviews all original responses

Considers peer rankings and aggregated scores

Identifies consensus patterns and disagreements

Produces a single, clear, high-quality final answer

Why it matters: Delivers a refined response that reflects collective model intelligence rather than a simple average.
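One plausible way to assemble the Chairman's context from the earlier stages is sketched below; the function name, wording, and input shapes are all hypothetical:

```javascript
// Hypothetical sketch: compose the Chairman prompt from the original question,
// the anonymized responses, and the aggregated consensus ranking.
function buildChairmanPrompt(question, responses, aggregated) {
  const responseText = Object.entries(responses)
    .map(([label, text]) => `${label}:\n${text}`)
    .join("\n\n");
  const rankingText = aggregated
    .map((r, i) => `${i + 1}. ${r.label} (average position ${r.avg})`)
    .join("\n");
  return [
    `Original question: ${question}`,
    `Anonymized responses:\n${responseText}`,
    `Consensus ranking:\n${rankingText}`,
    "Synthesize one final answer that reflects the consensus and resolves disagreements.",
  ].join("\n\n");
}
```

Keeping the responses anonymized even at this stage means the Chairman weighs arguments by their ranked quality, not by which vendor produced them.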

📊 Workflow Overview

Stage 1 – Chat Trigger: Receive user question
Stage 2 – LLM Chains: Generate independent responses
Stage 3 – Set Nodes: Anonymize outputs
Stage 4 – Evaluation Chains: Peer review & ranking
Stage 5 – Code Node: Aggregate rankings
Stage 6 – Chairman LLM: Final synthesized answer

🎯 Key Benefits

🧠 Multi-model intelligence – avoids reliance on a single LLM

⚖️ Reduced bias – anonymized peer evaluation

📊 Quality-driven selection – ranking-based consensus

🔁 Modular architecture – easy to add or replace models

🌍 Language-flexible – input and output languages configurable

🧩 Production-ready logic – clear stages, deterministic ranking

🚀 Ideal Use Cases

High-stakes decision support

Complex technical or architectural questions

Strategy and research synthesis

AI assistants requiring higher trust and reliability

Comparing and selecting the best LLM-generated answers

Complexity: intermediate
Author: Yehor EGMS
Created: 2/13/2026
Updated: 3/11/2026
