by Kumar Shivam
The AI-Powered Shopify SEO Content Automation is an enterprise-grade workflow that transforms product content creation for e-commerce stores. This multi-agent system integrates GPT-4o, Claude Sonnet 4, Claude 3.5, Perplexity AI, and Haloscan keyword research to generate SEO-optimized product descriptions, metafields, and meta descriptions with zero manual intervention and built-in cannibalization prevention. To see the demo, connect via my profile.

💡 Key Advantages

🎯 Multi-Agent AI Orchestration
A central Orchestrator manages complex workflows with specialized agents for descriptions, metafields, and SEO, each optimized for a specific content type.

🔍 Advanced Keyword Research & Cannibalization Prevention
Integrates the Haloscan API for premium keyword discovery and SERP overlap analysis to prevent keyword cannibalization across your product catalog.

📊 Enterprise SEO Optimization
Specialized for e-commerce with semantic alignment, TF-IDF optimization, and compliance with industry regulations and best practices.

🧠 Intelligent Content Strategy
Perplexity AI provides market intelligence, search intent analysis, and trending keyword discovery for data-driven content decisions.

🏗️ Comprehensive Content Generation
Creates product descriptions, 6 specialized metafields, SEO meta descriptions, and rich text formatting for complete Shopify integration.

📋 Automated Workflow Management
Airtable integration tracks content creation status, manages keyword databases, and provides centralized workflow control.

⚙️ How It Works

Content Type Selection
A form-based trigger lets you select the content type: create_product_description, create_product_meta, or create_product_seo.

Product Data Collection
Retrieves comprehensive product information from Shopify and Airtable, including titles, descriptions, handles, and vendor details.
Premium Keyword Discovery
- Haloscan API analyzes product titles for keyword opportunities
- Extracts search metrics, competitor keywords, and SERP data
- Perplexity provides market intelligence and search intent analysis

SEO Compliance Checking
- Performs SERP overlap analysis to identify existing rankings
- Filters keywords to prevent cannibalization
- Updates Airtable with curated keyword lists
- Generates actionable SEO content strategies

Multi-Agent Content Generation

Product Description Agent (Claude Sonnet 4):
- Generates SEO-optimized product descriptions with verified facts
- Implements strict HTML structure with proper heading hierarchy
- Ensures compliance with e-commerce regulations and best practices

Meta Fields Agent (Claude Sonnet 4):
- Creates 6 specialized metafields: ingredients, recommendations, nutritional values, warnings, short descriptions, and client arguments
- Enforces strict formatting rules and regulatory compliance
- Generates clean HTML compatible with Shopify themes

SEO Fields Agent (Claude Sonnet 4):
- Produces optimized meta descriptions for search engines
- Integrates keyword research data for maximum organic visibility
- Applies current-year SEO best practices and anti-keyword-stuffing techniques

Shopify Integration & Updates
- Updates product descriptions via the Shopify API
- Uploads metafields using GraphQL mutations
- Converts HTML to Shopify Rich Text format
- Tracks completion status in Airtable

🛠️ Setup Steps

Core Integrations
- Shopify Access Token – for product data retrieval and content updates
- OpenRouter API – for GPT-4o and Claude model access
- Haloscan API – for keyword research and SERP analysis
- Perplexity API – for market intelligence and content strategy
- Airtable OAuth – for workflow management and keyword tracking

Agent Configuration
- Orchestrator Agent – central workflow management with routing logic
- Product Description Agent – SEO content generation with fact verification
- Meta Fields Agent – structured metafield creation with compliance rules
- SEO Fields Agent – meta description optimization with keyword integration
- Premium Keyword Discovery – automated keyword research and analysis
- SEO Compliance Checker – cannibalization prevention and strategy generation

Workflow Tools
- MCP Server Integration – Airtable data management
- HTTP Request Tools – Haloscan API communication
- Structured Output Parsers – data validation and formatting
- Memory Buffer Windows – conversation context management
- Rich Text Converters – Shopify-compatible content formatting

🎯 Workflow Capabilities

Product Description Generation
- Length control: 150-300 words with hard limits
- SEO structure: optimized heading hierarchy and keyword placement
- Fact verification: zero-invention policy with source validation
- Brand compliance: controlled brand mentions and positioning

Metafield Creation
- 6 specialized fields: arguments, ingredients, recommendations, nutrition, warnings, descriptions
- HTML formatting: clean structure with allowed tags only
- Regulatory compliance: industry-specific warnings and disclaimers
- Dynamic content: adapts to different product categories automatically

Advanced SEO Features
- Keyword research: automated discovery with search volume analysis
- Cannibalization prevention: SERP overlap detection and filtering
- Meta optimization: character-limited descriptions with CTR focus
- Content strategy: AI-generated SEO roadmaps based on market data

🔐 Credentials Required
- Shopify Access Token – product management and content publishing
- OpenRouter API Key – multi-model AI access (GPT-4o, Claude variants)
- Haloscan API Key – keyword research and SERP analysis
- Perplexity API Key – market intelligence and content strategy
- Airtable OAuth – database management and workflow tracking

👤 Ideal For
- E-commerce teams scaling content creation across hundreds of products
- SEO specialists implementing advanced cannibalization prevention strategies
- Shopify store owners seeking enterprise-level content automation
- Marketing agencies building scalable, multi-client SEO workflows
- Product managers requiring compliance-focused content generation

💬 Advanced Features

Multi-Language Ready
The workflow architecture supports easy extension to multiple markets and languages with minimal configuration changes.

Compliance Framework
Built-in regulatory compliance checking ensures content meets industry standards and legal requirements.

Scalable Architecture
The modular design allows adding new content types, AI models, or integration points without workflow restructuring.

Error Handling & Retries
Comprehensive error management with automatic retries and fallback mechanisms ensures reliable content generation.

💡 Pro Tip: This workflow is a complete SEO content factory that can process hundreds of products daily while maintaining quality, compliance, and search engine optimization standards.
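The content-type selection that drives the Orchestrator can be pictured as a simple routing step. This is a hypothetical sketch of the routing logic (the function and map names are illustrative, not the actual node configuration):

```javascript
// Hypothetical sketch of the Orchestrator's routing logic:
// map the content type chosen in the form trigger to the agent
// that should handle it.
const AGENT_BY_TYPE = {
  create_product_description: 'Product Description Agent',
  create_product_meta: 'Meta Fields Agent',
  create_product_seo: 'SEO Fields Agent',
};

function routeContentType(contentType) {
  const agent = AGENT_BY_TYPE[contentType];
  if (!agent) {
    throw new Error(`Unknown content type: ${contentType}`);
  }
  return agent;
}

console.log(routeContentType('create_product_meta')); // Meta Fields Agent
```

In the real workflow this dispatch happens inside the Orchestrator agent; the sketch only shows why the three `create_product_*` values must match exactly.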
by Yuvraj Singh
Purpose

This solution lets you manage all your Notion and Todoist tasks from different workspaces, as well as your calendar events, in a single place. It is a two-way sync with partial support for recurring tasks.

How it works

The realtime sync consists of two workflows, both triggered by a registered webhook from either Notion or Todoist. To avoid overwrites by late-arriving webhook calls, the current task is retrieved from both sides every time. Redis is used to prevent endless loops, since an update in one system triggers another webhook call in turn: using the ID of the task, the trigger is locked for 80 seconds. Depending on the detected changes, the other side is updated accordingly. Generally, Notion is treated as the main source. An "Obsolete" status guarantees that tasks are never deleted entirely by accident. The Todoist ID is stored in the Notion task, so the two stay linked together.

An additional full-sync workflow runs daily and fixes any inconsistencies, since webhooks cannot be trusted entirely. Since Todoist requires a more complex setup, a tiny workflow helps with activating the webhook. Another tiny workflow helps generate a global config, which is used by all workflows for mapping purposes.

Mapping (Notion >> Todoist)
- Name: Task Name
- Priority: Priority (1: do first, 2: urgent, 3: important, 4: unset)
- Due: Date
- Status: Section (Done: completed, Obsolete: deleted)
- <page_link>: Description (read-only)
- Todoist ID: <task_id>

Current limitations
- Changes to the same task cannot be made simultaneously in both systems within a 15-20 second time frame.
- Subtasks are not linked automatically to their parent yet.
- Task names do not support URLs yet.

Credentials

Follow the videos to set up credentials for Notion (access token), Todoist (access token), and Redis.

Todoist: Follow this video to obtain the Todoist API token. Todoist Credentials.mp4
Notion: Follow this video to get the Notion integration secret.
Redis: Follow this video to set up Redis.

Setup

The setup involves quite a lot of steps, yet many of them can be automated for internal business purposes. Just follow the video or do the following steps:
- Set up credentials for Notion (access token), Todoist (access token), and Redis - you can also create empty credentials and populate them later during setup.
- Clone this workflow by clicking the "Use workflow" button and then choosing your n8n instance - otherwise you need to map the credentials of many nodes.
- Follow the instructions described in the bundle of sticky notes at the top left of the workflow.

How to use

You can apply changes (create, update, delete) to tasks in both Notion and Todoist, which then get synced over within a couple of seconds (handled by the differential realtime sync). The daily full sync resolves possible discrepancies in Todoist.

This workflow incorporates ideas and techniques inspired by Mario (https://n8n.io/creators/octionic/), whose expertise with specific nodes helped shape parts of this automation. Significant enhancements and customizations have been made to deliver a unique and improved solution.
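The 80-second loop-prevention lock can be sketched as follows. This is a minimal in-memory illustration of the idea only; the actual workflow keeps the lock in Redis (an atomic SET with NX and an expiry) so both sync workflows share it, and the function names here are hypothetical:

```javascript
// In-memory sketch of the loop-prevention lock.
// The real workflow stores the lock in Redis so every execution sees it;
// this Map only illustrates the logic.
const locks = new Map(); // taskId -> expiry timestamp (ms)

const LOCK_SECONDS = 80;

// Returns true if the lock was acquired (i.e. this change should be
// propagated), false if the task is still locked (an echo of our own
// recent update, which must be ignored to avoid an endless loop).
function tryAcquireLock(taskId, now = Date.now()) {
  const expiry = locks.get(taskId);
  if (expiry !== undefined && expiry > now) {
    return false; // webhook fired by our own update: skip it
  }
  locks.set(taskId, now + LOCK_SECONDS * 1000);
  return true;
}
```

This is why simultaneous edits in both systems within the lock window are listed as a limitation: the second change arrives while the task ID is still locked.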
by moosa
This workflow monitors product prices from BooksToScrape and sends alerts to a Discord channel via webhook when a competitor's price is lower than ours.

🧩 Nodes Used
- Schedule (for daily or other required schedules)
- If nodes (to check whether checked or unchecked data exists)
- HTTP Request (for fetching the product page)
- Extract HTML (for extracting the product price)
- Code (to clean and extract just the price number)
- Discord Webhook (to send Discord alerts)
- Sheets (extract and update)

🚀 How to Use
- Replace the Discord webhook URL with your own.
- Customize the scraping URL if you're monitoring a different site. (See the sheet I used.)
- Run the workflow manually or on a schedule.

⚠️ Important
- Do not use this for commercial scraping without permission.
- Ensure the site allows scraping (this example is for learning only).
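The Code node's price cleanup can be as simple as stripping everything but digits and the decimal point. A sketch, assuming the scraped price looks like BooksToScrape's "£51.77" format (the function name is illustrative):

```javascript
// Sketch of the Code node: turn a scraped price string like "£51.77"
// into a plain number so it can be compared against our own price.
function extractPrice(raw) {
  const cleaned = String(raw).replace(/[^0-9.]/g, '');
  const price = parseFloat(cleaned);
  if (Number.isNaN(price)) {
    throw new Error(`Could not parse price from: ${raw}`);
  }
  return price;
}

console.log(extractPrice('£51.77')); // 51.77
```

With the price as a number, the If node can compare it directly against the value read from the sheet.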
by Rajeet Nair
Overview

This workflow automates CSV data processing from upload to database insertion. It accepts CSV files via webhook, uses AI to detect the schema and standardize columns, cleans and validates the data, and stores it in Postgres. Errors are logged separately, and notifications are sent for visibility.

How It Works
- CSV Upload: A webhook receives CSV files for processing.
- Validation: The workflow checks whether the uploaded file is a valid CSV. Invalid files are rejected with an error report.
- Data Extraction: The CSV is parsed into structured rows for further processing.
- Schema Detection: AI analyzes the data to infer column types, normalize column names, and detect inconsistencies.
- Data Normalization: Values are cleaned and converted into proper formats (numbers, dates, booleans), with optional unit standardization.
- Data Quality Validation: The workflow checks for type mismatches, missing values, and statistical outliers.
- Conditional Processing: Clean data is prepared and inserted into Postgres; errors produce a detailed report.
- Database Insert: Valid data is stored in the configured Postgres table.
- Error Logging: Errors are logged to Google Sheets for tracking and debugging.
- Notifications: A Slack message is sent with the processing results.

Setup Instructions
- Configure the webhook endpoint for CSV uploads
- Set your Postgres table name in the configuration node
- Add Anthropic/OpenAI credentials for schema detection
- Connect Slack for notifications
- Connect Google Sheets for error logging
- Configure error threshold settings
- Test with sample CSV files
- Activate the workflow

Use Cases
- Cleaning and standardizing messy CSV data
- Automating ETL pipelines
- Preparing data for analytics or dashboards
- Validating incoming data before database storage
- Monitoring data quality with error reporting

Requirements
- n8n instance with webhook access
- Postgres database
- OpenAI or Anthropic API access
- Slack workspace
- Google Sheets account

Notes
- You can customize schema rules and normalization logic in the Code node.
- Adjust error thresholds based on your data tolerance.
- Extend validation rules for domain-specific requirements.
- Replace Postgres or Sheets with other storage systems if needed.
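The normalization step can be sketched as a small coercion helper. This is a hypothetical example of what the Code node might do; in the workflow the target type per column comes from the AI schema-detection step:

```javascript
// Hypothetical sketch of the normalization Code node:
// coerce a raw CSV string into the type the detected schema expects.
// Returning null marks a value the quality-validation step should flag.
function normalizeValue(raw, type) {
  const s = String(raw).trim();
  switch (type) {
    case 'number': {
      const n = parseFloat(s.replace(/,/g, '')); // strip thousands separators
      return Number.isNaN(n) ? null : n;
    }
    case 'boolean':
      if (/^(true|yes|1)$/i.test(s)) return true;
      if (/^(false|no|0)$/i.test(s)) return false;
      return null;
    case 'date': {
      const d = new Date(s);
      return isNaN(d.getTime()) ? null : d.toISOString().slice(0, 10);
    }
    default:
      return s; // keep text columns as-is
  }
}
```

Counting the nulls per row is one simple way to implement the error-threshold check mentioned in the setup instructions.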
by Firecrawl
What this does

Receives a URL via webhook, uses Firecrawl to scrape the page into clean markdown, and stores it as vector embeddings in Supabase pgvector. A visual, self-hosted ingestion pipeline for RAG knowledge bases: adding a new source is as simple as sending a URL. The second part of the workflow exposes a chat interface where an AI Agent queries the stored knowledge base to answer questions, with Cohere reranking for better retrieval quality.

How it works

Part 1: Ingestion Pipeline
- Webhook receives a POST request with a url field
- Verify URL validates and normalizes the domain
- Supabase checks if the URL was already ingested (deduplication)
- If the URL already exists, ingestion is skipped; otherwise it continues
- Firecrawl fetches the page and converts it to clean markdown
- OpenAI generates vector embeddings from the scraped content
- Default Data Loader attaches the source URL as metadata
- Supabase Vector Store inserts the content and embeddings into pgvector
- Respond to Webhook confirms how many items were added

Part 2: RAG Chat Agent
- Chat trigger receives a user question
- AI Agent (OpenRouter) queries the Supabase vector store filtered by URL
- Cohere Reranker improves retrieval quality before the agent responds
- The agent answers based solely on the ingested knowledge base

Requirements
- Firecrawl API key
- OpenAI API key (for embeddings)
- OpenRouter API key (for the chat agent)
- Cohere API key (for reranking)
- Supabase project with pgvector enabled

Setup

Create a Supabase project and run the following SQL in the SQL editor:

```sql
-- Enable the pgvector extension
create extension vector with schema extensions;

-- Create a table to store documents
create table documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding extensions.vector(1536)
);

-- Create a function to search for documents
create function match_documents (
  query_embedding extensions.vector(1536),
  match_count int default null,
  filter jsonb default '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```

Then:
- Add your Firecrawl API key as a credential in n8n
- Add your OpenAI API key as a credential (for embeddings)
- Add your OpenRouter API key as a credential (for the chat agent)
- Add your Cohere API key as a credential (for reranking)
- Activate the workflow

How to use

Send a POST request to the webhook URL:

```shell
curl -X POST https://your-n8n-instance/webhook/your-id \
  -H "Content-Type: application/json" \
  -d '{"url": "https://firecrawl.dev/docs"}'
```

Then open the chat interface in n8n to ask questions about the ingested content.
by MANISH KUMAR
Shopify Product-to-Blog Automation with Perplexity web search, Gemini AI Agent & Google Sheets

Shopify Blog Automation (from Shopify product to SEO-optimized blog post, fully automated)

This workflow is an advanced n8n-powered automation that transforms newly created Shopify products into professionally written, SEO-ready blog posts using AI. By combining Shopify webhooks, Google Sheets, AI research, structured content generation, and automated HTML formatting, this workflow removes all manual work from product-based content marketing.

💡 Key Advantages

🛍️ Shopify Product Sync
Automatically captures new product data (title, description, vendor, type, images) the moment a product is created.

🧠 AI-Powered Research & Writing
Uses AI to perform market analysis, identify customer intent, and generate structured, high-quality blog content.

📊 Google Sheets Tracking
Maintains a clear audit trail of products, generated blogs, and publishing status to prevent duplication.

🧩 Structured Content Output
Generates strict JSON-based blog sections (problem, solution, features, usage, comparison, CTA) for consistency and scalability.

📤 End-to-End Automation
Handles everything from product detection to blog publishing, with zero manual writing.

⚙️ How It Works

1. Shopify Trigger – listens for products/create events in Shopify.
2. Product Data Extraction – normalizes product fields and selects the primary product image.
3. Google Sheets Storage – stores raw product data and sets the initial processing status.
4. AI Market & SEO Research – analyzes product intent, audience, use cases, FAQs, and keyword opportunities.
5. AI Blog Content Generation – creates structured, SEO-focused blog content using a LangChain AI agent.
6. HTML Structuring – cleans, escapes, and formats content into Shopify-safe, responsive HTML.
7. Shopify Blog Publishing – automatically posts the article to the Shopify blog via the Admin API.
8. Status Update & Tracking – updates Google Sheets to reflect successful blog publication.

🛠️ Setup Steps

Configure the following nodes:
- Shopify Trigger – detect new product creation
- Set node – normalize Shopify product fields
- Google Sheets nodes – store and track workflow state
- AI Research node – market & SEO analysis
- LangChain / Gemini agent – blog content generation
- Code node – HTML formatting and safety handling
- HTTP Request node – publish the blog post to Shopify
- Error handling logic – retry and fail-safe routing

🔐 Credentials Required
- Shopify Admin API Access Token – for blog publishing
- Google Sheets OAuth – for data tracking
- Google Gemini API Key – for AI content generation
- Perplexity API Key – for research and SEO insights

👤 Ideal For
- Shopify store owners using content marketing
- E-commerce teams managing large product catalogs
- SEO teams scaling product-driven blog content
- Agencies offering automated Shopify SEO solutions

💬 Bonus Tip

This workflow is fully modular and extensible. You can easily enhance it to:
- Auto-link blogs to products
- Generate multilingual blog posts
- Schedule delayed publishing
- Route content by product category
- Add internal linking or schema markup

All extensions can be implemented within the same n8n workflow.

✅ Result

Every new Shopify product automatically becomes:
- Research-backed
- SEO-optimized
- Professionally structured
- Automatically published

No manual writing. No copy-paste. Fully automated.

Keywords: shopify ai, shopify automation, shopify marketing automation, shopify blog automation, shopify content automation, ai blog generator shopify, shopify seo automation, ecommerce automation, ai ecommerce automation, shopify workflow automation, shopify product to blog, auto generate shopify blogs, shopify ai content, how to automate shopify
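The escaping part of the HTML-structuring Code node can be sketched like this. A minimal, hypothetical example (the actual node also handles layout and responsive markup; function names are illustrative):

```javascript
// Minimal sketch of the escaping step in the HTML-structuring Code node:
// escape AI-generated text before interpolating it into HTML, so stray
// characters cannot break the Shopify blog markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render one of the structured JSON blog sections as a safe HTML block.
function renderSection(heading, body) {
  return `<h2>${escapeHtml(heading)}</h2>\n<p>${escapeHtml(body)}</p>`;
}
```

Escaping before assembly is what makes the structured JSON output safe to post directly through the Admin API.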
by Nguyen Thieu Toan
How it works

🧠 AI-Powered News Update Bot for Zalo using Gemini and RSS Feeds

This workflow lets you build a smart Zalo chatbot that automatically summarizes and delivers the latest news using Google Gemini and RSS feeds. It's perfect for keeping users informed with AI-curated updates directly inside Vietnam's most popular messaging app.

🚀 What It Does
- Receives user messages via the Zalo Bot webhook
- Fetches the latest articles from an RSS feed (e.g., AI news)
- Summarizes the content using Google Gemini
- Formats the response and sends it back to the user on Zalo

📱 What Is Zalo?

Zalo is Vietnam's leading instant messaging app, with over 78 million monthly active users, more than 85% of the country's internet-connected population. It handles 2 billion messages per day and is deeply embedded in Vietnamese daily life, making it a powerful channel for communication and automation.

🔧 Setup Instructions

1. Create a Zalo Bot
- Open the Zalo app and search for "Zalo Bot Creator"
- Tap "Create Zalo Bot Account"
- Your bot name must start with "Bot" (e.g., Bot AI News)
- After creation, Zalo will send you a message containing your Bot Token

2. Configure the Webhook
- Replace [your-webhook URL] in Zalo Bot Creator with your n8n webhook URL
- Use the Webhook node in this workflow to receive incoming messages

3. Set Up Gemini
- Add your Gemini API key to the HTTP Request node labeled Summarize AI News
- Customize the prompt if you want a different tone or summary style

4. Customize the RSS Feed
- Replace the default RSS URL with your preferred news source
- You can use any feed that provides timely updates (e.g., tech, finance, health)

🧪 Example Interaction

User: "What's new today?"
Bot: "🧠 AI Update: Google launches Gemini 2 with multimodal capabilities, revolutionizing how models understand text, image, and code..."
⚠️ Notes
- Zalo Bots currently do not support images, voice, or file attachments
- Make sure your Gemini API key has access to the model you're calling
- RSS feeds should be publicly accessible and well-formatted

🧩 Nodes Used
- Webhook
- HTTP Request (Gemini)
- RSS Feed Read
- Set & Format
- Zalo Message Sender (via API)

💡 Tips
- You can swap Gemini with GPT-4 or Claude by adjusting the API call
- Add filters to the RSS node to only include articles with specific keywords
- Use the Function node to personalize responses based on user history

Built by Nguyen Thieu Toan (Nguyễn Thiệu Toàn) (https://nguyenthieutoan.com). Read more about this workflow in Vietnamese: https://nguyenthieutoan.com/share-workflow-n8n-zalo-bot-cap-nhat-tin-tuc/
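The keyword-filtering tip can be sketched in a Function node like this. A hypothetical example that assumes each RSS item carries a title and a contentSnippet field (adjust the field names to whatever your feed reader actually outputs):

```javascript
// Hypothetical Function-node sketch: keep only RSS items whose title
// or snippet mentions one of the keywords we care about, so Gemini
// only summarizes relevant news.
const KEYWORDS = ['gemini', 'gpt', 'claude']; // adjust to taste

function filterArticles(articles, keywords = KEYWORDS) {
  return articles.filter((a) => {
    const text = `${a.title} ${a.contentSnippet || ''}`.toLowerCase();
    return keywords.some((kw) => text.includes(kw));
  });
}
```

Filtering before the Gemini call also keeps API usage down, since irrelevant articles are never sent for summarization.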
by Ranjan Dailata
Disclaimer

Please note: this workflow is only available on n8n self-hosted, as it makes use of the community node for Decodo web scraping.

This workflow automates intelligent keyword and topic extraction from Google Search results, combining Decodo's advanced scraping engine with the semantic analysis capabilities of OpenAI GPT-4.1-mini. The result is a fully automated keyword enrichment pipeline that gathers, analyzes, and stores SEO-relevant insights.

Who this is for
- SEO professionals who want to extract high-value keywords from competitors
- Digital marketers aiming to automate topic discovery and keyword clustering
- Content strategists building data-driven content calendars
- AI automation engineers designing scalable web intelligence and enrichment pipelines
- Growth teams performing market and search intent research with minimal effort

What problem this workflow solves

Manual keyword research is time-consuming and often incomplete. Traditional keyword tools only provide surface-level data and fail to uncover contextual topics or semantic relationships hidden in search results. This workflow solves that by:
- Automatically scraping live Google Search results for any keyword
- Extracting meaningful topics, related terms, and entities using AI
- Enriching your keyword list with semantic intelligence to improve SEO and content planning
- Storing structured results directly in n8n Data Tables for trend tracking or export

What this workflow does
1. Set the Input Fields – define your search query and target geo (e.g., "Pizza" in "India").
2. Decodo Google Search – fetches organic search results using Decodo's web scraping API.
3. Return Organic Results – extracts the list of organic results and passes them downstream.
4. Loop Over Each Result – iterates through every search result description.
5. Extract Keywords and Topics – uses OpenAI GPT-4.1-mini to identify relevant keywords, entities, and thematic topics from each snippet.
6. Data Enrichment Logic – checks whether each result already exists in the n8n Data Table (based on URL).
7. Insert or Skip – if a record doesn't exist, inserts the extracted data into the table.
8. Store Results – saves both the enriched search data and Decodo's original response to disk.

End result: a structured and deduplicated dataset containing URLs, keywords, and key topics, ready for SEO tracking or further analytics.

Setup

Prerequisites
- If you are new to Decodo, please sign up at visit.decodo.com
- Make sure to install the n8n community node for Decodo

Import and configure the workflow
- Open n8n and import the JSON template.
- Add your credentials: the Decodo API key under the Decodo credentials account, and the OpenAI API key under the OpenAI account.

Define input parameters

Modify the Set node to define:
- search_query: your keyword or topic (e.g., "AI tools for marketing")
- geo: the target region (e.g., "United States")

Configure output

The workflow writes two outputs:
- Enriched keyword data → stored in an n8n Data Table (DecodoGoogleSearchResults)
- Raw Decodo response → saved locally in JSON format

Execute

Click Execute Workflow, or schedule it for recurring keyword enrichment (e.g., weekly trend tracking).

How to customize this workflow
- Change AI Model – replace gpt-4.1-mini with gemini-1.5-pro or claude-3-opus to test different reasoning strengths.
- Expand the Schema – add extra fields like keyword difficulty, page type, or author info.
- Add Sentiment Analysis – chain a second AI node to assess tone (positive, neutral, or promotional).
- Export to Sheets or DB – replace the Data Table node with Google Sheets, Notion, Airtable, or MySQL connectors.
- Multi-Language Research – pass a locale parameter in the Decodo node to gather insights in specific languages.
- Automate Alerts – add a Slack or Email node to notify your team when high-value topics appear.

Summary

Search & Enrich is a low-code, AI-powered keyword intelligence engine that automates research and enrichment for SEO, content, and digital marketing. By combining Decodo's real-time SERP scraping with OpenAI's contextual understanding, the workflow transforms raw search results into structured, actionable keyword insights. It eliminates repetitive research work, enhances content strategy, and keeps your keyword database continuously enriched, all within n8n.
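The insert-or-skip deduplication step can be sketched as follows. A hypothetical illustration of the logic only: in the workflow the lookup runs against the n8n Data Table, while a Set stands in for it here:

```javascript
// Sketch of the insert-or-skip logic: only insert results whose URL
// is not already present, so the dataset stays deduplicated.
function insertNewResults(existingUrls, results) {
  const seen = new Set(existingUrls);
  const inserted = [];
  for (const r of results) {
    if (seen.has(r.url)) continue; // already enriched: skip
    seen.add(r.url);
    inserted.push(r);
  }
  return inserted;
}
```

Keying the check on the URL (rather than the snippet text) is what keeps re-runs of the same search query from duplicating rows.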
by franck fambou
⚠️ IMPORTANT: This template requires self-hosted n8n due to its use of community nodes (MCP tools). It will not work on n8n Cloud. Make sure you have access to a self-hosted n8n instance before using this template.

Overview

This workflow automation allows a Google Gemini-powered AI Agent to orchestrate multi-source web intelligence using MCP (Model Context Protocol) tools such as Firecrawl, Brave Search, and Apify. Users interact with the agent in natural language; the agent then leverages the external data-collection tools, processes the results, and automatically organizes them into structured spreadsheets. With built-in memory, flexible tool execution, and conversational capabilities, this workflow acts as a multi-agent research assistant, capable of retrieving, synthesizing, and delivering actionable insights in real time.

How the system works

AI Agent + MCP Pipeline
1. User Interaction – a chat message is received and forwarded to the AI Agent.
2. AI Orchestration – the agent, powered by Google Gemini, decides which MCP tools to invoke based on the query:
   - Firecrawl-MCP: recursive web crawling and content extraction
   - Brave-MCP: real-time web search with structured results
   - Apify-MCP: automation of web scraping tasks with scalable execution
3. Memory Management – a memory module stores context across conversations, ensuring multi-turn reasoning and task continuity.
4. Spreadsheet Automation – results are structured in a new, automatically created Google Spreadsheet, enriched with formatting and additional metadata.
5. Data Processing – the workflow generates the spreadsheet content, updates the sheet, and improves results via HTTP requests and field edits.
6. Delivery of Results – users receive a structured, contextualized dataset ready for review, analysis, or integration into other systems.
## Configuration instructions
**Estimated setup time:** 45 minutes

### Prerequisites
- Self-hosted n8n instance (v0.200.0 or higher recommended)
- Google Gemini API key
- MCP-compatible nodes (Firecrawl, Brave, Apify) configured
- Google Sheets credentials for spreadsheet automation

### Detailed configuration steps

**Step 1: Configuring the AI Agent**
- **AI Agent node**:
  - Select Google Gemini as the LLM model
  - Configure your Google Gemini API key in the n8n credentials
  - Set the system prompt to guide the agent's behavior
  - Connect the Simple Memory node to enable context tracking

**Step 2: Integrating MCP tools**
- **Firecrawl-MCP configuration**:
  - Install the @n8n/n8n-nodes-firecrawl-mcp package
  - Configure your Firecrawl API key
  - Set crawling parameters (depth, CSS selectors)
- **Brave-MCP configuration**:
  - Install the @n8n/n8n-nodes-brave-mcp package
  - Add your Brave Search API key
  - Configure search filters (region, language, SafeSearch)
- **Apify-MCP configuration**:
  - Install the @n8n/n8n-nodes-apify-mcp package
  - Configure your Apify credentials
  - Select the appropriate actors for your use cases

**Step 3: Spreadsheet automation**
- **"Create Spreadsheet" node**:
  - Configure Google Sheets authentication (OAuth2 or Service Account)
  - Set the file name with dynamic timestamps
  - Specify the destination folder in Google Drive
- **"Generate Spreadsheet Content" node**:
  - Transform the agent's outputs into tabular format
  - Define the columns: URL, Title, Description, Source, Timestamp
  - Configure data formatting (dates, links, metadata)
- **"Update Spreadsheet" node**:
  - Insert the data into the created sheet
  - Apply automatic formatting (headers, colors, column widths)
  - Add summary formulas if necessary

**Step 4: Post-processing and delivery**
- **"Data Enrichment Request" node** (formerly "HTTP Request1"):
  - Configure optional API calls to enrich the data
  - Add additional metadata (geolocation, sentiment, categorization)
  - Handle errors and timeouts
- **"Edit Fields" node**:
  - Refine the final dataset (metadata, tags, filters)
  - Clean and normalize the data
  - Prepare the final response for the user

## Structure of generated Google Sheets

### Default columns

| Column | Description | Type |
|--------|-------------|------|
| URL | Data source URL | Hyperlink |
| Title | Page/resource title | Text |
| Description | Description or content excerpt | Long text |
| Source | MCP tool used (Brave/Firecrawl/Apify) | Text |
| Timestamp | Date/time of collection | Date/Time |
| Metadata | Additional data (JSON) | Text |

### Automatic formatting
- **Headings**: Bold font, colored background
- **URLs**: Formatted as clickable links
- **Dates**: Standardized ISO 8601 format
- **Columns**: Width automatically adjusted to content

## Use cases
**Business and enterprise**
- Competitive analysis combining search, crawling, and structured scraping
- Market trend research with multi-source aggregation
- Automated reporting pipelines for business intelligence

**Research and academia**
- Literature discovery across multiple sources
- Data collection for research projects
- Automated bibliographic extraction from online sources

**Engineering and development**
- Discovery of APIs and documentation
- Aggregation of product information from multiple platforms
- Scalable structured scraping for datasets

**Personal productivity**
- Automated creation of newsletters or knowledge hubs
- Personal research assistant compiling spreadsheets from various online data

## Key features
**Multi-source intelligence**
- Firecrawl for deep crawling
- Brave for real-time search
- Apify for structured web scraping

**AI-driven orchestration**
- Google Gemini for reasoning and tool selection
- Memory for multi-turn interactions
- Context-based adaptive workflows

**Structured data output**
- Automatic spreadsheet creation
- Data enrichment and formatting
- Ready-to-use datasets for reporting

**Performance and scalability**
- Handles multiple simultaneous tool calls
- Scalable web data extraction
- Real-time aggregation from multiple MCPs

**Security and privacy**
- Secure authentication based on API keys
- Data managed in Google Sheets / n8n
- Configurable retention and deletion policies

## Technical architecture

### Workflow
User query → AI Agent (Gemini) → MCP tools (Firecrawl / Brave / Apify) → Aggregated results → Spreadsheet creation → Data processing → Results delivery

### Supported data types
- **Text and metadata** from crawled web pages
- **Search results** from Brave queries
- **Structured data** from Apify scrapers
- **Tabular reports** via Google Sheets

### Integration options
**Chat interfaces**
- Web widget for conversational queries
- Slack/Teams chatbot integration
- REST API access points

**Data sources**
- Websites (via Firecrawl/Apify)
- Search engines (via Brave)
- APIs (via HTTP Request enrichment)

### Performance specifications
- Query response: < 5 seconds (search tasks)
- Crawl capacity: thousands of pages per run
- Spreadsheet automation: real-time creation and updates
- Accuracy: > 90% when using combined sources

## Advanced configuration options
**Customization**
- Set custom prompts for the AI Agent
- Adjust the spreadsheet schema for reporting needs
- Configure retries for failed tool runs

**Analytics and monitoring**
- Track tool usage and costs
- Monitor crawl and search success rates
- Log queries and outputs for auditing

## Troubleshooting and support
- **Timeouts:** Manually re-run failed MCP executions
- **Data gaps:** Validate Firecrawl/Apify selectors
- **Spreadsheet errors:** Check Google Sheets API quotas
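The "Generate Spreadsheet Content" step and the automatic formatting rules described above can be sketched as an n8n Code-node-style transform. The input field names (`url`, `title`, `description`, `source`, `metadata`) are assumptions, since the real shape depends on how the agent aggregates MCP results; the `HYPERLINK` formula is standard Google Sheets syntax.

```javascript
// Sketch of a "Generate Spreadsheet Content" transform (field names are
// hypothetical; adapt them to your agent's actual output).

// Render a URL as a clickable Google Sheets link; inner double quotes in
// the label are escaped by doubling them, per Sheets formula rules.
function asHyperlink(url, label) {
  const safeLabel = String(label).replace(/"/g, '""');
  return `=HYPERLINK("${url}", "${safeLabel}")`;
}

// Map one aggregated MCP result to the default column layout.
function toRow(item, now = new Date()) {
  return {
    URL: asHyperlink(item.url ?? "", item.title ?? item.url ?? ""),
    Title: item.title ?? "",
    Description: (item.description ?? "").slice(0, 500), // keep cells readable
    Source: item.source ?? "unknown", // Brave / Firecrawl / Apify
    Timestamp: now.toISOString(), // ISO 8601, e.g. 2024-01-15T09:30:00.000Z
    Metadata: JSON.stringify(item.metadata ?? {}),
  };
}

const row = toRow(
  { url: "https://example.com", title: "Example", source: "Brave-MCP" },
  new Date("2024-01-15T09:30:00Z")
);
console.log(row.URL); // =HYPERLINK("https://example.com", "Example")
```

In a real Code node this function would run over `$input.all()` and return the rows for the "Update Spreadsheet" node to insert.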
by AbSa~
## 🚀 Overview
This workflow automates video uploads from Telegram directly to Google Drive, complete with smart file renaming, Google Sheets logging, and AI assistance via Google Gemini. It's perfect for creators, educators, or organizations that want to streamline video submissions and file management.

## ⚙️ How It Works
1. **Telegram Trigger**: Starts the workflow when a user sends a video file to your Telegram bot.
2. **Switch Node**: Detects the file type or command and routes the flow accordingly.
3. **Get File**: Downloads the Telegram video file.
4. **Upload to Google Drive**: Automatically uploads the video to your chosen Drive folder.
5. **Smart Rename**: Auto-formats the file name using dynamic logic (date, username, or custom tags).
6. **Google Sheets Logging**: Appends or updates upload data (e.g., filename, sender, timestamp) for easy tracking.
7. **AI Agent Integration**: Uses Google Gemini AI connected to Data Vidio memory to analyze or respond intelligently to user queries.
8. **Telegram Notification**: Sends confirmation or status messages back to Telegram.

## 🧠 Highlights
- Seamlessly integrates Telegram → Google Drive → Google Sheets → Gemini AI
- Supports file update or append mode
- Auto-rename logic via the Code node
- Works with custom memory tools for smarter AI responses
- Easy to clone and adapt: just connect your own credentials

## 🪄 Ideal Use Cases
- Video assignment submissions for schools or academies
- Media upload management for marketing teams
- Automated video archiving and AI-assisted review
- Personal Telegram-to-Drive backup assistant

## 🧩 Setup Tips
1. Copy and use the provided Google Sheet template (SheetTemplate)
2. Configure your Telegram Bot token, Google Drive, and Sheets credentials
3. Update the AI Agent node with your Gemini API key and connect the Data Vidio sheet
4. Test with a sample Telegram video before full automation
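The Smart Rename step could be implemented in the Code node roughly as below; the naming pattern (date, username, tag) and the input fields are assumptions, since the template's actual rename logic isn't shown in this description.

```javascript
// Hypothetical smart-rename helper for the Code node.
// Builds names like "2024-01-15_alice_lecture_01.mp4" from the upload date,
// the Telegram username, and an optional tag.
function smartRename(username, tag, date = new Date(), ext = "mp4") {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  // Replace characters that are awkward in Drive file names.
  const clean = (s) => String(s).replace(/[^\w-]+/g, "_");
  return `${day}_${clean(username)}_${clean(tag)}.${ext}`;
}

console.log(smartRename("alice", "lecture 01", new Date("2024-01-15")));
// => 2024-01-15_alice_lecture_01.mp4
```

In the workflow, the username and tag would come from the Telegram Trigger payload (e.g. the sender's handle and the message caption), and the result would be passed to the Google Drive node's file-name field.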
by Davide
This is an example of an advanced automated data extraction and enrichment pipeline built with ScrapeGraphAI. Its primary purpose is to systematically scrape the n8n community workflows website, extract detailed information about recently added workflows, process that data using multiple AI models, and store the structured results in a Google Sheets spreadsheet.

This workflow demonstrates a sophisticated use of n8n that moves beyond simple API calls into intelligent, AI-driven web scraping and data processing, turning unstructured website content into valuable, structured business intelligence.

## Key Advantages
✅ **Full Automation**: Once triggered (manually or on a schedule via the Schedule Trigger node), the entire process runs hands-free, from data collection to spreadsheet population.

✅ **Powerful AI-Augmented Scraping**: It doesn't just scrape raw HTML. It uses multiple AI agents (Google Gemini, OpenAI) to:
- Understand page structure to find the right data on the main list.
- Clean and purify content from individual pages, removing irrelevant information.
- Perform precise information extraction to parse unstructured text into structured JSON data based on a defined schema (author, price, etc.).
- Generate intelligent summaries, adding significant value by explaining the workflow's purpose in Italian.

✅ **Robust and Structured Data Output**: The use of the Structured Output Parser and Information Extractor nodes ensures the data is clean, consistent, and ready for analysis. It outputs perfectly formatted JSON that maps directly to spreadsheet columns.

✅ **Scalability via Batching**: The Split In Batches and Loop Over Items nodes allow the workflow to process a dynamically sized list of workflows. Whether there are 5 or 50 new workflows, it will process each one sequentially without failing.

✅ **Effective Data Integration**: It seamlessly integrates with Google Sheets, acting as a simple and powerful database.
This makes the collected data immediately accessible, shareable, and available for visualization in tools like Looker Studio.

✅ **Resilience to Website Changes**: By using AI models trained to understand content and context (like "find the 'Recently Added' section" or "find the author's name"), the workflow is more resilient to minor cosmetic changes on the target website than traditional CSS/XPath selectors.

## How It Works
The workflow operates in two main phases.

### Phase 1: Scraping the Main List
- **Trigger**: The workflow can be started manually ("Execute Workflow") or automatically on a schedule.
- **Scraping**: The "Scrape main page" node (using ScrapeGraphAI) fetches and converts the https://n8n.io/workflows/ page into clean Markdown format.
- **Data Extraction**: An LLM chain ("Extract 'Recently added'") analyzes the Markdown. It is specifically instructed to identify all workflow titles and URLs within the "Recently Added" section and output them as a structured JSON array named `workflows`.
- **Data Preparation**: The resulting array is set as a variable and then split out into individual items, preparing them for one-by-one processing.

### Phase 2: Processing Individual Workflows
- **Loop**: The "Loop Over Items" node iterates through each workflow URL obtained from Phase 1.
- **Scrape & Clean Detail Page**: For each URL, the "Scrape single Workflow" node fetches the detail page. Another LLM chain ("Main content") cleans the resulting Markdown, removing superfluous content and focusing only on the core article text.
- **Information Extraction**: The cleaned Markdown is passed to an "Information Extractor" node. This uses a language model to locate and structure specific data points (title, URL, ID, author, categories, price) into a defined JSON schema.
- **Summarization**: The cleaned Markdown is also sent to a Google Gemini node ("Summarization content"), which generates a concise Italian summary of the workflow's purpose and the tools it uses.
- **Data Consolidation & Export**: The extracted information and the generated summary are merged into a single data object. Finally, the "Add row" node maps all this data to the appropriate columns and appends it as a new row in a designated Google Sheet.

## Set Up Steps
To run this workflow, configure the following credentials in your n8n instance:
- **ScrapeGraphAI Account**: The "Scrape main page" and "Scrape single Workflow" nodes require valid ScrapeGraphAI API credentials named ScrapegraphAI account. Install the related community node.
- **Google Gemini Account**: Multiple nodes ("Google Gemini Chat Model", "Summarization content", etc.) require API credentials for Google Gemini named Google Gemini(PaLM) (Eure).
- **OpenAI Account**: The "OpenAI Chat Model1" node requires API credentials for OpenAI named OpenAi account (Eure).
- **Google Sheets Account**: The "Add row" node requires OAuth2 credentials for Google Sheets named Google Sheets account. Ensure the node is configured with the correct Google Sheet ID and that the sheet has a worksheet named Foglio1 (or update the node to match your sheet's name).

Need help customizing? Contact me for consulting and support, or add me on LinkedIn.
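The Information Extraction step in Phase 2 relies on a defined JSON schema. The template's exact schema isn't reproduced in this description, so the following is an assumed sketch covering the data points it lists (title, URL, ID, author, categories, price); treat field names and required fields as illustrative.

```javascript
// Assumed JSON Schema for the "Information Extractor" node; field names and
// required fields are illustrative, not the template's actual configuration.
const workflowSchema = {
  type: "object",
  properties: {
    title: { type: "string", description: "Workflow title" },
    url: { type: "string", description: "Canonical workflow URL" },
    id: { type: "string", description: "Workflow ID from the page or URL" },
    author: { type: "string", description: "Author's display name" },
    categories: { type: "array", items: { type: "string" } },
    price: { type: "string", description: "Listed price, or 'Free'" },
  },
  required: ["title", "url"],
};

console.log(Object.keys(workflowSchema.properties).length); // 6
```

A schema like this is what lets the node return JSON that maps one-to-one onto the Google Sheet's columns in the "Add row" node.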