by Khairul Muhtadin
Transform raw investment memorandums and financial decks into comprehensive, professional Due Diligence (DD) PDF reports. This workflow automates document parsing via LlamaParse, enriches internal data with real-time web intelligence using Decodo, and uses an AI Agent to synthesize structured financial analysis, risk assessments, and investment theses.

## Why Use This Workflow?
- **Time Savings:** Reduces initial deal screening and report generation from 6–8 hours of manual analysis to under 5 minutes.
- **Accuracy & Depth:** Employs a multi-query RAG (Retrieval-Augmented Generation) strategy that cross-references internal deal documents with verified external web evidence.
- **Cost Reduction:** Eliminates the need for expensive junior analyst hours for preliminary data gathering and document summarization.
- **Scalability:** Processes multiple deals simultaneously while maintaining a consistent reporting standard across your entire pipeline.

## Ideal For
- **Venture Capital & Private Equity:** Rapidly assessing incoming pitch decks and CIMs (Confidential Information Memorandums).
- **M&A Advisory Teams:** Automating the creation of standardized target company profiles and risk summaries.
- **Investment Analysts:** Generating structured data from unstructured PDFs to feed into internal valuation models.

## How It Works
1. **Trigger:** A webhook receives document uploads (PDF, DOCX, PPTX) via a custom portal or API.
2. **Data Collection:** LlamaParse converts complex document layouts into clean Markdown, preserving tables and financial structures.
3. **Processing:** The workflow generates a unique "Deal ID" from the filename to ensure data isolation, and uses Pinecone as a caching layer to avoid redundant parsing.
4. **Intelligence Layer:**
   - **Web Enrichment:** The workflow derives the target company name and uses Decodo to scrape official websites for "About" and "Commercial Risk" data.
   - **Multi-Query RAG:** An OpenAI-powered agent executes six specific retrieval queries (Financials, Risks, Business Model, etc.) to gather evidence from all sources.
5. **Output & Delivery:** Analysis is mapped to a structured template, rendered into a professional HTML report, and converted to a high-quality PDF using Puppeteer.
6. **Storage & Logging:** The final report is uploaded to Cloudflare R2, and a secure public URL is returned to the user instantly.

## Setup Guide

### Prerequisites

| Requirement | Type | Purpose |
| --- | --- | --- |
| n8n instance | Essential | Core automation and workflow orchestration |
| LlamaIndex Cloud | Essential | High-accuracy document parsing (LlamaParse) |
| Pinecone | Essential | Vector database for document and web evidence storage |
| OpenAI API | Essential | LLM for embeddings and expert analysis (Embedding Small & GPT-5.2) |
| Decodo API | Essential | Real-time web searching and markdown scraping |
| R2 Bucket | Essential | Secure storage for the generated PDF reports |

### Installation Steps
1. Import the JSON file to your n8n instance.
2. Configure credentials:
   - **OpenAI:** Add your API key for embeddings and the Chat Model.
   - **Pinecone:** Enter your API key and index name (default: `poc`).
   - **LlamaIndex:** Add your API key under Header Auth (`Authorization: Bearer YOUR_KEY`).
   - **Decodo:** Set up your Decodo API credentials for web search and scraping.
   - **AWS S3:** Configure your R2 bucket name and access keys (R2 is S3-compatible, so the S3 node is used).
3. Update environment-specific values: in the "Build Public Report URL" node, update the `baseUrl` to match your bucket's public endpoint or CDN.
4. Test execution: send a POST request to the webhook URL with a binary file (e.g., a pitch deck) to verify end-to-end generation.
## Technical Details

### Core Nodes

| Node | Purpose | Key Configuration |
| --- | --- | --- |
| LlamaParse (HTTP) | Document conversion | Uses the `/parsing/upload` and `/job/result` endpoints for high-fidelity Markdown |
| Pinecone Vector Store | Context storage | Implements namespace-based isolation using the unique `dealId` |
| Decodo Search/Scrape | Web intelligence | Dynamically identifies the official domain and extracts corporate metadata |
| AI Agent | Strategic analysis | Configured with a "Senior Investment Analyst" system prompt and 6-step retrieval logic |
| Puppeteer | PDF generation | Renders the styled HTML report into a print-ready A4 PDF |

### Workflow Logic
The workflow uses a multi-query retrieval strategy. Instead of asking one generic question, the AI Agent is forced to perform six distinct searches against the vector database (Revenue History, Key Risks, etc.). This ensures that even if a document is 100 pages long, the AI doesn't "miss" critical financial tables or risk disclosures buried in the text.

## Customization Options

### Basic Adjustments
- **Report Styling:** Edit the "Render DD Report HTML" node to match your firm's branding (logo, colors, fonts).
- **Analysis Scope:** Modify the AI Agent's prompt to include specific metrics (e.g., "ESG Score" or "Technical Debt Assessment").

### Advanced Enhancements
- **Slack/Email Integration:** Instead of just a storage link, have n8n send the PDF directly to a #new-deals Slack channel.
- **CRM Sync:** Automatically create a new record in HubSpot or Salesforce with the structured JSON output attached.
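In code terms, the multi-query strategy looks roughly like this (query wording and the retriever interface are assumptions for illustration; real vector-store calls are asynchronous):

```javascript
// Six fixed retrieval angles instead of one generic question.
const DD_QUERIES = [
  "revenue history and growth",
  "key risks and risk disclosures",
  "business model and pricing",
  "market size and competition",
  "team and governance",
  "financial projections",
];

// `retrieve` stands in for a Pinecone query scoped to the deal's namespace.
function gatherEvidence(retrieve, dealId) {
  const seen = new Map();
  for (const query of DD_QUERIES) {
    for (const hit of retrieve({ namespace: dealId, query, topK: 5 })) {
      seen.set(hit.id, hit); // de-duplicate chunks surfaced by several queries
    }
  }
  return [...seen.values()];
}
```

Each query pulls its own top-k chunks, so evidence buried deep in a long document surfaces as long as it matches at least one of the six angles.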
## Troubleshooting

| Problem | Cause | Solution |
| --- | --- | --- |
| Parsing timeout | File is too large for synchronous processing | Increase the "Wait" node duration or check LlamaParse job limits |
| Low analysis quality | Insufficient context in documents | Ensure documents are text-based PDFs (not scans) or enable OCR in LlamaParse |
| PDF layout broken | CSS incompatibility in Puppeteer | Simplify CSS in the HTML node; avoid complex Flexbox/Grid if the Puppeteer version is older |

## Use Case Examples

**Scenario 1: Venture Capital Deal Screening**
- Challenge: A VC associate receives 20 pitch decks a day and spends hours manually summarizing company profiles.
- Solution: This workflow parses the deck and scrapes the startup's site to verify claims.
- Result: The associate receives a 3-page PDF summary for every deck, allowing them to reject or move forward in seconds.

**Scenario 2: Private Equity Due Diligence**
- Challenge: Analyzing a 150-page CIM (Confidential Information Memorandum) for specific financial "red flags."
- Solution: The AI Agent is programmed to hunt specifically for customer concentration and margin fluctuations.
- Result: Consistent risk identification across all deals, regardless of which analyst is assigned to the project.

Created by: Khmuhtadin
Category: Business Intelligence | Tags: Decodo, AI, RAG, Due Diligence, LlamaIndex, Pinecone
Need custom workflows? Contact us
Connect with the creator: Portfolio • Store • LinkedIn • Medium • Threads
by Atta
Never guess your SEO strategy again. This advanced workflow automates the most time-consuming part of SEO: auditing competitor articles and identifying exactly where your brand can outshine them. It extracts deep content from top-ranking URLs, compares it against your specific brand identity, and generates a ready-to-use "Action Plan" for your content team.

The workflow uses Decodo for high-fidelity scraping, Gemini 2.5 Flash for strategic gap analysis, and Google Sheets as a dynamic "Brand Brain" and reporting dashboard.

## ✨ Key Features
- **Brand-Centric Auditing:** Unlike generic SEO tools, this engine uses a live Google Sheet containing your **Brand Identity** to find "Content Gaps" specific to your unique value proposition.
- **Automated SERP Itemization:** Converts a simple list of keywords into a filtered list of top-performing competitor URLs.
- **Deep Markdown Extraction:** Uses Decodo Universal to bypass bot-blockers and extract clean Markdown content, preserving headers and structure for high-fidelity AI analysis.
- **Structured Action Plans:** Outputs machine-readable JSON containing the competitor's H1, their "Winning Factor," and a one-sentence "Checkmate" instruction for your writers.

## ⚙️ How it Works
1. **Data Foundation:** The workflow triggers (manual or scheduled) and pulls your Global Config (e.g., result limits) and Brand Identity from a dedicated Google Sheet.
2. **Market Discovery:** It retrieves your target keywords and uses the Decodo Google Search node to identify the top competitors. A Code node then "itemizes" these results into individual URLs.
3. **Intelligence Harvesting:** Decodo Universal scrapes each URL, and an HTML 5 node extracts the body content into Markdown format to minimize token noise for the AI.
4. **Strategic Audit:** The AI Content Auditor (powered by Gemini) receives the competitor's text and your Brand Identity. It identifies what the competitor missed that your brand excels at.
5. **Reporting Deck:** The final Strategy Master Writer node appends the analysis, including the "Content Gap" and "Action Plan," into a master Google Sheet for your marketing team.

## 📥 Component Installation
This workflow relies on the Decodo node for search and scraping precision.
1. **Install node:** Click the + button in n8n, search for "Decodo," and add it to your canvas.
2. **Credentials:** Use your Decodo API key. (Tip: use a residential proxy setting for difficult sites like Reddit or Stripe.)
3. **Gemini:** Ensure you have the Google Gemini Chat Model node connected to the AI Agent.

🎁 Get a free Web Scraping API subscription here 👉🏻 https://visit.decodo.com/X4YBmy

## 🛠️ Setup Instructions

### 1. Google Sheets Configuration
Create a spreadsheet with the following three tabs:
- **Target Keywords**: one column named Target Keyword.
- **Brand Identity**: one cell containing your brand mission, USPs, and target audience.
- **Competitor Audit Feed**: headers for Keyword, URL, Rank, Winning Factor, Content Gap, and Action Plan.

Clone the spreadsheet here.

### 2. Global Configuration
In the Config (Set) node, define your `serp_results_amount` (e.g., 10). This controls how many competitors are analyzed per keyword.

## ➕ How to Adapt the Template
- **Competitor Exclusion:** Add a **Filter** node after "Market Discovery" to automatically skip domains like amazon.com or reddit.com if they aren't relevant to your niche.
- **Slack Alerts:** Connect a **Slack** node after the AI analysis to notify your content manager immediately when a high-impact "Action Plan" is generated for a priority keyword.
- **Multi-Model Verification:** Swap Gemini with **Claude 3.5 Sonnet** or **GPT-4o** in the Strategic Audit section to compare different AI perspectives on the same competitor content.
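The "itemize" Code node mentioned under Market Discovery can be sketched like this (the field names on the Decodo response are assumptions; check your node's actual output shape):

```javascript
// Flatten a SERP response into one n8n item per competitor URL,
// capped by the serp_results_amount value from the Config node.
function itemizeSerp(results, limit) {
  return results
    .filter((r) => typeof r.url === "string" && r.url.startsWith("http"))
    .slice(0, limit)
    .map((r, i) => ({ json: { url: r.url, title: r.title, rank: i + 1 } }));
}
```

Returning one `{ json: … }` object per URL is what lets the downstream scrape and audit nodes run once per competitor.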
by Mariela Slavenova
This template crawls a website from its sitemap, deduplicates URLs in Supabase, scrapes pages with Crawl4AI, cleans and validates the text, then stores content and metadata in a Supabase vector store using OpenAI embeddings. It's a reliable, repeatable pipeline for building searchable knowledge bases, SEO research corpora, and RAG datasets.

⸻

Good to know
• Built-in de-duplication via a scrape_queue table (status: pending/completed/error).
• Resilient flow: waits, retries, and marks failed tasks.
• Costs depend on Crawl4AI usage and OpenAI embeddings.
• Replace any placeholders (API keys, tokens, URLs) before running.
• Respect website robots/ToS and applicable data laws when scraping.

How it works
1. Sitemap fetch & parse — load sitemap.xml and extract all URLs.
2. De-dupe — normalize URLs, check the Supabase scrape_queue, and insert only new ones.
3. Scrape — send URLs to Crawl4AI and poll task status until completed.
4. Clean & score — remove boilerplate/markup, detect content type, compute quality metrics, and extract metadata (title, domain, language, length).
5. Chunk & embed — split text and create OpenAI embeddings.
6. Store — upsert into the Supabase vector store (documents) with metadata; update job status.

Requirements
• Supabase (Postgres + Vector extension enabled)
• Crawl4AI API key (or header auth)
• OpenAI API key (for embeddings)
• n8n credentials set for HTTP, Postgres/Supabase

How to use
1. Configure credentials (Supabase/Postgres, Crawl4AI, OpenAI).
2. (Optional) Run the provided SQL to create scrape_queue and documents.
3. Set your sitemap URL in the HTTP Request node.
4. Execute the workflow (manual trigger) and monitor Supabase statuses.
5. Query your documents table or vector store from your app/RAG stack.
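The de-dupe step can be approximated as follows (the exact normalization rules are an assumption; in the template the membership check runs against the scrape_queue table in Supabase rather than an in-memory set):

```javascript
// Normalize a URL so trivially different forms collapse to one queue entry:
// strip fragments and common tracking parameters, drop a trailing slash.
function normalizeUrl(raw) {
  const u = new URL(raw);
  u.hash = "";
  for (const p of ["utm_source", "utm_medium", "utm_campaign"]) u.searchParams.delete(p);
  const s = u.toString();
  return s.endsWith("/") ? s.slice(0, -1) : s;
}

// Keep only sitemap URLs that are not already queued.
function newUrls(sitemapUrls, alreadyQueued) {
  const queued = new Set(alreadyQueued.map(normalizeUrl));
  return [...new Set(sitemapUrls.map(normalizeUrl))].filter((u) => !queued.has(u));
}
```

Normalizing before the lookup is what prevents `/page`, `/page/`, and `/page?utm_source=x` from being scraped three times.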
Potential Use Cases
This automation is ideal for:
• Market research teams collecting competitive data
• Content creators monitoring web trends
• SEO specialists tracking website content updates
• Analysts gathering structured data for insights
• Anyone needing reliable, structured web content for analysis

Need help customizing? Contact me for consulting and support: LinkedIn
by Growth AI
# SEO Content Generation Workflow - n8n Template Instructions

## Who's it for
This workflow is designed for SEO professionals, content marketers, digital agencies, and businesses who need to generate optimized meta tags, H1 headings, and content briefs at scale. It is perfect for teams managing multiple clients or large keyword lists who want to automate competitor analysis and SEO content creation while maintaining quality and personalization.

## How it works
The workflow automates the entire SEO content creation process by analyzing your target keywords against top competitors, then generating optimized meta elements and comprehensive content briefs. It uses AI-powered analysis combined with real competitor data to create SEO-friendly content tailored to your specific business context.

The system processes keywords in batches, performs Google searches, scrapes competitor content, analyzes heading structures, and generates personalized SEO content using your company's database information for maximum relevance.
## Requirements

### Required Services and Credentials
- **Google Sheets API**: for reading configuration and updating results
- **Anthropic API**: for AI content generation (Claude Sonnet 4)
- **OpenAI API**: for embeddings and vector search
- **Apify API**: for Google search results
- **Firecrawl API**: for competitor website scraping
- **Supabase**: for the vector database (optional but recommended)

### Template Spreadsheet
Copy this template spreadsheet and configure it with your information: Template Link

## How to set up

### Step 1: Copy and Configure Template
1. Make a copy of the template spreadsheet.
2. Fill in the Client Information sheet:
   - Client name: your company or client's name
   - Client information: brief business description
   - URL: website address
   - Supabase database: database name (prevents AI hallucination)
   - Tone of voice: content style preferences
   - Restrictive instructions: topics or approaches to avoid
3. Complete the SEO sheet with your target pages:
   - Page: the page you're optimizing (e.g., "Homepage", "Product Page")
   - Keyword: main search term to target
   - Awareness level: user familiarity with your business
   - Page type: category (homepage, blog, product page, etc.)
### Step 2: Import Workflow
1. Import the n8n workflow JSON file.
2. Configure all required API credentials in n8n:
   - Google Sheets OAuth2
   - Anthropic API key
   - OpenAI API key
   - Apify API key
   - Firecrawl API key
   - Supabase credentials (if using the vector database)

### Step 3: Test Configuration
1. Activate the workflow.
2. Send your Google Sheets URL to the chat trigger.
3. Verify that all sheets are readable and credentials work.
4. Test with a single keyword row first.

## Workflow Process Overview

### Phase 0: Setup and Configuration
- Copy the template spreadsheet
- Configure client information and SEO parameters
- Set up API credentials in n8n

### Phase 1: Data Input and Processing
- Chat trigger receives the Google Sheets URL
- System reads client configuration and SEO data
- Filters valid keywords and empty H1 fields
- Initiates batch processing

### Phase 2: Competitor Research and Analysis
- Searches Google for the top 10 results per keyword
- Scrapes the first 5 competitor websites
- Extracts heading structures (H1–H6)
- Analyzes competitor meta tags and content organization

### Phase 3: Meta Tags and H1 Generation
- AI analyzes keyword context and competitor data
- Accesses the client database for personalization
- Generates an optimized meta title (65 chars max)
- Creates a compelling meta description (165 chars max)
- Produces a user-focused H1 (70 chars max)

### Phase 4: Content Brief Creation
- Analyzes search intent percentages
- Develops a content strategy based on competitor analysis
- Creates a detailed MECE page structure
- Suggests rich media elements
- Provides writing recommendations and detail-level scoring

### Phase 5: Data Integration and Updates
- Combines all generated content into a unified structure
- Updates Google Sheets with the new SEO elements
- Preserves existing data while adding new content
- Continues batch processing for remaining keywords

## How to customize the workflow

### Adjusting AI Models
- Replace Anthropic Claude with other LLM providers
- Modify system prompts for different content styles
- Adjust character limits for meta elements

### Modifying Competitor Analysis
- Change the number of competitors analyzed (currently 5)
- Adjust scraping parameters in the Firecrawl nodes
- Modify heading extraction logic in the JavaScript nodes

### Customizing Output Format
- Update the Google Sheets column mapping in the Code node
- Modify the structured output parser schema
- Change the batch size in the Split in Batches node

### Adding Quality Controls
- Insert validation nodes between phases
- Add error handling and retry logic
- Implement content quality scoring

### Extending Functionality
- Add keyword research capabilities
- Include image optimization suggestions
- Integrate social media content generation
- Connect to CMS platforms for direct publishing

## Best Practices
- Test with small batches before processing large keyword lists
- Monitor API usage and costs across all services
- Regularly update system prompts based on output quality
- Maintain clean data in your Google Sheets template
- Use descriptive node names for easier workflow maintenance

## Troubleshooting
- **API errors**: check credential configuration and usage limits
- **Scraping failures**: Firecrawl nodes have error handling enabled
- **Empty results**: verify keyword formatting and competitor availability
- **Sheet updates**: ensure proper column mapping in the final Code node
- **Processing stops**: check batch processing limits and timeout settings
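As a rough illustration of the heading-extraction logic (a sketch, not the template's actual JavaScript node), a Code node can pull an H1–H6 outline out of scraped markdown like this:

```javascript
// Collect markdown ATX headings with their levels to rebuild a page outline.
function extractHeadings(markdown) {
  const headings = [];
  for (const line of markdown.split("\n")) {
    const m = line.match(/^(#{1,6})\s+(.*\S)\s*$/);
    if (m) headings.push({ level: m[1].length, text: m[2] });
  }
  return headings;
}
```

The resulting `{ level, text }` list is what lets the AI compare competitor page structures without ingesting full page bodies.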
by Growth AI
# SEO Content Generation Workflow (Basic Version) - n8n Template Instructions

## Who's it for
This workflow is designed for SEO professionals, content marketers, digital agencies, and businesses who need to generate optimized meta tags, H1 headings, and content briefs at scale. It is perfect for teams managing multiple clients or large keyword lists who want to automate competitor analysis and SEO content creation without the complexity of vector databases.

## How it works
The workflow automates the entire SEO content creation process by analyzing your target keywords against top competitors, then generating optimized meta elements and comprehensive content briefs. It uses AI-powered analysis combined with real competitor data to create SEO-friendly content tailored to your specific business context.

The system processes keywords in batches, performs Google searches, scrapes competitor content, analyzes heading structures, and generates personalized SEO content using your company information for maximum relevance.

## Requirements

### Required Services and Credentials
- **Google Sheets API**: for reading configuration and updating results
- **Anthropic API**: for AI content generation (Claude Sonnet 4)
- **Apify API**: for Google search results
- **Firecrawl API**: for competitor website scraping

### Template Spreadsheet
Copy this template spreadsheet and configure it with your information: Template Link

## How to set up

### Step 1: Copy and Configure Template
1. Make a copy of the template spreadsheet.
2. Fill in the Client Information sheet:
   - Client name: your company or client's name
   - Client information: brief business description
   - URL: website address
   - Tone of voice: content style preferences
   - Restrictive instructions: topics or approaches to avoid
3. Complete the SEO sheet with your target pages:
   - Page: the page you're optimizing (e.g., "Homepage", "Product Page")
   - Keyword: main search term to target
   - Awareness level: user familiarity with your business
   - Page type: category (homepage, blog, product page, etc.)
### Step 2: Import Workflow
1. Import the n8n workflow JSON file.
2. Configure all required API credentials in n8n:
   - Google Sheets OAuth2
   - Anthropic API key
   - Apify API key
   - Firecrawl API key

### Step 3: Test Configuration
1. Activate the workflow.
2. Send your Google Sheets URL to the chat trigger.
3. Verify that all sheets are readable and credentials work.
4. Test with a single keyword row first.

## Workflow Process Overview

### Phase 0: Setup and Configuration
- Copy the template spreadsheet
- Configure client information and SEO parameters
- Set up API credentials in n8n

### Phase 1: Data Input and Processing
- Chat trigger receives the Google Sheets URL
- System reads client configuration and SEO data
- Filters valid keywords and empty H1 fields
- Initiates batch processing

### Phase 2: Competitor Research and Analysis
- Searches Google for the top 10 results per keyword using Apify
- Scrapes the first 5 competitor websites using Firecrawl
- Extracts heading structures (H1–H6) from competitor pages
- Analyzes competitor meta tags and content organization
- Processes markdown content to identify heading hierarchies

### Phase 3: Meta Tags and H1 Generation
- AI analyzes keyword context and competitor data using Claude
- Incorporates client information for personalization
- Generates an optimized meta title (65 characters maximum)
- Creates a compelling meta description (165 characters maximum)
- Produces a user-focused H1 (70 characters maximum)
- Uses structured output parsing for consistent formatting

### Phase 4: Content Brief Creation
- Analyzes search intent percentages (informational, transactional, navigational)
- Develops a content strategy based on competitor analysis
- Creates a detailed MECE page structure with H2 and H3 sections
- Suggests rich media elements (images, videos, infographics, tables)
- Provides writing recommendations and detail-level scoring (1–10 scale)
- Ensures SEO optimization while maintaining user relevance

### Phase 5: Data Integration and Updates
- Combines all generated content into a unified structure
- Updates Google Sheets with the new SEO elements
- Preserves existing data while adding new content
- Continues batch processing for remaining keywords

## Key Differences from the Advanced Version
This basic version focuses on core SEO functionality without additional complexity:
- **No vector database**: removes the Supabase integration for simpler setup
- **Streamlined architecture**: fewer dependencies and configuration steps
- **Essential features only**: core competitor analysis and content generation
- **Faster setup**: reduced time to deployment
- **Lower costs**: fewer API services required

## How to customize the workflow

### Adjusting AI Models
- Replace Anthropic Claude with other LLM providers in the agent nodes
- Modify system prompts for different content styles or languages
- Adjust character limits for meta elements in the structured output parser

### Modifying Competitor Analysis
- Change the number of competitors analyzed (currently 5) by adding or removing Scrape nodes
- Adjust scraping parameters in the Firecrawl nodes for different content types
- Modify heading extraction logic in the JavaScript Code nodes

### Customizing Output Format
- Update the Google Sheets column mapping in the final Code node
- Modify the structured output parser schema for different data structures
- Change the batch size in the Split in Batches node

### Adding Quality Controls
- Insert validation nodes between workflow phases
- Add error handling and retry logic to critical nodes
- Implement content quality scoring mechanisms

### Extending Functionality
- Add keyword research capabilities with additional APIs
- Include image optimization suggestions
- Integrate social media content generation
- Connect to CMS platforms for direct publishing

## Best Practices

### Setup and Testing
- Always test with small batches before processing large keyword lists
- Monitor API usage and costs across all services
- Regularly update system prompts based on output quality
- Maintain clean data in your Google Sheets template

### Content Quality
- Review generated content before publishing
- Customize system prompts to match your brand voice
- Use descriptive node names for easier workflow maintenance
- Keep competitor analysis current by running the workflow regularly

### Performance Optimization
- Process keywords in small batches to avoid timeouts
- Set appropriate retry policies for external API calls
- Monitor workflow execution times and optimize bottlenecks

## Troubleshooting

### API Errors
- Check credential configuration in n8n settings
- Verify API usage limits and billing status
- Ensure proper authentication for each service

### Scraping Failures
- Firecrawl nodes have error handling enabled to continue on failures
- Some websites may block scraping; this is normal behavior
- Check that competitor URLs are accessible and valid

### Empty Results
- Verify keyword formatting in Google Sheets
- Ensure competitor websites contain the expected content structure
- Check that meta tags are properly formatted in the system prompts

### Sheet Update Errors
- Ensure proper column mapping in the final Code node
- Verify Google Sheets permissions and sharing settings
- Check that target sheet names match exactly

### Processing Stops
- Review batch processing limits and timeout settings
- Check for errors in individual nodes using execution logs
- Verify all required fields are populated in the input data

## Template Structure

### Required Sheets
- **Client Information**: business details and configuration
- **SEO**: target keywords and page information
- **Results Sheet**: where generated content will be written

### Expected Columns
- **Keywords**: target search terms
- **Description**: brief page description
- **Type de page**: page category
- **Awareness level**: user familiarity level
- **title, meta-desc, h1, brief**: generated output columns

This streamlined version provides all essential SEO content generation capabilities while being easier to set up and maintain than the advanced version with vector database integration.
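The character limits above lend themselves to a simple validation step; a hypothetical Code node between generation and the sheet update might look like this (the node and field names are illustrative, not part of the template):

```javascript
// Flag generated meta elements that miss the limits used by this template:
// 65 chars for the meta title, 165 for the description, 70 for the H1.
const LIMITS = { title: 65, metaDescription: 165, h1: 70 };

function validateMeta(fields) {
  const errors = [];
  for (const [key, max] of Object.entries(LIMITS)) {
    const value = (fields[key] || "").trim();
    if (!value) errors.push(`${key} is empty`);
    else if (value.length > max) errors.push(`${key} exceeds ${max} characters`);
  }
  return errors;
}
```

Rows that return errors can be routed back through the agent for regeneration instead of being written to the sheet.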
by Avkash Kakdiya
## How it works
This workflow enriches and personalizes your lead profiles by integrating HubSpot contact data, scraping social media information, and using AI to generate tailored outreach emails. It streamlines the process from contact capture to sending a personalized email, all automatically.

The system fetches new or updated HubSpot contacts, verifies and enriches their Twitter/LinkedIn data via Phantombuster, merges the profile and engagement insights, and finally generates a customized email ready for outreach.

## Step-by-step

### 1. Trigger & Input
- **HubSpot Contact Webhook**: fires when a contact is created or updated in HubSpot.
- **Fetch Contact**: pulls the full contact details (email, name, company, and social profiles).
- **Update Google Sheet**: logs Twitter/LinkedIn usernames and marks their tracking status.

### 2. Validation
- **Validate Twitter/LinkedIn Exists**: checks that the contact has a valid social profile before proceeding to scraping.

### 3. Social Media Scraping (via Phantombuster)
- **Launch Profile Scraper & 🎯 Launch Tweet Scraper**: triggers Phantombuster agents to fetch profile details and recent tweets.
- **Wait nodes**: ensure scraping completes (30–60 seconds).
- **Fetch Profile/Tweet Results**: retrieves output files from Phantombuster.
- **Extract URL**: parses the job output to extract the downloadable .json or .csv data file link.

### 4. Data Download & Parsing
- **Download Profile/Tweet Data**: downloads the scraped JSON files.
- **Parse JSON**: converts the raw file into structured data for processing.

### 5. Data Structuring & Merging
- **Format Profile Fields**: maps stats like bio, followers, verified status, likes, etc.
- **Format Tweet Fields**: captures tweet data and associates it with the lead's email.
- **Merge Data Streams**: combines the tweet and profile datasets.
- **Combine All Data**: produces a single, clean object containing all relevant lead details.

### 6. AI Email Generation & Delivery
- **Generate Personalized Email**: feeds the merged data into OpenAI GPT (via LangChain) to craft a custom HTML email using your brand details.
- **Parse Email Content**: cleans the AI output into structured subject and body fields.
- **Send Email**: automatically delivers the personalized email to the lead via Gmail.

## Benefits
- **Automated Lead Enrichment**: combines CRM and real-time social media data with zero manual research.
- **Personalized Outreach at Scale**: AI crafts unique, relevant emails for each contact.
- **Improved Engagement Rates**: targeted messages based on actual social activity and profile details.
- **Seamless Integration**: works directly with HubSpot, Google Sheets, Gmail, and Phantombuster.
- **Time & Effort Savings**: replaces hours of manual lookup and email drafting with an end-to-end automated flow.
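A minimal sketch of the "Parse Email Content" step, assuming the AI is prompted to prefix its reply with a `Subject:` line (that prompt convention is an assumption for illustration, not taken from the template):

```javascript
// Split raw model output into a subject line and an HTML body,
// falling back to a generic subject if the expected prefix is missing.
function parseEmail(raw) {
  const m = raw.match(/^Subject:\s*(.+)\r?\n+([\s\S]*)$/);
  if (!m) return { subject: "Quick question", body: raw.trim() };
  return { subject: m[1].trim(), body: m[2].trim() };
}
```

Splitting subject and body into separate fields is what lets the Gmail node map each to its own parameter.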
by Zain Khan
Categories: Business Automation, Customer Support, AI, Knowledge Management This comprehensive workflow enables businesses to build and deploy a custom-trained AI Chatbot in minutes. By combining a sophisticated data scraping engine with a RAG-based (Retrieval-Augmented Generation) chat interface, it allows you to transform website content into a high-performance support agent. Powered by Google Gemini and Pinecone, this system ensures your chatbot provides accurate, real-time answers based exclusively on your business data. Benefits Instant Knowledge Sync** - Automatically crawls sitemaps and URLs to keep your AI up-to-date with your latest website content. Embeddable Anywhere** - Features a ready-to-use chat trigger that can be integrated into the bottom-right of any website via a simple script. High-Fidelity Retrieval** - Uses vector embeddings to ensure the AI "searches" your documentation before answering, reducing hallucinations. Smart Conversational Memory** - Equipped with a 10-message window buffer, allowing the bot to handle complex follow-up questions naturally. Cost-Efficient Scaling** - Leverages Gemini’s efficient API and Pinecone’s high-speed indexing to manage thousands of customer queries at a low cost. How It Works Dual-Path Ingestion: The process begins with an n8n Form where you provide a sitemap or individual URLs. The workflow automatically handles XML parsing and URL cleaning to prepare a list of pages for processing. Clean Content Extraction: Using Decodo, the workflow fetches the HTML of each page and uses a specialized extraction node to strip away code, ads, and navigation, leaving only the high-value text content. SignUp using: dashboard.decodo.com/register?referral_code=55543bbdb96ffd8cf45c2605147641ee017e7900. Vectorization & Storage: The cleaned text is passed to the Gemini Embedding model, which converts the information into 3076-dimensional vectors. These are stored in a Pinecone "supportbot" index for instant retrieval. 
RAG-Powered Chat Agent: When a user sends a message through the chat widget, an AI Agent takes over. It uses the user's query to search the Pinecone database for relevant business facts. Intelligent Response Generation: The AI Agent passes the retrieved facts and the current chat history to Google Gemini, which generates a polite, accurate, and contextually relevant response for the user. Requirements n8n Instance:** A self-hosted or cloud instance of n8n. Google Gemini API Key:** For text embeddings and chat generation. Pinecone Account:** An API key and a "supportbot" index to store your knowledge base. Decodo Access:** For high-quality website content extraction. How to Use Initialize the Knowledge Base: Use the Form Trigger to input your website URL or Sitemap. Run the ingestion flow to populate your Pinecone index. Configure Credentials: Authenticate your Google Gemini and Pinecone accounts within n8n. Deploy the Chatbot: Enable the Chat Trigger node. Use the provided webhook URL to connect the backend to your website's frontend chat widget. Test & Refine: Interact with the bot to ensure it retrieves the correct data, and update your knowledge base by re-running the ingestion flow whenever your website content changes. Business Use Cases Customer Support Teams** - Automate answers to 80% of common FAQs using your existing documentation. E-commerce Sites** - Help customers find product details, shipping policies, and return information instantly. SaaS Providers** - Build an interactive technical documentation assistant to help users navigate your software. Marketing Agencies** - Offer "AI-powered site search" as an add-on service for client websites. Efficiency Gains Reduce Ticket Volume** by providing instant self-service options. Eliminate Manual Data Entry** by scraping content directly from the live website. Improve UX** with 24/7 availability and zero wait times for customers. 
Difficulty Level: Intermediate Estimated Setup Time: 30 min Monthly Operating Cost: Low (variable based on AI usage and Pinecone tier)
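The retrieve-then-generate loop described above can be sketched in plain Python. This is a minimal illustration, not the workflow itself: in n8n the Pinecone and Gemini nodes perform the search and generation, and the helper names here (`top_k_matches`, `build_prompt`) are hypothetical.

```python
import math

def top_k_matches(query_vec, index, k=3):
    """Rank stored text chunks by cosine similarity to the query vector."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vector"]), reverse=True)
    return [item["text"] for item in ranked[:k]]

def build_prompt(question, facts, history):
    """Ground the model: combine retrieved facts and recent chat history."""
    context = "\n".join(f"- {f}" for f in facts)
    # Keep only the last 10 turns, mirroring the 10-message window buffer
    turns = "\n".join(f"{role}: {msg}" for role, msg in history[-10:])
    return (
        "Answer using ONLY the business facts below.\n"
        f"Facts:\n{context}\n\nConversation:\n{turns}\nuser: {question}\nassistant:"
    )
```

Grounding the prompt on retrieved facts (rather than letting the model answer freely) is what keeps responses tied exclusively to your business data.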
by Hugo Le Poole
Generate AI voice receptionist agents for local businesses using VAPI

Automate the creation of personalized AI phone receptionists for local businesses by scraping Google Maps, analyzing websites, and deploying voice agents to VAPI.

Who is this for?

Agencies** offering AI voice solutions to local businesses
Consultants** helping SMBs modernize their phone systems
Developers** building lead generation tools for voice AI services
Entrepreneurs** launching AI receptionist services at scale

What this workflow does

This workflow automates the entire process of creating customized AI voice agents:

Collects business criteria through a form (city, keywords, quantity)
Scrapes Google Maps for matching local businesses using Apify
Fetches and analyzes each business website
Generates tailored voice agent prompts using Claude AI
Automatically provisions voice assistants via the VAPI API
Logs all created agents to Google Sheets for tracking

The AI adapts prompts based on business type (salon, restaurant, dentist, spa) with appropriate tone, services, and booking workflows.
Setup requirements

Apify account** with Google Maps Scraper actor access
Anthropic API key** for prompt generation
OpenRouter API key** for website analysis
VAPI account** with API access
Google Sheets** connected via OAuth

How to set up

Import the workflow template
Add your Apify credentials to the scraping node
Configure Anthropic and OpenRouter API keys
Replace YOUR_VAPI_API_KEY in the HTTP Request node header
Connect your Google Sheets account
Create a Google Sheet with columns: Business Name, Category, Address, Phone, Agent ID, Agent URL
Update the Sheet URL in both Google Sheets nodes
Activate the workflow and submit the form

Customization options

Business templates**: Edit the prompt in "Generate Agent Messages" to add new business categories
Voice settings**: Modify ElevenLabs voice parameters (stability, similarity boost)
LLM model**: Switch between GPT-4, Claude, or other models via OpenRouter
Output format**: Customize the results page HTML in the final Form node
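The "provisions voice assistants via the VAPI API" step boils down to one authenticated POST per business. Below is a sketch of how such a request could be assembled from scraped business data. The endpoint path and field names (`firstMessage`, `similarityBoost`, the model identifier) are assumptions based on VAPI's public API shape; verify them against the current VAPI API reference before use.

```python
def build_vapi_request(business, api_key):
    """Assemble a create-assistant request for a scraped local business.

    Field names and the model string are assumptions to check against
    VAPI's API docs; only the overall shape is illustrated here.
    """
    body = {
        "name": f"{business['name']} Receptionist",
        "firstMessage": f"Thank you for calling {business['name']}! How can I help you today?",
        "model": {
            "provider": "anthropic",
            "model": "claude-3-5-sonnet-20241022",  # assumed; swap for your preferred model
            "messages": [{"role": "system", "content": business["prompt"]}],
        },
        # ElevenLabs voice parameters mirror the template's customization options
        "voice": {"provider": "11labs", "stability": 0.5, "similarityBoost": 0.75},
    }
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    return {"url": "https://api.vapi.ai/assistant", "headers": headers, "json": body}
```

In the workflow itself, the HTTP Request node plays this role, with the `business['prompt']` slot filled by the Claude-generated agent prompt.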
by TOMOMITSU ASANO
Intelligent Invoice Processing with AI Classification and XML Export

Summary

Automated invoice processing pipeline that extracts data from PDF invoices, uses an AI Agent for intelligent expense categorization, generates XML for accounting systems, and routes high-value invoices for approval.

Detailed Description

A comprehensive accounts payable automation workflow that monitors for new PDF invoices, extracts text content, uses AI to classify expenses and detect anomalies, converts the results to XML format for accounting system integration, and implements approval workflows for high-value or unusual invoices.

Key Features

PDF Text Extraction**: Extract from File node parses invoice PDFs automatically
AI-Powered Classification**: AI Agent categorizes expenses, suggests GL codes, detects anomalies
XML Export**: Convert structured data to accounting-compatible XML format
Approval Workflow**: Route invoices over $5,000 or with low confidence for human review
Multi-Trigger Support**: Google Drive monitoring or manual webhook upload
Comprehensive Logging**: Archive all processed invoices to Google Sheets

Use Cases

Accounts payable automation
Expense report processing
Vendor invoice management
Financial document digitization
Audit trail generation

Required Credentials

Google Drive OAuth (for PDF source folder)
OpenAI API key
Slack Bot Token
Gmail OAuth
Google Sheets OAuth

Node Count: 24 (19 functional + 5 sticky notes)

Unique Aspects

Uses Extract from File node for PDF text extraction (rarely used)
Uses XML node for JSON to XML conversion (very rare)
Uses AI Agent node for intelligent classification
Uses Google Drive Trigger for file monitoring
Implements approval workflow with conditional routing
Webhook response** mode for API integration

Workflow Architecture

```
[Google Drive Trigger]    [Manual Webhook]
          |                      |
          +----------+-----------+
                     |
                     v
            [Filter PDF Files]
                     |
                     v
          [Download Invoice PDF]
                     |
                     v
            [Extract PDF Text]
                     |
                     v
        [Parse Invoice Data] (Code)
                     |
                     v
      [AI Invoice Classifier] <-- [OpenAI Chat Model]
                     |
                     v
        [Parse AI Classification]
                     |
                     v
             [Convert to XML]
                     |
                     v
           [Format XML Output]
                     |
                     v
          [Needs Approval?] (If)
              /            \
       Yes (>$5000)      No (Auto)
            |                |
     [Email Approval]  [Slack Notify]
            |                |
            +-------+--------+
                    |
                    v
      [Archive to Google Sheets]
                    |
                    v
         [Respond to Webhook]
```

Configuration Guide

Google Drive: Set the folder ID to monitor in the Drive Trigger node
Approval Threshold: Default $5,000; adjust in the "Needs Approval?" node
Email Recipients: Configure finance-approvers@example.com
Slack Channel: Set #finance-notifications for updates
GL Codes: AI suggests codes; customize in the AI prompt if needed
Google Sheets: Configure the document for the invoice archive
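The "Convert to XML" and "Needs Approval?" steps can be sketched as two small functions. This is an illustration only: `invoice_to_xml` and `needs_approval` are hypothetical helpers, and the flat XML shape shown here stands in for whatever schema your accounting system actually requires.

```python
import xml.etree.ElementTree as ET

APPROVAL_THRESHOLD = 5000.0  # matches the "Needs Approval?" node's default

def invoice_to_xml(invoice):
    """Serialize a parsed invoice dict to an accounting-style XML string."""
    root = ET.Element("Invoice")
    for key, value in invoice.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def needs_approval(invoice, confidence, min_confidence=0.8):
    """Route to human review when the amount is high or the AI is unsure."""
    return invoice["Total"] > APPROVAL_THRESHOLD or confidence < min_confidence
```

Routing on either amount or classification confidence is what lets the workflow auto-process routine invoices while still surfacing anomalies to a human.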
by giangxai
Overview

Automatically generate viral short-form health videos using AI and publish them to social platforms with n8n and Veo 3. This workflow collects viral ideas, analyzes engagement patterns, generates AI video scripts, renders videos with Veo 3, and handles publishing and tracking fully automatically, with no manual editing.

Who is this for?

This template is ideal for:

Content creators building faceless health channels (Shorts, Reels, TikTok)
Affiliate marketers promoting health products with video content
AI marketers running high-volume short-form content funnels
Automation builders combining LLMs, video AI, and n8n
Teams that want a scalable, repeatable system for viral AI video production

If you want to create health-niche videos at scale without manually scripting, rendering, and uploading each video, this workflow is for you.

What problem is this workflow solving?

Creating viral short-form health videos usually involves many manual steps and disconnected tools, such as:

Manually collecting and validating viral content ideas
Writing hooks and scripts for each video
Switching between AI tools for analysis and video generation
Waiting for videos to render and checking status manually
Uploading videos and tracking what has been published

This workflow connects all these steps into a single automated pipeline and removes repetitive manual work.
What this workflow does

This automated AI health video workflow:

Runs on a defined schedule
Collects viral health content ideas from external sources
Normalizes and stores ideas in Google Sheets
Loads pending viral ideas for processing
Analyzes each idea and generates AI-optimized video scripts
Creates AI videos automatically using the Veo 3 API
Waits for video rendering and checks completion status
Retrieves the final rendered videos
Optionally aggregates or merges video assets
Publishes videos to social platforms
Updates Google Sheets with processing and publishing results

The entire process runs end-to-end with minimal human intervention.

Setup

1. Prepare Google Sheets

Create a Google Sheet to manage your content pipeline with columns such as:

idea / topic – Viral idea or source content
analysis – AI analysis or hook summary
script – Generated video script
status – pending / processing / completed / failed
video_url – Final rendered video link
publish_result – Publishing status or notes

Only rows marked as pending will be processed by the workflow.

2. Connect Google Sheets

Authenticate your Google Sheets account in n8n
Select the spreadsheet in the load and update nodes
Ensure the workflow can write status updates back to the same sheet

3. Configure AI & Veo 3

Add credentials for your AI model (e.g. Gemini or similar)
Configure prompt logic for health-niche content
Add your Veo 3 API credentials
Test video creation with a small number of ideas before scaling

4. Configure Publishing & Schedule

Set up publishing credentials for your target social platforms
Open the Schedule triggers and define how often the workflow runs
The schedule controls how frequently new AI health videos are created and published

How to customize this workflow to your needs

You can adapt this workflow without changing the core structure:

Replace viral idea sources with your own research or internal data
Adjust AI prompts for different health sub-niches
Add manual approval steps before video creation
Disable publishing and use the workflow only for video generation
Add retry logic for failed renders or API errors
Extend the workflow with analytics or performance tracking

Best practices

Start with a small batch of test ideas
Keep status values consistent in Google Sheets
Focus on strong hooks for health-related content
Monitor rendering and publishing nodes during early runs
Adjust schedule frequency based on API limits

Documentation

For a full walkthrough and advanced customization ideas, see the Video Guide.
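The sheet's `status` column drives the whole pipeline: only `pending` rows are loaded, and results are written back after each run. A minimal sketch of that contract, assuming hypothetical helper names (`pending_rows`, `mark_result`) standing in for the workflow's load and update nodes:

```python
def pending_rows(rows):
    """Select only ideas marked 'pending', mirroring the load step."""
    return [r for r in rows if r.get("status") == "pending"]

def mark_result(row, video_url=None, error=None):
    """Write the processing outcome back, as the update node would."""
    if error:
        row["status"] = "failed"
        row["publish_result"] = error
    else:
        row["status"] = "completed"
        row["video_url"] = video_url
        row["publish_result"] = "published"
    return row
```

Keeping the status values consistent (as the best practices advise) matters precisely because this filter is the only thing preventing the same idea from being rendered twice.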
by Yusuke
🧠 Overview

Discover and analyze the most valuable community-built n8n workflows on GitHub. This automation searches public repositories, analyzes JSON workflows using AI, and saves a ranked report to Google Sheets — including summaries, use cases, difficulty, stars, node count, and repository links.

⚙️ How It Works

Search GitHub Code API — queries for extension:json n8n and splits results
Fetch & Parse — downloads each candidate file's raw JSON and safely parses it
Extract Metadata — detects AI-powered flows and collects key node information
AI Analysis — evaluates the top N workflows (description, use case, difficulty)
Merge Insights — combines AI analysis with GitHub data
Save to Google Sheets — appends or updates by workflow name

🧩 Setup Instructions (5–10 min)

Open the Config node and set:
search_query — e.g., "openai" extension:json n8n
max_results — number of results to fetch (1–100)
ai_analysis_top — number of workflows analyzed with AI
SPREADSHEET_ID, SHEET_NAME — Google Sheets target
Add a GitHub PAT via HTTP Header Credential: Authorization: Bearer <YOUR_TOKEN>
Connect an OpenAI Credential to the OpenAI Chat Model
Connect Google Sheets (OAuth2) to Save to Google Sheets
(Optional) Enable the Schedule Trigger to run weekly for automatic updates

> 💡 Tip: If you need to show literal brackets, use backticks like `<example>` (no HTML entities needed).

📚 Use Cases

1) Trend Tracking for AI Automations
Goal:** Identify the fastest-growing AI-powered n8n workflows on GitHub.
Output:** Sorted list by stars and AI detection, updated weekly.

2) Internal Workflow Benchmarking
Goal:** Compare your organization's workflows against top public examples.
Output:** Difficulty, node count, and AI usage metrics in Google Sheets.

3) Market Research for Automation Agencies
Goal:** Discover trending integrations and tool combinations (e.g., OpenAI + Slack).
Output:** Data-driven insights for client projects and content planning.
🧪 Notes & Best Practices

🔐 No hardcoded secrets — use n8n Credentials
🧱 Works with self-hosted or cloud n8n
🧪 Start small (max_results = 10) before scaling
🧭 Use "AI Powered" + "Stars" columns in Sheets to identify top templates
🧩 Uses only Markdown sticky notes — no HTML formatting required

🔗 Resources

GitHub (template JSON):** github-workflow-finder-ai.json
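The Config values above map onto a single GitHub Code Search API call. A sketch of how the request could be assembled — `build_search_request` is a hypothetical helper, but the endpoint, `q`/`per_page` parameters, and headers follow GitHub's documented REST code-search API, which requires authentication:

```python
from urllib.parse import urlencode

def build_search_request(search_query, max_results, token):
    """Build the GitHub Code Search request for n8n workflow JSON files."""
    # The API caps per_page at 100, matching the template's 1-100 range
    params = urlencode({"q": search_query, "per_page": min(max_results, 100)})
    return {
        "url": f"https://api.github.com/search/code?{params}",
        "headers": {
            "Authorization": f"Bearer {token}",  # GitHub PAT, as in the HTTP Header Credential
            "Accept": "application/vnd.github+json",
        },
    }
```

In n8n the HTTP Request node performs this call; the sketch just shows what the node's URL and headers resolve to for a query like `"openai" extension:json n8n`.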
by Joe V
🔧 Setup Guide - Hiring Bot Workflow

📋 Prerequisites

Before importing this workflow, make sure you have:

✅ n8n Instance (cloud or self-hosted)
✅ Telegram Bot Token (from @BotFather)
✅ OpenAI API Key (with GPT-4 Vision access)
✅ Gmail Account (with OAuth setup)
✅ Google Drive (to store your resume)
✅ Redis Instance (free tier available at Redis Cloud)

🚀 Step-by-Step Setup

1️⃣ Upload Your Resume to OpenAI

First, you need to upload your resume to OpenAI's Files API:

```bash
# Upload your resume to OpenAI
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -F purpose="assistants" \
  -F file="@/path/to/your/resume.pdf"
```

Important: Save the file_id from the response (it looks like file-xxxxxxxxxxxxx).

Alternative: Use the OpenAI Playground or Python:

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")
with open("resume.pdf", "rb") as file:
    response = client.files.create(file=file, purpose="assistants")
print(f"File ID: {response.id}")
```

2️⃣ Upload Your Resume to Google Drive

Go to Google Drive
Upload your resume PDF
Right-click → "Get link" → copy the file ID from the URL
URL format: https://drive.google.com/file/d/FILE_ID_HERE/view
Example ID: 1h79U8IFtI2dp_OBtnyhdGaarWpKb9qq9

3️⃣ Create a Telegram Bot

Open Telegram and message @BotFather
Send /newbot
Choose a name and username
Save the Bot Token (it looks like 123456789:ABCdefGHIjklMNOpqrsTUVwxyz)
(Optional) Set bot commands:
/start - Start the bot
/help - Get help

4️⃣ Set Up Redis

Option A: Redis Cloud (Recommended - Free)

Go to Redis Cloud
Create a free account
Create a database
Note the Host, Port, and Password

Option B: Local Redis

```bash
# Docker
docker run -d -p 6379:6379 redis:latest

# Or via package manager
sudo apt-get install redis-server
```

5️⃣ Import the Workflow to n8n

Open n8n
Click "+" → "Import from File"
Select Hiring_Bot_Anonymized.json
The workflow will import with placeholder values

6️⃣ Configure Credentials

A. Telegram Bot Credentials
In n8n, go to Credentials → Create New
Select "Telegram API"
Enter your Bot Token from Step 3
Test & Save

B. OpenAI API Credentials
Go to Credentials → Create New
Select "OpenAI API"
Enter your OpenAI API Key
Test & Save

C. Redis Credentials
Go to Credentials → Create New
Select "Redis"
Enter: Host (your Redis host), Port (6379 by default), Password (your Redis password)
Test & Save

D. Gmail Credentials
Go to Credentials → Create New
Select "Gmail OAuth2 API"
Follow the OAuth setup flow
Authorize n8n to access Gmail
Test & Save

E. Google Drive Credentials
Go to Credentials → Create New
Select "Google Drive OAuth2 API"
Follow the OAuth setup flow
Authorize n8n to access Drive
Test & Save

7️⃣ Update Node Values

A. Update the OpenAI File ID in the "PayloadForReply" Node

Double-click the "PayloadForReply" node
Find this line in the code:

```javascript
const resumeFileId = "YOUR_OPENAI_FILE_ID_HERE";
```

Replace it with your actual OpenAI file ID from Step 1:

```javascript
const resumeFileId = "file-xxxxxxxxxxxxx";
```

Save the node

B. Update the Google Drive File ID (Both "Download Resume" Nodes)

There are TWO nodes that need updating:

Node 1: "Download Resume"
Double-click the node
In the "File ID" field, click "Expression"
Replace YOUR_GOOGLE_DRIVE_FILE_ID with your actual ID
Update "Cached Result Name" to your resume filename
Save

Node 2: "Download Resume1" (same process)
Double-click the node
Update the File ID
Update the filename
Save

8️⃣ Assign Credentials to Nodes

After importing, you need to assign your credentials to each node.

Nodes that need credentials:

| Node Name | Credential Type |
|-----------|-----------------|
| Telegram Trigger | Telegram API |
| Generating Reply | OpenAI API |
| Store AI Reply | Redis |
| GetValues | Redis |
| Download Resume | Google Drive OAuth2 |
| Download Resume1 | Google Drive OAuth2 |
| Schedule Email | Gmail OAuth2 |
| SendConfirmation | Telegram API |
| Send a message | Telegram API |
| Edit a text message | Telegram API |
| Send a text message | Telegram API |
| Send a chat action | Telegram API |

How to assign:

Click on each node
In the "Credentials" section, select your saved credential
Save the node

🧪 Testing the Workflow

1️⃣ Activate the Workflow

Click the "Active" toggle in the top-right
The workflow should now be listening for Telegram messages

2️⃣ Test with a Job Post

Find a job post online (LinkedIn, Indeed, etc.)
Take a screenshot
Send it to your Telegram bot
The bot should respond with:
"Analyzing job post..." message
(typing indicator)
Full email draft with confirmation button

3️⃣ Test Email Sending

Click the "Send The Email" button
Check Gmail to verify the email was sent
Check that the resume was attached

🐛 Troubleshooting

Issue: "No binary image found"
Solution:** Make sure you're sending an image file, not a document

Issue: "Invalid resume file_id"
Solution:** Check the OpenAI file_id format (starts with file-); verify the file was uploaded successfully; make sure you updated the code in the PayloadForReply node

Issue: "Failed to parse model JSON"
Solution:** Check OpenAI API quota/limits; verify the model name is correct (gpt-5.2); check that the image is readable

Issue: Gmail not sending
Solution:** Re-authenticate Gmail OAuth; check Gmail permissions; verify the "attachments" field is set to "Resume"

Issue: Redis connection failed
Solution:** Test the Redis connection in credentials; check firewall rules; verify host/port/password

Issue: Telegram webhook not working
Solution:** Deactivate and reactivate the workflow; check that the Telegram bot token is valid; make sure the bot is not blocked

🔐 Security Best Practices

Never share your credentials - keep API keys private
Use environment variables in n8n for sensitive data
Set up a Redis password - don't use default settings
Limit OAuth scopes - only grant necessary permissions
Rotate API keys regularly
Monitor usage - check for unexpected API calls

🎨 Customization Ideas

Change AI Model

In the PayloadForReply node, update:

```javascript
const MODEL = "gpt-5.2"; // Change to gpt-4, claude-3-opus, etc.
```

Adjust Email Length

Modify the system prompt:

```
// From:
Email body: ~120–180 words unless INSIGHTS specify otherwise.
// To:
Email body: ~100–150 words for concise applications.
```

Add More Languages

Update the language detection logic in the system prompt to support more languages.
Custom Job Filtering

Edit the system prompt to target specific roles:

```
// From:
Only pick ONE job offer to process — the one most clearly related to Data roles
// To:
Only pick ONE job offer to process — the one most clearly related to [YOUR FIELD]
```

Add Follow-up Reminders

Add a "Wait" node after the email sends to schedule a reminder after 7 days.

📊 Workflow Structure

```
Telegram Input
      ↓
Switch (Route by type)
      ↓
├─ New Job Post
│     ↓
│  Send Chat Action (typing...)
│     ↓
│  PayloadForReply (Build AI request)
│     ↓
│  Generating Reply (Call OpenAI)
│     ↓
│  FormatAiReply (Parse JSON)
│     ↓
│  Store AI Reply (Redis cache)
│     ↓
│  SendConfirmation (Show preview)
│
└─ Callback (User clicked "Send")
      ↓
   GetValues (Fetch from Redis)
      ↓
   Format Response
      ↓
   Download Resume (from Drive)
      ↓
   ├─ Path A: Immediate Send
   │     ↓
   │  Send Confirmation Message
   │     ↓
   │  Edit Message (update status)
   │
   └─ Path B: Scheduled Send
         ↓
      Wait (10 seconds)
         ↓
      Download Resume Again
         ↓
      Schedule Email (Gmail)
         ↓
      Send Success Message
```

💡 Tips for Best Results

High-Quality Resume: Upload a well-formatted PDF resume
Clear Screenshots: Take clear, readable job post screenshots
Use Captions: Add instructions via Telegram captions
  Example: "make it more casual"
  Example: "send to recruiter@company.com"
Review Before Sending: Always read the draft before clicking send
Update Resume Regularly: Keep your Google Drive resume current
Test First: Try with a few test jobs before mass applying

🆘 Need Help?
📚 n8n Documentation
💬 n8n Community Forum
📺 n8n YouTube Channel
🤖 OpenAI Documentation
📱 Telegram Bot API Docs

📝 Checklist

Use this checklist to verify your setup:

[ ] OpenAI resume file uploaded (got file_id)
[ ] Google Drive resume uploaded (got file ID)
[ ] Telegram bot created (got bot token)
[ ] Redis instance created (got credentials)
[ ] All n8n credentials created and tested
[ ] PayloadForReply node updated with OpenAI file_id
[ ] Both Download Resume nodes updated with Drive file_id
[ ] All nodes have credentials assigned
[ ] Workflow activated
[ ] Test message sent successfully
[ ] Test email received successfully

🎉 You're all set! Start applying to jobs in 10 seconds!

Made with ❤️ and n8n