by Wolfgang Renner
This template is a fully automated AI invoice processing workflow for n8n. It watches a Google Drive folder for new invoice PDFs, extracts all key information using an AI Agent, assigns the correct booking account, saves the renamed invoice in the right Drive folder, and updates your Google Sheets booking list. A perfect starter template if you want to build your own AI-powered accounting automation.

**What this workflow does**
- Monitors a Google Drive folder for new invoice PDFs.
- Downloads and extracts invoice text from the uploaded PDF.
- Uses an AI Agent (OpenAI + Structured Output Parser) to extract:
  - invoice date
  - vendor
  - currency
  - total amount
  - invoice number
  - booking text
  - booking account
  - matching Google Drive folder ID
- Automatically renames the PDF to a clean, consistent format (e.g. 250912 Vendor.pdf) – see the sketch below.
- Saves the invoice into the correct accounting folder in Google Drive.
- Updates your booking list in Google Sheets with all extracted fields.
- Moves the processed invoice to an output folder to avoid duplicates.

Everything runs hands-free after setup.

**Key features**
- 🧠 AI Invoice Reading using OpenAI + LangChain
- 📑 Structured Output Parser guarantees clean, validated fields
- 📁 Automated Google Drive File Routing
- 📊 Google Sheets logging for accounting records
- 🔄 File movement logic to keep input/output folders organized
- ⚙️ Chart of Accounts integration from your Google Sheet
- 🟦 Works out of the box with Invoice Agent – Folder Structure Setup (recommended)

**Typical use cases**
- Automated accounting workflows
- Pre-processing invoices before importing into ERP or sevDesk
- AI-powered invoice extraction for small businesses or freelancers
- Structured archiving of invoices for tax and audit requirements
- Fully automated Google Drive invoice inbox

**How to use this template**
1. Connect your Google Drive & Google Sheets credentials in all relevant nodes.
2. Select your:
   - Input folder (where invoices are uploaded)
   - Output folder (where processed invoices go)
   - Folder structure sheet + booking accounts sheet
3. Upload any invoice PDF into the input folder. The workflow starts automatically and processes the invoice end-to-end.

**Requirements**
- Google Drive OAuth2
- Google Sheets OAuth2
- OpenAI API key
- A Google Sheet containing your chart of accounts
- A prepared folder structure (use the "Google Drive Structure Setup" template)
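As a minimal sketch of the renaming step, the snippet below builds the `YYMMDD Vendor.pdf` filename from the AI Agent's structured output inside an n8n Code node. The field names (`invoice_date`, `vendor`) are assumptions for illustration and should be matched to your parser's output.

```javascript
// Illustrative n8n Code node: build the "YYMMDD Vendor.pdf" filename
// from the AI Agent's structured output. Field names are assumptions.
const { invoice_date, vendor } = $json; // e.g. "2025-09-12", "Vendor GmbH"

const d = new Date(invoice_date);
const yymmdd = [
  String(d.getFullYear()).slice(-2),
  String(d.getMonth() + 1).padStart(2, '0'),
  String(d.getDate()).padStart(2, '0'),
].join('');

// Strip characters that are awkward in Drive file names
const safeVendor = vendor.replace(/[\\/:*?"<>|]/g, '').trim();

return [{ json: { ...$json, fileName: `${yymmdd} ${safeVendor}.pdf` } }];
```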
by Nguyen Thieu Toan
**🤖 Facebook Messenger Smart Chatbot – Batch, Format & Notify with n8n Data Table** by Nguyen Thieu Toan

**🌟 What Is This Workflow?**
This is a smart chatbot solution built with n8n, designed to integrate seamlessly with Facebook Messenger. It batches incoming messages, formats them for clarity, tracks conversation history, and sends natural replies using AI. Perfect for businesses, customer support, or personal AI agents.

**⚙️ Key Features**
- 🔄 Smart batching: Groups consecutive user messages to process them in one go, avoiding fragmented replies.
- 🧠 Context formatting: Automatically formats messages to fit Messenger's structure and length limits.
- 📋 Conversation history tracking: Stores and retrieves chat logs between user and bot using n8n Data Table.
- 👀 Seen & Typing effects: Adds human-like responsiveness with Messenger's sender actions.
- 🧩 AI Agent integration: Easily connects to GPT, Gemini, or any LLM for natural replies, scheduling, or business logic.

**🚀 How It Works**
1. Connects to your Facebook Page via webhook to receive and send messages.
2. Stores incoming messages in a Data Table called `Batch_messages`, including fields like `user_text`, `bot_rep`, `processed`, etc.
3. Collects unprocessed messages, sorts them by `id`, and creates a `merged_message` and full history (see the sketch below).
4. Sends the history to an AI Agent for contextual response generation.
5. Sends the AI reply back to Messenger with Seen/Typing effects.
6. Updates the message status to `processed = true` to prevent duplicate handling.

**🛠️ Setup Guide**
1. Create a Facebook App and Messenger webhook, and link it to your Page.
2. Set up the `Batch_messages` Data Table in n8n with the required columns.
3. Import the workflow or build the nodes manually using the tutorial.
4. Configure your API tokens, webhook URLs, and AI Agent endpoint.
5. Deploy the workflow on a public n8n server.

📘 Full tutorial available at: 👉 Smart Chatbot Workflow Guide by Nguyen Thieu Toan

**💡 Pro Tips**
- Customize the AI prompt and persona to match your business tone.
- Add scheduling, lead capture, or CRM integration using n8n's flexible nodes.
- Monitor your Data Table regularly to ensure clean message flow and batching.

**👤 About the Creator**
Nguyen Thieu Toan (Nguyễn Thiệu Toàn / Jay Nguyen) is an expert in AI automation, business optimization, and chatbot development. With a background in marketing and deep knowledge of n8n workflows, Jay helps businesses harness AI to save time, boost performance, and deliver smarter customer experiences.
Website: https://nguyenthieutoan.com
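The following is a minimal sketch of the batching step described under "How It Works": merging unprocessed `Batch_messages` rows into one `merged_message` for the AI Agent. It assumes the previous node returned the user's Data Table rows as items with the fields named in the description; treat the exact retrieval and update calls as out of scope here.

```javascript
// Illustrative n8n Code node: merge unprocessed rows from the
// Batch_messages Data Table into one prompt-ready message.
const rows = $input.all().map(i => i.json);

// Only rows not yet answered, in arrival order
const pending = rows
  .filter(r => r.processed === false || r.processed === 'false')
  .sort((a, b) => Number(a.id) - Number(b.id));

const merged_message = pending.map(r => r.user_text).join('\n');

// Compact history of earlier exchanges for the AI Agent
const history = rows
  .filter(r => r.bot_rep)
  .map(r => `User: ${r.user_text}\nBot: ${r.bot_rep}`)
  .join('\n');

return [{ json: { merged_message, history, pending_ids: pending.map(r => r.id) } }];
```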
by Vitorio Magalhães
**🎯 What this workflow does**
This workflow automatically monitors Reddit subreddits for new image posts and downloads them to Google Drive. It's perfect for content creators, meme collectors, or anyone who wants to automatically archive images from their favorite subreddits without manual work.

The workflow intelligently prevents duplicate downloads by checking existing files in Google Drive and sends you Telegram notifications about the download status, so you always know when new content has been saved.

**🚀 Key Features**
- **Multi-subreddit monitoring**: Configure multiple subreddits to monitor simultaneously
- **Smart duplicate detection**: Never downloads the same image twice
- **Automated scheduling**: Runs on a customizable cron schedule
- **Real-time notifications**: Get instant Telegram updates about download activity
- **Rate limit friendly**: Built-in delays to respect Reddit's API limits
- **Cloud storage integration**: Direct upload to organized Google Drive folders

**📋 Prerequisites**
Before using this workflow, you'll need:
- **Reddit Developer Account**: Create an app at reddit.com/prefs/apps
- **Google Cloud Project**: With Drive API enabled and OAuth2 credentials
- **Telegram Bot**: Created via @BotFather with your chat ID
- **Basic n8n knowledge**: Understanding of credentials and node configuration

**⚙️ Setup Instructions**

**1. Configure Reddit API Access**
- Visit reddit.com/prefs/apps and create a new "script" type application
- Note your Client ID and Client Secret
- Add Reddit OAuth2 credentials in n8n

**2. Set up Google Drive Integration**
- Enable Google Drive API in Google Cloud Console
- Create OAuth2 credentials with appropriate scopes
- Configure Google Drive OAuth2 credentials in n8n
- Update the folder ID in the workflow to your desired destination

**3. Configure Telegram Notifications**
- Create a bot via @BotFather on Telegram
- Get your chat ID (message @userinfobot)
- Add Telegram API credentials in n8n
**4. Customize Your Settings**
Update the Settings node with:
- Your Telegram chat ID
- List of subreddits to monitor (e.g., ['memes', 'funny', 'pics'])
- Optional: Adjust wait time between requests
- Optional: Modify the cron schedule

**🔄 How it works**
1. **Scheduled Trigger**: The workflow starts automatically based on your cron configuration
2. **Random Selection**: Picks a random subreddit from your configured list
3. **Fetch Posts**: Retrieves the latest 30 posts from the subreddit's "new" section
4. **Image Filtering**: Keeps only posts with i.redd.it image URLs (see the sketch below)
5. **Duplicate Check**: Searches Google Drive to avoid re-downloading existing images
6. **Download & Upload**: Downloads new images and uploads them to your Drive folder
7. **Notification**: Sends a Telegram message with the download summary

**🛠️ Customization Options**

**Scheduling**
- Modify the cron trigger to run hourly, daily, or at custom intervals
- Add timezone considerations for your location

**Content Filtering**
- Add upvote threshold filters to get only popular content
- Filter by image dimensions or file size
- Implement NSFW content filtering

**Storage & Organization**
- Create subfolders by subreddit
- Add date-based folder organization
- Implement file naming conventions

**Notifications & Monitoring**
- Add Discord webhook notifications
- Create download statistics tracking
- Log failed downloads for debugging

**📊 Use Cases**
- **Content Creators**: Automatically collect memes and trending images for social media
- **Digital Marketers**: Monitor visual trends across different communities
- **Researchers**: Archive visual content from specific subreddits for analysis
- **Personal Use**: Build a curated collection of images from your favorite subreddits

**🎯 Best Practices**
- **Respect Rate Limits**: Keep the wait time between requests to avoid being blocked
- **Monitor Storage**: Regularly check Google Drive storage usage
- **Subreddit Selection**: Choose active subreddits with regular image posts
- **Credential Security**: Use n8n's credential system and never hardcode API keys

**🚨 Important Notes**
- This workflow only downloads images from i.redd.it (Reddit's image host)
- Some subreddits may have bot restrictions
- Reddit's API has rate limits (~60 requests per minute)
- Ensure your Google Drive has sufficient storage space
- Always comply with Reddit's Terms of Service and content policies
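Here is a minimal sketch of the image-filtering step, assuming the "Fetch Posts" step returns Reddit listing items with `url`, `title`, and `id` fields (standard Reddit API field names, but treat them as assumptions for your setup):

```javascript
// Illustrative n8n Code node: keep only direct i.redd.it image posts
// returned by the "Fetch Posts" step.
const posts = $input.all().map(i => i.json);

const images = posts.filter(p => {
  const url = p.url || p.url_overridden_by_dest || '';
  return url.startsWith('https://i.redd.it/');
});

return images.map(p => ({
  json: { title: p.title, imageUrl: p.url || p.url_overridden_by_dest, postId: p.id },
}));
```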
by Daniel Shashko
This workflow automates the creation of user-generated-content-style product videos by combining Gemini's image generation with OpenAI's SORA 2 video generation. It accepts webhook requests with product descriptions, generates images and videos, stores them in Google Drive, and logs all outputs to Google Sheets for easy tracking.

**Main Use Cases**
- Automate product video creation for e-commerce catalogs and social media.
- Generate UGC-style content at scale without manual design work.
- Create engaging video content from simple text prompts for marketing campaigns.
- Build a centralized library of product videos with automated tracking and storage.

**How it works**
The workflow operates as a webhook-triggered process, organized into these stages:

1. **Webhook Trigger & Input**
   - Accepts POST requests to the /create-ugc-video endpoint.
   - Required payload includes: product prompt, video prompt, Gemini API key, and OpenAI API key (see the example request below).
2. **Image Generation (Gemini)**
   - Sends the product prompt to Google's Gemini 2.5 Flash Image model.
   - Generates a product image based on the description provided.
3. **Data Extraction**
   - A Code node extracts the base64 image data from Gemini's response.
   - Preserves all prompts and API keys for subsequent steps.
4. **Video Generation (SORA 2)**
   - Sends the video prompt to OpenAI's SORA 2 API.
   - Initiates video generation with specifications: 720x1280 resolution, 8 seconds duration.
   - Returns a video generation job ID for polling.
5. **Video Status Polling**
   - Continuously checks video generation status via the OpenAI API.
   - If the status is "completed": proceeds to download.
   - If the status is still processing: waits 1 minute and retries (polling loop).
6. **Video Download & Storage**
   - Downloads the completed video file from OpenAI.
   - Uploads the MP4 file to Google Drive (root folder).
   - Generates a shareable Google Drive link.
7. **Logging to Google Sheets**
   - Records all generation details in a tracking spreadsheet: product description, video URL (Google Drive link), generation status, and timestamp.

**Summary Flow**
Webhook Request → Generate Product Image (Gemini) → Extract Image Data → Generate Video (SORA 2) → Poll Status → If Complete: Download Video → Upload to Google Drive → Log to Google Sheets → Return Response
If Not Complete: Wait 1 Minute → Poll Status Again

**Benefits**
- Fully automated video creation pipeline from text to finished product.
- Scalable solution for generating multiple product videos on demand.
- Combines cutting-edge AI models (Gemini + SORA 2) for high-quality output.
- Centralized storage in Google Drive with automatic logging in Google Sheets.
- Flexible webhook interface allows integration with any application or service.
- Retry mechanism ensures videos are captured even with longer processing times.

Created by Daniel Shashko
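Below is an example of calling the webhook from a Node 18+ (ESM) script. The `/create-ugc-video` path comes from the description above; the JSON field names and the n8n base URL are assumptions and should be adjusted to match the webhook node's expected payload.

```javascript
// Illustrative client request to the /create-ugc-video webhook.
// Field names are assumptions; match them to your webhook node.
const response = await fetch('https://your-n8n-instance.example.com/webhook/create-ugc-video', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    productPrompt: 'A matte-black insulated water bottle on a hiking trail at sunrise',
    videoPrompt: 'Handheld UGC-style clip of someone unboxing and using the bottle outdoors',
    geminiApiKey: 'YOUR_GEMINI_API_KEY',
    openaiApiKey: 'YOUR_OPENAI_API_KEY',
  }),
});
console.log(await response.json());
```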
by Khair Ahammed
Meet Troy, your intelligent personal assistant that seamlessly manages your Google Calendar and Tasks through Telegram. This workflow combines AI-powered natural language processing with MCP (Model Context Protocol) integration to provide a conversational interface for scheduling meetings, managing tasks, and organizing your digital life.

**Key Features**

**📅 Smart Calendar Management**
- Create single and recurring events with conflict detection
- Support for multiple attendees (1-2 attendee variants)
- Automatic time zone handling (Bangladesh Standard Time)
- Weekly recurring event scheduling
- Event retrieval, updates, and deletion

**✅ Task Management**
- Create, update, and delete tasks in Google Tasks
- Mark tasks as completed
- Retrieve task lists with completion status
- Task repositioning and organization
- Parent-child task relationships

**🤖 Intelligent Processing**
- Natural language understanding for scheduling requests
- Automatic conflict detection before event creation
- Context-aware responses with conversation memory
- Error handling with fallback messages

**📱 Telegram Interface**
- Real-time chat interaction
- Simple commands and natural language
- Instant confirmations and updates
- Error notifications

**Workflow Components**

**Core Architecture:**
- Telegram Trigger for user messages
- AI Agent with GPT-4o-mini processing
- MCP Client Tools for Google services
- Conversation memory for context
- Error handling with backup responses

**MCP Integrations:**
- Google Calendar MCP Server (6 specialized tools)
- Google Tasks MCP Server (5 task operations)
- Custom HTTP tool for advanced task positioning

**Use Cases**

**Calendar Scenarios:**
- "Schedule a meeting tomorrow at 3 PM with john@example.com"
- "Set up weekly team standup every Monday at 10 AM"
- "Check my calendar for conflicts this afternoon"
- "Delete the meeting with ID xyz123"

**Task Management:**
- "Add a task to buy groceries"
- "Mark the project report task as completed"
- "Update my presentation task due date to Friday"
- "Show me all pending tasks"

**Setup Requirements**

**Required Credentials:**
- Google Calendar OAuth2
- Google Tasks OAuth2
- OpenAI API key
- Telegram Bot token

**MCP Configuration:**
- Two MCP server endpoints for Google services
- Proper webhook configurations
- SSL-enabled n8n instance for MCP triggers

**Business Benefits**
- **Productivity:** Voice-to-action task and calendar management
- **Efficiency:** Eliminate app switching with a chat interface
- **Intelligence:** AI prevents scheduling conflicts automatically
- **Accessibility:** Simple Telegram commands for complex operations

**Technical Specifications**
- **Components:** 1 Telegram trigger, 1 AI Agent with memory, 2 MCP triggers (Calendar & Tasks), 13 Google service tools, error handling flows
- **Response Time:** Sub-second for most operations
- **Memory:** Session-based conversation context
- **Timezone:** Automatic Bangladesh Standard Time conversion

This personal assistant transforms how you interact with Google services, making scheduling and task management as simple as sending a text message to Troy on Telegram.

Tags: personal-assistant, mcp-integration, google-calendar, google-tasks, telegram-bot, ai-agent, productivity
by Jainik Sheth
**What is this?**
This RAG workflow allows you to build a smart chat assistant that can answer user questions based on any collection of documents you provide. It automatically imports and processes files from Google Drive, stores their content in a searchable vector database, and retrieves the most relevant information to generate accurate, context-driven responses. The workflow manages chat sessions and keeps the document database current, making it adaptable for use cases like customer support, internal knowledge bases, or an HR assistant.

**How it works**

**1. Chat RAG Agent**
- Uses OpenAI for responses, referencing only data from the vector store (the data uploaded to the Google Drive folder).
- Maintains chat history in Postgres using a session key from the chat input.

**2. Data Pipeline (File Ingestion)**
- Monitors Google Drive for new/updated files and automatically updates them in the vector store.
- Downloads, extracts, and processes file content (PDFs, Google Docs).
- Generates embeddings and stores them in the Supabase vector store for retrieval.

**3. Vector Store Cleanup**
- Scheduled and manual routines remove duplicate or outdated entries from the Supabase vector store.
- Ensures only the latest and unique documents are available for retrieval.

**4. File Management**
- Handles folder and file creation, upload, and metadata assignment in Google Drive.
- Ensures files are organized and linked with their corresponding vector store entries.

**Getting Started**
1. Create and connect all relevant credentials: Google Drive, Postgres, Supabase, OpenAI.
2. Run the table creation nodes first to set up your database tables in Postgres.
3. Upload your documents through Google Drive (or swap in a different file storage solution).
4. The agent will process them automatically (chunking text, storing tabular data in Postgres).
5. Start asking questions that leverage the agent's multiple reasoning approaches.

**Customization (optional)**
This template provides a solid foundation that you can extend by:
- Tuning the system prompt for your specific use case
- Adding document metadata like summaries
- Implementing more advanced RAG techniques
- Optimizing for larger knowledge bases

Note: if you're using different nodes (e.g. for file storage or the vector store), the integration may vary a little.

**Prerequisites**
- Google account (Google Drive)
- Supabase account
- OpenAI API access
- Postgres account
by Omer Fayyaz
An automated PDF download and management system that collects PDFs from URLs, uploads them to Google Drive, extracts metadata, and maintains a searchable library with comprehensive error handling and status tracking.

**What Makes This Different:**
- **Intelligent URL Validation** - Validates PDF URLs before attempting download, extracting filenames from URLs and generating fallback names when needed, preventing wasted processing time
- **Binary File Handling** - Properly handles PDF downloads as binary files with appropriate headers (User-Agent, Accept), ensuring compatibility with various PDF hosting services
- **Comprehensive Error Handling** - Three-tier error handling: invalid URLs are marked immediately, failed downloads are logged with error messages, and all errors are tracked in a dedicated Error Log sheet
- **Metadata Extraction** - Automatically extracts file ID, size, MIME type, Drive view links, and download URLs from Google Drive responses, creating a complete file record
- **Multiple Trigger Options** - Supports manual execution, scheduled runs (every 12 hours), and workflow-to-workflow calls, making it flexible for different automation scenarios
- **Status Tracking** - Updates the source spreadsheet with processing status (Downloaded, Failed, Invalid), enabling easy monitoring and retry logic for failed downloads

**Key Benefits of Automated PDF Management:**
- **Centralized Storage** - All PDFs are automatically organized in a Google Drive folder, making them easy to find and share across your organization
- **Searchable Library** - Metadata is stored in Google Sheets with file links, titles, sources, and download dates, enabling quick searches and filtering
- **Error Recovery** - Failed downloads are logged with error messages, allowing you to identify and fix issues (broken links, access permissions, etc.) and retry
- **Automated Processing** - Schedule-based execution keeps your PDF library updated without manual intervention, perfect for monitoring research sources
- **Integration Ready** - Can be called by other workflows, enabling complex automation chains (e.g., scrape URLs → download PDFs → process content)
- **Bulk Processing** - Processes multiple PDFs in sequence from a spreadsheet, handling large batches efficiently with proper error isolation

**Who's it for**
This template is designed for researchers, academic institutions, market research teams, legal professionals, compliance officers, and anyone who needs to systematically collect and organize PDF documents from multiple sources. It's perfect for organizations that need to build research libraries, archive regulatory documents, collect industry reports, maintain compliance documentation, or aggregate academic papers without manually downloading and organizing each file.

**How it works / What it does**
This workflow creates a PDF collection and management system that reads PDF URLs from Google Sheets, downloads the files, uploads them to Google Drive, extracts metadata, and maintains a searchable library.
The system:
1. **Reads Pending PDF URLs** - Fetches PDF URLs from the Google Sheets "PDF URLs" sheet, processing entries that need to be downloaded
2. **Loops Through PDFs** - Processes PDFs one at a time using Split in Batches, ensuring proper error isolation and preventing batch failures
3. **Prepares Download Info** - Extracts the filename from the URL, decodes URL-encoded characters, validates the PDF URL format, and generates fallback filenames with timestamps if needed (see the sketch after the Requirements section)
4. **Validates URL** - Checks if the URL is valid before attempting download, skipping invalid entries immediately
5. **Downloads PDF** - Makes an HTTP request with proper browser headers, downloads the PDF as a binary file with a 60-second timeout, and handles download errors gracefully
6. **Verifies Download** - Checks if binary data was successfully received, routing to error handling if the download failed
7. **Uploads to Google Drive** - Uploads the PDF file to the specified Google Drive folder, preserving the original filename or using the generated name
8. **Extracts File Metadata** - Extracts file ID, name, MIME type, file size, Drive view link, and download link from the Google Drive API response
9. **Saves to PDF Library** - Appends file metadata to the Google Sheets "PDF Library" sheet with title, source, file links, and download timestamp
10. **Updates Source Status** - Marks processed URLs as "Downloaded", "Failed", or "Invalid" in the source sheet for tracking
11. **Logs Errors** - Records failed downloads and invalid URLs in the "Error Log" sheet with error messages for troubleshooting
12. **Tracks Completion** - Generates a completion summary with processing statistics and timestamp

**Key Innovation: Error-Resilient Processing** - Unlike simple download scripts that fail on the first error, this workflow isolates failures, continues processing remaining PDFs, and provides detailed error logging. This ensures maximum success rate and makes troubleshooting straightforward.

**How to set up**

**1. Prepare Google Sheets**
- Create a Google Sheet with three tabs: "PDF URLs", "PDF Library", and "Error Log"
- In the "PDF URLs" sheet, create columns: PDF_URL (or pdf_url), Title (optional), Source (optional), Status (optional - will be updated by the workflow)
- Add sample PDF URLs in the PDF_URL column (e.g., direct links to PDF files)
- The "PDF Library" sheet will be automatically populated with columns: pdfUrl, title, source, fileName, fileId, mimeType, fileSize, driveUrl, downloadUrl, downloadedAt, status
- The "Error Log" sheet will record: status, errorMessage, pdfUrl, title (for failed downloads)
- Verify your Google Sheets credentials are set up in n8n (OAuth2 recommended)

**2. Configure Google Sheets Nodes**
- Open the "Read Pending PDF URLs" node and select your spreadsheet from the document dropdown
- Set the sheet name to "PDF URLs"
- Configure the "Save to PDF Library" node: select the same spreadsheet, set the sheet name to "PDF Library", operation should be "Append or Update"
- Configure the "Update Source Status" node: same spreadsheet, "PDF URLs" sheet, operation "Update"
- Configure the "Log Error" node: same spreadsheet, "Error Log" sheet, operation "Append or Update"
- Test the connection by running the "Read Pending PDF URLs" node manually to verify it can access your sheet
**3. Set Up Google Drive Folder**
- Create a folder in Google Drive where you want PDFs stored (e.g., "PDF Reports" or "Research Library")
- Open the "Upload to Google Drive" node
- Select your Google Drive account (OAuth2 credentials)
- Choose the drive (usually "My Drive")
- Select the folder you created from the folder dropdown
- The filename will be automatically extracted from the URL or generated with a timestamp
- Verify folder permissions allow the service account to upload files
- Test by manually uploading a file to ensure access works

**4. Configure Download Settings**
- The "Download PDF" node is pre-configured with appropriate headers and a 60-second timeout
- If you encounter timeout issues with large PDFs, increase the timeout in the node options
- The User-Agent header is set to mimic a browser to avoid blocking
- The Accept header is set to application/pdf,application/octet-stream,*/* for maximum compatibility
- For sites requiring authentication, you may need to add additional headers or use cookies
- Test with a sample PDF URL to verify the download works correctly

**5. Set Up Scheduling & Test**
- The workflow includes a Manual Trigger (for testing), a Schedule Trigger (runs every 12 hours), and an Execute Workflow Trigger (for calling from other workflows)
- To customize the schedule: open the "Schedule (Every 12 Hours)" node and adjust the interval (e.g., daily, weekly)
- For initial testing: use the Manual Trigger and add 2-3 test PDF URLs to your "PDF URLs" sheet
- Verify execution: check that PDFs are downloaded, uploaded to Drive, and metadata saved to "PDF Library"
- Monitor execution logs: check for any download failures, timeout issues, or Drive upload errors
- Review the Error Log sheet: verify failed downloads are properly logged with error messages
- Common issues: invalid URLs (check URL format), access denied (check file permissions), timeout (increase timeout for large files), Drive quota (check Google Drive storage)

**Requirements**
- **Google Sheets Account** - Active Google account with OAuth2 credentials configured in n8n for reading and writing spreadsheet data
- **Google Drive Account** - Same Google account with OAuth2 credentials and sufficient storage space for PDF files
- **Source Spreadsheet** - Google Sheet with "PDF URLs", "PDF Library", and "Error Log" tabs, properly formatted with required columns
- **Valid PDF URLs** - Direct links to PDF files (not HTML pages that link to PDFs) - URLs should end in .pdf or point directly to PDF content
- **n8n Instance** - Self-hosted or cloud n8n instance with access to external websites (the HTTP Request node needs internet connectivity to download PDFs)
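As referenced in step 3 of "The system", here is a minimal sketch of the "Prepares Download Info" logic written as an n8n Code node. Column names (`PDF_URL`/`pdf_url`) follow the sheet layout above; the fallback naming pattern is an assumption.

```javascript
// Illustrative n8n Code node for the "Prepares Download Info" step:
// extract a filename from the PDF URL, validate the URL, and fall back
// to a timestamped name when nothing usable can be derived.
const pdfUrl = ($json.PDF_URL || $json.pdf_url || '').trim();

let isValid = false;
let fileName = '';
try {
  const parsed = new URL(pdfUrl);
  isValid = ['http:', 'https:'].includes(parsed.protocol);
  const lastSegment = decodeURIComponent(parsed.pathname.split('/').pop() || '');
  fileName = lastSegment.toLowerCase().endsWith('.pdf') ? lastSegment : '';
} catch (e) {
  isValid = false; // not a parseable URL
}

// Fallback name with timestamp, e.g. "document-2024-03-21T09-00-00.pdf"
if (isValid && !fileName) {
  fileName = `document-${new Date().toISOString().replace(/[:.]/g, '-')}.pdf`;
}

return [{ json: { ...$json, pdfUrl, fileName, isValid } }];
```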
by Amine ARAGRAG
This n8n template automates the collection and enrichment of Product Hunt posts using AI and Google Sheets. It fetches new tools daily, translates content, categorizes them intelligently, and saves everything into a structured spreadsheet—ideal for building directories, research dashboards, newsletters, or competitive intelligence assets.

**Good to know**
- Sticky notes inside the workflow explain each functional block and required configurations.
- Uses cursor-based pagination to safely fetch Product Hunt data (see the sketch below).
- The AI agent handles translation, documentation generation, tech extraction, and function area classification.
- Category translations are synced with a Google Sheets dictionary to avoid duplicates.
- All enriched entries are stored in a clean "Tools" sheet for easy filtering or reporting.

**How it works**
1. A schedule trigger starts the workflow daily.
2. Product Hunt posts are retrieved via GraphQL and processed in batches.
3. A code node restructures each product into a consistent schema.
4. The workflow checks if a product already exists in Google Sheets.
5. For new items, the AI agent generates metadata, translations, and documentation.
6. Categories are matched or added to a Google Sheets dictionary.
7. The final enriched product entry is appended or updated in the spreadsheet.
8. Pagination continues until no next page remains.

**How to use**
- Connect Product Hunt OAuth2, Google Sheets, and OpenAI credentials.
- Adjust the schedule trigger to your preferred frequency.
- Optionally expand enrichment fields (tags, scoring, custom classifications).
- Replace the trigger with a webhook or manual trigger if needed.

**Requirements**
- Product Hunt OAuth2 credentials
- Google Sheets account
- OpenAI (or compatible) API access

**Customising this workflow**
- Add Slack or Discord notifications for new tools.
- Push enriched data to Airtable, Notion, or a database.
- Extend AI enrichment with summaries or SEO fields.
- Use the Google Sheet as a backend for dashboards or frontend applications.
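The sketch below illustrates the cursor-based pagination pattern mentioned above. The query shape follows Relay-style pagination (`pageInfo`/`endCursor`) as used by GraphQL APIs like Product Hunt's; the exact field names, endpoint, and token handling are assumptions and should be checked against the API schema you authenticate with.

```javascript
// Illustrative cursor-based pagination loop against a Product Hunt-style
// GraphQL API (Node 18+ or an n8n Code node with external requests allowed).
const query = `
  query($after: String) {
    posts(first: 20, after: $after) {
      pageInfo { hasNextPage endCursor }
      edges { node { id name tagline url } }
    }
  }`;

let cursor = null;
const allPosts = [];

do {
  const res = await fetch('https://api.producthunt.com/v2/api/graphql', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.PRODUCT_HUNT_TOKEN}`, // assumed env var
    },
    body: JSON.stringify({ query, variables: { after: cursor } }),
  });
  const { data } = await res.json();
  allPosts.push(...data.posts.edges.map(e => e.node));
  cursor = data.posts.pageInfo.hasNextPage ? data.posts.pageInfo.endCursor : null;
} while (cursor);

console.log(`Fetched ${allPosts.length} posts`);
```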
by Rahul Joshi
**Description**
Automatically extract a structured skill matrix from PDF resumes in a Google Drive folder and store the results in Google Sheets. Uses Azure OpenAI (GPT-4o-mini) to analyze predefined tech stacks and filters for relevant proficiency. Fast, consistent insights ready for review. 🔍📊

**What This Template Does**
- Fetches all resumes from a designated Google Drive folder ("Resume_store"). 🗂️
- Downloads each resume file securely via the Google Drive API. ⬇️
- Extracts text from PDF files for analysis. 📄➡️📝
- Analyzes skills with Azure OpenAI (GPT-4o-mini), rating 1–5 and estimating years of experience. 🤖
- Parses and filters to include only skills with proficiency > 2 (see the sketch below), then updates Google Sheets ("Resume store" → "Sheet2"). ✅

**Key Benefits**
- Saves hours on manual resume screening. ⏱️
- Produces a consistent, structured skill matrix. 📐
- Focuses on intermediate to expert skills for faster shortlisting. 🎯
- Centralizes candidate data in Google Sheets for easy sharing. 🗃️

**Features**
- Predefined tech stack focus: React, Node.js, Angular, Python, Java, SQL, Docker, Kubernetes, AWS, Azure, GCP, HTML, CSS, JavaScript. 🧰
- Proficiency scoring (1–5) and estimated years of experience. 📈
- PDF-to-text extraction for robust parsing. 🧾
- JSON parsing with error handling for invalid outputs. 🛡️
- Manual Trigger to run on demand. ▶️

**Requirements**
- n8n instance (cloud or self-hosted).
- Google Drive access with credentials to the "Resume_store" folder.
- Google Sheets access to the "Resume store" spreadsheet and "Sheet2" tab.
- Azure OpenAI with GPT-4o-mini deployed and connected via secure credentials.
- PDF text extraction enabled within n8n.

**Target Audience**
- HR and Talent Acquisition teams. 👥
- Recruiters and staffing agencies. 🧑‍💼
- Operations teams managing hiring pipelines. 🧭
- Tech hiring managers seeking consistent skill insights. 💡

**Step-by-Step Setup Instructions**
1. Place candidate resumes (PDF) into Google Drive → "Resume_store".
2. In n8n, add Google Drive and Google Sheets credentials and authorize access.
3. In n8n, add Azure OpenAI credentials (GPT-4o-mini deployment).
4. Import the workflow, assign credentials to each node, and confirm folder/sheet names.
5. Run the Manual Trigger to execute the flow and verify data in "Resume store" → "Sheet2".
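Here is a minimal sketch of the parse-and-filter step, assuming the model returns a JSON object with a `skills` array (`name`, `proficiency`, `years`). That output shape, and the input field names, are assumptions for illustration; adapt them to your prompt and node layout.

```javascript
// Illustrative n8n Code node: parse the model's JSON output and keep only
// skills rated above 2 before writing rows to Google Sheets.
const raw = $json.message?.content ?? $json.output ?? '';

let parsed;
try {
  parsed = JSON.parse(raw);
} catch (e) {
  // Invalid JSON from the model: surface a clear error row instead of failing
  return [{ json: { error: 'Model returned invalid JSON', raw } }];
}

const relevant = (parsed.skills || []).filter(s => Number(s.proficiency) > 2);

return relevant.map(s => ({
  json: {
    candidate: $json.fileName,
    skill: s.name,
    proficiency: s.proficiency,
    years: s.years,
  },
}));
```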
by vinci-king-01
**Software Vulnerability Patent Tracker**

⚠️ COMMUNITY TEMPLATE DISCLAIMER: This is a community-contributed template that uses ScrapeGraphAI (a community node). Please ensure you have the ScrapeGraphAI community node installed in your n8n instance before using this template.

This workflow automatically tracks newly-published patent filings that mention software-security vulnerabilities, buffer-overflow mitigation techniques, and related technology keywords. Every week it aggregates fresh patent data from USPTO and international patent databases, filters it by relevance, and delivers a concise JSON digest (and optional Intercom notification) to R&D teams and patent attorneys.

**Pre-conditions / Requirements**

Prerequisites:
- n8n instance (self-hosted or n8n cloud, v1.7.0+)
- ScrapeGraphAI community node installed
- Basic understanding of patent search syntax (for customizing keyword sets)
- Optional: Intercom account for in-app alerts

Required Credentials:

| Credential | Purpose |
|------------|---------|
| ScrapeGraphAI API Key | Enables ScrapeGraphAI nodes to fetch and parse patent-office webpages |
| Intercom Access Token (optional) | Sends weekly digests directly to an Intercom workspace |

Additional Setup Requirements:

| Setting | Recommended Value | Notes |
|---------|-------------------|-------|
| Cron schedule | `0 9 * * 1` | Triggers every Monday at 09:00 server time |
| Patent keyword matrix | See example CSV below | List of comma-separated keywords per tech focus |

Example keyword matrix (upload as keywords.csv or paste into the "Matrix" node):

```
topic,keywords
Buffer Overflow,"buffer overflow, stack smashing, stack buffer"
Memory Safety,"memory safety, safe memory allocation, pointer sanitization"
Code Injection,"SQL injection, command injection, injection prevention"
```

**How it works**

Key Steps:
- **Schedule Trigger**: Fires weekly based on the configured cron expression.
- **Matrix (Keyword Loader)**: Loads the CSV-based technology keyword matrix into memory.
- **Code (Build Search Queries)**: Dynamically assembles patent-search URLs for each keyword group.
- **ScrapeGraphAI (Fetch Results)**: Scrapes USPTO, EPO, and WIPO result pages and parses titles, abstracts, publication numbers, and dates.
- **If (Relevance Filter)**: Removes patents older than 1 year or without vulnerability-related terms in the abstract (a sketch of this filter follows at the end of this section).
- **Set (Normalize JSON)**: Formats the remaining records into a uniform JSON schema.
- **Intercom (Notify Team)**: Sends a summarized digest to your chosen Intercom workspace. (Skip or disable this node if you prefer to consume the raw JSON output instead.)
- **Sticky Notes**: Contain inline documentation and customization tips for future editors.

**Set up steps**

Setup Time: 10-15 minutes

1. **Install Community Node** - Navigate to "Settings → Community Nodes", search for ScrapeGraphAI, and click "Install".
2. **Create Credentials** - Go to "Credentials" → "New Credential" → select ScrapeGraphAI API → paste your API key. (Optional) Add an Intercom credential with a valid access token.
3. **Import the Workflow** - Click "Import" → "Workflow JSON" and paste the template JSON, or drag-and-drop the .json file.
4. **Configure Schedule** - Open the Schedule Trigger node and adjust the cron expression if a different frequency is required.
5. **Upload / Edit Keyword Matrix** - Open the Matrix node, paste your custom CSV, or modify the existing topics & keywords.
6. **Review Search Logic** - In the Code (Build Search Queries) node, review the base URLs and adjust patent databases as needed.
7. **Define Notification Channel** - If using Intercom, select your Intercom credential in the Intercom node and choose the target channel.
8. **Execute & Activate** - Click "Execute Workflow" for a trial run. Verify the output. If satisfied, switch the workflow to "Active".

**Node Descriptions**

Core Workflow Nodes:
- **Schedule Trigger** – Initiates the workflow on a weekly cron schedule.
- **Matrix** – Holds the CSV keyword table and makes each row available as an item.
- **Code (Build Search Queries)** – Generates search URLs and attaches metadata for later nodes.
- **ScrapeGraphAI** – Scrapes patent listings and extracts structured fields (title, abstract, publication date, link).
- **If (Relevance Filter)** – Applies date and keyword relevance filters.
- **Set (Normalize JSON)** – Maps scraped fields into a clean JSON schema for downstream use.
- **Intercom** – Sends formatted patent summaries to an Intercom inbox or channel.
- **Sticky Notes** – Provide inline documentation and edit history markers.

Data Flow: Schedule Trigger → Matrix → Code → ScrapeGraphAI → If → Set → Intercom

**Customization Examples**

Change Data Source to Google Patents:

```javascript
// In the Code node
const base = 'https://patents.google.com/?q=';
items.forEach(item => {
  item.json.searchUrl = `${base}${encodeURIComponent(item.json.keywords)}&oq=${encodeURIComponent(item.json.keywords)}`;
});
return items;
```

Send Digest via Slack Instead of Intercom:

```javascript
// Replace the Intercom node with a Slack node and build the message text
{
  "text": `🚀 New Vulnerability-related Patents (${items.length})\n` +
    items.map(i => `• <${i.json.link}|${i.json.title}>`).join('\n')
}
```

**Data Output Format**

The workflow outputs structured JSON data:

```json
{
  "topic": "Memory Safety",
  "keywords": "memory safety, safe memory allocation, pointer sanitization",
  "title": "Memory protection for compiled binary code",
  "publicationNumber": "US20240123456A1",
  "publicationDate": "2024-03-21",
  "abstract": "Techniques for enforcing memory safety in compiled software...",
  "link": "https://patents.google.com/patent/US20240123456A1/en",
  "source": "USPTO"
}
```

**Troubleshooting**

Common Issues:
- **Empty Result Set** – Ensure that the keywords are specific but not overly narrow; test queries manually on USPTO.
- **ScrapeGraphAI Timeouts** – Increase the timeout parameter in the ScrapeGraphAI node or reduce concurrent requests.

Performance Tips:
- Limit the keyword matrix to <50 rows to keep weekly runs under 2 minutes.
- Schedule the workflow during off-peak hours to reduce load on patent-office servers.

Pro Tips:
- Combine this workflow with a vector database (e.g., Pinecone) to create a semantic patent knowledge base.
- Add a "Merge" node to correlate new patents with existing vulnerability CVE entries.
- Use a second ScrapeGraphAI node to crawl citation trees and identify emerging technology clusters.
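For reference, here is a rough sketch of the relevance filter described in "Key Steps", written as an n8n Code node purely for illustration (the template itself uses an If node). The field names match the "Data Output Format" above; the keyword list is an assumption and should mirror your keyword matrix.

```javascript
// Illustrative relevance filter: keep patents published within the last
// year whose abstract mentions a vulnerability-related term.
const ONE_YEAR_MS = 365 * 24 * 60 * 60 * 1000;
const terms = ['vulnerability', 'buffer overflow', 'memory safety', 'injection'];

const relevant = $input.all().filter(item => {
  const { publicationDate, abstract = '' } = item.json;
  const recent = Date.now() - new Date(publicationDate).getTime() <= ONE_YEAR_MS;
  const mentionsVuln = terms.some(t => abstract.toLowerCase().includes(t));
  return recent && mentionsVuln;
});

return relevant;
```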
by Lidia
**Who's it for**
Teams who want to automatically generate structured meeting minutes from uploaded transcripts and instantly share them in Slack. Perfect for startups, project teams, or any company that collects meeting transcripts in Google Drive.

**How it works / What it does**
This workflow automatically turns raw meeting transcripts into well-structured minutes in Markdown and posts them to Slack:
1. **Google Drive Trigger** – Watches a specific folder. Any new transcript file added will start the workflow.
2. **Download File** – Grabs the transcript.
3. **Prep Transcript** – Converts the file into plain text and passes the transcript downstream.
4. **Message a Model** – Sends the transcript to OpenAI GPT for summarization using a structured system prompt (action items, decisions, N/A placeholders).
5. **Make Minutes** – Formats GPT's response into a Markdown file (see the sketch below).
6. **Slack: Send a message** – Posts a Slack message announcing the auto-generated minutes.
7. **Slack: Upload a file** – Uploads the full Markdown minutes file into the chosen Slack channel.

End result: your Slack channel always has clear, standardized minutes right after a meeting.

**How to set up**
1. **Google Drive**: Create a folder where you'll drop transcript files, and configure the folder ID in the Google Drive Trigger node.
2. **OpenAI**: Add your OpenAI API credentials in the Message a Model node and select a supported GPT model (e.g., gpt-4o-mini or gpt-4).
3. **Slack**: Connect your Slack account and set the target channel ID in the Slack nodes.
4. Run the workflow and drop a transcript file into Drive. Minutes will appear in Slack automatically.

**Requirements**
- Google Drive account (for transcript upload)
- OpenAI API key (for text summarization)
- Slack workspace (for message posting and file upload)

**How to customize the workflow**
- **Change summary structure**: Adjust the system prompt inside Message a Model (e.g., shorter summaries, language other than English).
- **Different output format**: Modify Make Minutes to output plain text, PDF, or HTML instead of Markdown.
- **New destinations**: Add more nodes to send minutes to email, Notion, or Confluence in parallel.
- **Multiple triggers**: Replace the Google Drive trigger with a Webhook if you want to integrate with Zoom or MS Teams transcript exports.

**Good to know**
- OpenAI API calls are billed separately. See OpenAI pricing.
- Files must be text-based (.txt or .md). For PDFs or docs, add a conversion step before summarization.
- Slack requires the bot user to be a member of the target channel, otherwise you'll see a not_in_channel error.
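A minimal sketch of what the Make Minutes step can look like as a Code node, assuming the model's reply arrives as plain Markdown in a `text` field (the field name, file name, and binary property layout are assumptions; adjust them to your nodes):

```javascript
// Illustrative n8n Code node: wrap the model's Markdown reply into a
// downloadable .md file for the Slack upload step.
const markdown = $json.text || $json.message?.content || '';
const fileName = `minutes-${new Date().toISOString().slice(0, 10)}.md`;

return [{
  json: { fileName },
  binary: {
    data: {
      data: Buffer.from(markdown, 'utf8').toString('base64'),
      mimeType: 'text/markdown',
      fileName,
    },
  },
}];
```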
by iamvaar
YouTube video: https://youtu.be/dEtV7OYuMFQ?si=fOAlZWz4aDuFFovH

**Workflow Pre-requisites**

**Step 1: Supabase Setup**
First, replace the keys in the "Save the embedding in DB" & "Search Embeddings" nodes with your new Supabase keys. After that, run the following code snippets in your Supabase SQL editor.

Create the table to store chunks and embeddings:

```sql
CREATE TABLE public."RAG" (
  id bigserial PRIMARY KEY,
  chunk text NULL,
  embeddings vector(1024) NULL
) TABLESPACE pg_default;
```

Create a function to match embeddings:

```sql
DROP FUNCTION IF EXISTS public.matchembeddings1(integer, vector);

CREATE OR REPLACE FUNCTION public.matchembeddings1(
  match_count integer,
  query_embedding vector
)
RETURNS TABLE (
  chunk text,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    R.chunk,
    1 - (R.embeddings <=> query_embedding) AS similarity
  FROM public."RAG" AS R
  ORDER BY R.embeddings <=> query_embedding
  LIMIT match_count;
END;
$$;
```

**Step 2: Create a Jotform with these fields**
- Your full name
- Email address
- Upload PDF Document (the field where you upload the knowledge base as a PDF)

**Step 3: Get a Together AI API Key**
Get a Together AI API key and paste it into the "Embedding Uploaded document" node and the "Embed User Message" node.

Here is a detailed, node-by-node explanation of the n8n workflow, which is divided into two main parts.

**Part 1: Ingesting Knowledge from a PDF**
This first sequence of nodes runs when you submit a PDF through a Jotform. Its purpose is to read the document, process its content, and save it in a specialized database for the AI to use later.

**JotForm Trigger**
- Type: Trigger
- What it does: This node starts the entire workflow. It's configured to listen for new submissions on a specific Jotform. When someone uploads a file and submits the form, this node activates and passes the submission data to the next step.

**Grab New knowledgebase**
- Type: HTTP Request
- What it does: The initial trigger from Jotform only contains basic information. This node makes a follow-up call to the Jotform API using the submissionID to get the complete details of that submission, including the specific link to the uploaded file.

**Grab the uploaded knowledgebase file link**
- Type: HTTP Request
- What it does: Using the file link obtained from the previous node, this step downloads the actual PDF file. It's set to receive the response as a file, not as text.

**Extract Text from PDF File**
- Type: Extract From File
- What it does: This utility node takes the binary PDF file downloaded in the previous step and extracts all the readable text content from it. The output is a single block of plain text.

**Splitting into Chunks**
- Type: Code
- What it does: This node runs a small JavaScript snippet. It takes the large block of text from the PDF and chops it into smaller, more manageable pieces, or "chunks," each of a predefined length. This is critical because AI models work more effectively with smaller, focused pieces of text. (A sketch of this step follows below.)

**Embedding Uploaded document**
- Type: HTTP Request
- What it does: This is a key AI step. It sends each individual text chunk to an embeddings API. A specified AI model converts the semantic meaning of the chunk into a numerical list called an embedding or vector. This vector is like a mathematical fingerprint of the text's meaning.

**Save the embedding in DB**
- Type: Supabase
- What it does: This node connects to your Supabase database. For every chunk, it creates a new row in a specified table and stores two important pieces of information: the original text chunk and its corresponding numerical embedding (its "fingerprint") from the previous step.
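A minimal sketch of the "Splitting into Chunks" Code node described above. The 1000-character chunk size and the `text` input field name are assumptions; adjust both to the Extract From File output and your embedding model's limits.

```javascript
// Illustrative "Splitting into Chunks" Code node: split the extracted PDF
// text into fixed-length chunks, one n8n item per chunk so the embedding
// node runs once per chunk.
const text = $json.text || '';
const CHUNK_SIZE = 1000;

const chunks = [];
for (let i = 0; i < text.length; i += CHUNK_SIZE) {
  chunks.push(text.slice(i, i + CHUNK_SIZE));
}

return chunks.map(chunk => ({ json: { chunk } }));
```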
**Part 2: Answering Questions via Chat**
This second sequence starts when a user sends a message. It uses the knowledge stored in the database to find relevant information and generate an intelligent answer.

**When chat message received**
- Type: Chat Trigger
- What it does: This node starts the second part of the workflow. It listens for any incoming message from a user in a connected chat application.

**Embed User Message**
- Type: HTTP Request
- What it does: This node takes the user's question and sends it to the exact same embeddings API and model used in Part 1. This converts the question's meaning into the same kind of numerical vector or "fingerprint."

**Search Embeddings**
- Type: HTTP Request
- What it does: This is the "retrieval" step. It calls the custom database function in Supabase. It sends the question's embedding to this function and asks it to search the knowledge base table to find a specified number of top text chunks whose embeddings are mathematically most similar to the question's embedding. (A sketch of this call follows below.)

**Aggregate**
- Type: Aggregate
- What it does: The search from the previous step returns multiple separate items. This utility node simply bundles those items into a single, combined piece of data. This makes it easier to feed all the context into the final AI model at once.

**AI Agent & Google Gemini Chat Model**
- Type: LangChain Agent & AI Model
- What it does: This is the "generation" step where the final answer is created. The AI Agent node is given a detailed set of instructions (a prompt). The prompt tells the Google Gemini Chat Model to act as a professional support agent. Crucially, it provides the AI with the user's original question and the aggregated text chunks from the Aggregate node as its only source of truth. It then instructs the AI to formulate an answer based only on that provided context, format it for a specific chat style, and to say "I don't know" if the answer cannot be found in the chunks. This prevents the AI from making things up.
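A minimal sketch of the "Search Embeddings" call, invoking the `matchembeddings1` function from Step 1 through Supabase's REST RPC endpoint. The project URL, key handling, input field name, and `match_count` value are assumptions; in the template this is configured directly in an HTTP Request node.

```javascript
// Illustrative n8n Code node calling the matchembeddings1 RPC in Supabase.
const SUPABASE_URL = 'https://YOUR-PROJECT.supabase.co';
const SUPABASE_KEY = 'YOUR_SERVICE_ROLE_OR_ANON_KEY';

const queryEmbedding = $json.embedding; // 1024-dim array from "Embed User Message"

const res = await fetch(`${SUPABASE_URL}/rest/v1/rpc/matchembeddings1`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    apikey: SUPABASE_KEY,
    Authorization: `Bearer ${SUPABASE_KEY}`,
  },
  body: JSON.stringify({ match_count: 5, query_embedding: queryEmbedding }),
});

// Returns [{ chunk, similarity }, ...] ordered by similarity
const matches = await res.json();
return matches.map(m => ({ json: m }));
```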