by Anurag Patil
# Geekhack Discord Updater

## How It Works

This n8n workflow automatically monitors GeekHack forum RSS feeds every hour for new keyboard posts in the Interest Checks and Group Buys sections. When it finds a new thread (not a reply), it:

1. **Monitors RSS Feeds**: Checks two GeekHack RSS feeds for new posts (50 items each)
2. **Filters New Threads**: Removes reply posts by checking for the "Re:" prefix in titles
3. **Prevents Duplicates**: Queries a PostgreSQL database to skip already-processed threads
4. **Scrapes Content**: Fetches the full thread page and extracts the original post
5. **Extracts Images**: Uses a regex to find all images in the post content
6. **Creates Discord Embed**: Formats the post data into a rich Discord embed with up to 4 images
7. **Sends to Multiple Webhooks**: Retrieves all webhook URLs from the database and sends to each one
8. **Logs Processing**: Records the thread as processed to prevent duplicates

The workflow includes a webhook management system with a web form to add/remove Discord webhooks dynamically, allowing you to send notifications to multiple Discord servers or channels.

## Steps to Set Up

### Prerequisites

- n8n instance running
- PostgreSQL database
- Discord webhook URL(s)

### 1. Database Setup

Create the PostgreSQL tables.

Processed threads table:

```sql
CREATE TABLE processed_threads (
  topic_id VARCHAR PRIMARY KEY,
  title TEXT,
  processed_at TIMESTAMP DEFAULT NOW()
);
```

Webhooks table:

```sql
CREATE TABLE webhooks (
  id SERIAL PRIMARY KEY,
  url TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT NOW()
);
```

### 2. n8n Configuration

**Import Workflow**

- Copy the workflow JSON
- Go to n8n → Workflows → Import from JSON
- Paste the JSON and import

**Configure Credentials**

- PostgreSQL: create a new PostgreSQL credential with your database connection details
- All PostgreSQL nodes should use the same credential

### 3. Node Configuration

**Schedule Trigger**

- Already configured for 1-hour intervals
- Modify if different timing is needed

**PostgreSQL Nodes**

Ensure all PostgreSQL nodes use your PostgreSQL credential: "Check if Processed", "Update entry", "Insert rows in a table", "Select rows from a table".

- Database schema should be "public"
- Table names: "processed_threads" and "webhooks"

**RSS Feed Limits**

- Both RSS feeds are set to limit=50 items
- Adjust if you need more or fewer items per check

### 4. Webhook Management

**Adding Webhooks via Web Form**

- The workflow creates a form trigger for adding webhooks
- Access the form URL from the "On form submission" node
- Submit Discord webhook URLs through the form
- Webhooks are automatically stored in the database

**Manual Webhook Addition**

Alternatively, insert webhooks directly into the database:

```sql
INSERT INTO webhooks (url)
VALUES ('https://discord.com/api/webhooks/YOUR_WEBHOOK_URL');
```

### 5. Testing

**Test the Main Workflow**

- Ensure you have at least one webhook in the database
- Activate the workflow
- Use "Execute Workflow" to test manually
- Check Discord channels for test messages

**Test Webhook Form**

- Get the form URL from the "On form submission" node
- Submit a test webhook URL
- Verify it appears in the webhooks table

### 6. Monitoring

- Check execution history for errors
- Monitor both database tables for entries
- Verify all registered webhooks receive notifications
- Adjust schedule timing if needed

### 7. Managing Webhooks

Use the web form to add new webhook URLs. Remove webhooks by deleting from the database:

```sql
DELETE FROM webhooks WHERE url = 'webhook_url_to_remove';
```

The workflow will now automatically post new GeekHack threads to all registered Discord webhooks every hour, with the ability to dynamically manage webhook destinations through the web form interface.
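The reply-filtering and image-extraction steps above could be implemented in an n8n Code node roughly as follows. This is a sketch; the function names and item field names are assumptions, not taken from the workflow itself:

```javascript
// Keep only new threads: GeekHack reply posts are titled "Re: …".
function isNewThread(item) {
  return !/^Re:/i.test(item.title.trim());
}

// Extract image URLs from the post's HTML with a simple regex,
// as described in the "Extracts Images" step.
function extractImages(html) {
  const urls = [];
  const re = /<img[^>]+src=["']([^"']+)["']/gi;
  let match;
  while ((match = re.exec(html)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}
```

The Discord embed builder would then take the first four URLs returned by `extractImages`, matching the "up to 4 images" limit mentioned above.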
by Sandeep Patharkar | ai-solutions.agency
# Build an AI HR Assistant to Screen Resumes and Send Telegram Alerts

A step-by-step guide to creating a fully automated recruitment pipeline that screens candidates, generates interview questions, and notifies your team.

This template provides a complete, step-by-step guide to building an AI-powered HR assistant from scratch in n8n. You will learn how to connect a web form to an intelligent screening agent that reads resumes, evaluates candidates against your job criteria, and prepares unique interview questions for the most promising applicants.

| Services Used | Features |
| :--- | :--- |
| 🤖 OpenAI / LangChain | Uses AI Agents to screen, score, and analyze candidates. |
| 📄 Google Drive & Google Sheets | Stores resumes and manages a database of open positions and applicants. |
| 📥 n8n Form Trigger | Provides a public-facing web form to capture applications. |
| 💬 Telegram | Sends real-time alerts to the hiring team for qualified candidates. |

## How It Works ⚙️

1. 📥 **Application Submitted**: The workflow starts when a candidate fills out the n8n Form Trigger with their details and uploads their CV.
2. 📂 **File Processing**: The CV is automatically uploaded to a specific Google Drive folder for record-keeping, and the Extract from File node reads its text content.
3. 🧠 **AI Screening Agent**: A LangChain Agent analyzes the resume text. It uses the Google Sheets Tool to look up the requirements for the applied role, then scores the candidate and decides if they should be shortlisted.
4. 📊 **Log Results**: The agent's decision (name, score, shortlisted status) is logged in your master "Applications" Google Sheet.
5. ✅ **Qualification Check**: An IF node checks whether the candidate was shortlisted.
6. ❓ **AI Question Generator**: If shortlisted, a second LangChain Agent generates three unique, relevant interview questions based on the candidate's resume and the job description.
7. ✍️ **Update Sheet**: The generated questions are added to the candidate's row in the Google Sheet.
8. 🔔 **Notify Team**: A final alert is sent via Telegram to notify the HR team that a new candidate has been qualified and is ready for review.

## 🛠️ How to Build This Workflow

Follow these steps to build the recruitment assistant from a blank canvas.

### Step 1: Set Up the Application Intake

1. Add a **Form Trigger** node. Configure it with fields for Name, Email, Phone Number, a File Upload for the CV, and a Dropdown for the "Job Role".
2. Connect a **Google Drive** node. Set the Operation to Upload and connect your credentials. Set it to upload the CV file from the Form Trigger into a specific folder.
3. Add an **Extract from File** node. Set it to extract text from the PDF CV file provided by the trigger.

### Step 2: Build the AI Screening Agent

1. Add a **LangChain Agent** node. This will be your main screening agent. In its prompt, instruct the AI to act as a resume screener. Tell it to use the input text from the Extract from File node and the tools you will provide to score and shortlist candidates.
2. Add an **OpenAI Chat Model** node and connect it to the Agent's Language Model input.
3. Add a **Google Sheets Tool** node. Point it to a sheet with your open positions and their requirements. Connect this to the Agent's Tool input.
4. Add a **Structured Output Parser** node and define the JSON structure you want the agent to return (e.g., candidate_name, score, shortlisted). Connect this to the Agent's Output Parser input.

### Step 3: Log Results & Check for a Match

1. Connect a **Google Sheets** node after the Agent. Set its operation to Append or Update. Use it to add the structured output from the agent into your main "Applications" sheet.
2. Add an **IF** node. Set the condition to continue only if the shortlisted field equals "yes".

### Step 4: Generate Interview Questions

1. On the 'true' path of the IF node, add a second **LangChain Agent** node. Write a prompt telling this agent to generate 3 interview questions based on the candidate's resume and the job requirements.
2. Connect the same OpenAI Model and Google Sheets Tool to this agent.
3. Add another **Google Sheets** node. Set it to Update the existing row for the candidate, adding the newly generated questions.

## 💬 Need Help or Want to Learn More?

Join my Skool community for n8n + AI automation tutorials, live Q&A sessions, and exclusive workflows:
👉 https://www.skool.com/n8n-ai-automation-champions

**Template Author**: Sandeep Patharkar
**Category**: Website Chatbots / AI Automation
**Difficulty**: Beginner
**Estimated Setup Time**: ⏱️ 15 minutes
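The Structured Output Parser in Step 2 pins the agent to a fixed JSON shape, which the IF node in Step 3 then branches on. A minimal sketch of validating that output in a Code node (the field names come from the example in Step 2; the function itself is illustrative, not part of the template):

```javascript
// Validate and normalize the screening agent's structured output.
// Expected fields: candidate_name (string), score (number), shortlisted.
function parseScreeningResult(raw) {
  const data = typeof raw === "string" ? JSON.parse(raw) : raw;
  if (typeof data.candidate_name !== "string") {
    throw new Error("missing candidate_name");
  }
  if (typeof data.score !== "number") {
    throw new Error("missing numeric score");
  }
  // The IF node branches on this field equalling "yes", so normalize it.
  data.shortlisted =
    String(data.shortlisted).toLowerCase() === "yes" ? "yes" : "no";
  return data;
}
```

Normalizing `shortlisted` to lowercase here keeps the IF node's string-equality check robust against the model returning "Yes" or "YES".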
by Spiritech Studio
This n8n template demonstrates how to automatically extract text content from PDF documents received via WhatsApp messages using OCR. It is designed for use cases where users submit documents through WhatsApp and the document content needs to be digitized for further processing, such as document analysis, AI-powered workflows, compliance checks, or data ingestion.

## Good to know

- This workflow processes PDF documents only.
- OCR is handled using AWS Textract, which supports both scanned and digital PDFs.
- AWS Textract pricing depends on the number of pages processed. Refer to AWS Textract Pricing for up-to-date costs.
- An AWS S3 bucket is required as an intermediate storage layer for the PDF files.
- Processing time may vary depending on PDF size and number of pages.

## How it works

1. The workflow is triggered when an incoming WhatsApp message containing a PDF document is received.
2. The PDF file is downloaded from WhatsApp's media endpoint using an HTTP Request node.
3. The downloaded PDF is uploaded to an AWS S3 bucket to make it accessible for OCR processing.
4. AWS Textract is invoked to analyze the PDF stored in S3 and extract all readable text content.
5. The Textract response is parsed and consolidated into a clean, ordered text output representing the PDF's content.

## How to use

- The workflow can be triggered using a webhook connected to the WhatsApp Cloud API or any compatible WhatsApp integration.
- Ensure your AWS credentials have permission to upload to S3 and invoke Textract.
- Once active, simply send a PDF document via WhatsApp to start the extraction process automatically.

## Requirements

- WhatsApp integration (e.g. WhatsApp Cloud API or provider webhook)
- AWS account with:
  - S3 bucket access
  - Textract permissions
- n8n instance with HTTP Request and AWS nodes configured

## Customising this workflow

- Store extracted text in a database or document store.
- Pass the extracted content to an AI model for summarization, classification, or validation.
- Split output by pages or sections.
- Add file type validation or size limits.
- Extend the workflow to support additional document formats.
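Step 5 above (consolidating the Textract response into clean, ordered text) amounts to collecting `LINE` blocks page by page. Textract really does return a `Blocks` array with `BlockType`, `Page`, and `Text` fields; the helper below is a sketch of how a Code node might flatten that response:

```javascript
// Consolidate an AWS Textract response into ordered plain text.
// LINE blocks carry the readable text; group them by page number.
function textractToText(response) {
  const lines = (response.Blocks || []).filter(b => b.BlockType === "LINE");
  const pages = {};
  for (const line of lines) {
    const page = line.Page || 1;
    (pages[page] = pages[page] || []).push(line.Text);
  }
  // Emit pages in numeric order, separated by a blank line.
  return Object.keys(pages)
    .sort((a, b) => a - b)
    .map(p => pages[p].join("\n"))
    .join("\n\n");
}
```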
by Wolfgang Renner
This template is a fully automated AI invoice processing workflow for n8n. It watches a Google Drive folder for new invoice PDFs, extracts all key information using an AI Agent, assigns the correct booking account, saves the renamed invoice in the right Drive folder, and updates your Google Sheets booking list. A perfect starter template if you want to build your own AI-powered accounting automation.

## What this workflow does

1. Monitors a Google Drive folder for new invoice PDFs.
2. Downloads and extracts invoice text from the uploaded PDF.
3. Uses an AI Agent (OpenAI + Structured Output Parser) to extract:
   - invoice date
   - vendor
   - currency
   - total amount
   - invoice number
   - booking text
   - booking account
   - matching Google Drive folder ID
4. Automatically renames the PDF to a clean, consistent format (e.g. 250912 Vendor.pdf).
5. Saves the invoice into the correct accounting folder in Google Drive.
6. Updates your booking list in Google Sheets with all extracted fields.
7. Moves the processed invoice to an output folder to avoid duplicates.

Everything runs hands-free after setup.

## Key features

- 🧠 AI invoice reading using OpenAI + LangChain
- 📑 Structured Output Parser guarantees clean, validated fields
- 📁 Automated Google Drive file routing
- 📊 Google Sheets logging for accounting records
- 🔄 File movement logic to keep input/output folders organized
- ⚙️ Chart of Accounts integration from your Google Sheet
- 🟦 Works out of the box with Invoice Agent – Folder Structure Setup (recommended)

## Typical use cases

- Automated accounting workflows
- Pre-processing invoices before importing into ERP or sevDesk
- AI-powered invoice extraction for small businesses or freelancers
- Structured archiving of invoices for tax and audit requirements
- Fully automated Google Drive invoice inbox

## How to use this template

1. Connect your Google Drive & Sheets credentials in all relevant nodes.
2. Select your:
   - Input folder (where invoices are uploaded)
   - Output folder (where processed invoices go)
   - Folder structure sheet + booking accounts sheet
3. Upload any invoice PDF into the input folder.
4. The workflow starts automatically and processes the invoice end-to-end.

## Requirements

- Google Drive OAuth2
- Google Sheets OAuth2
- OpenAI API key
- A Google Sheet containing your chart of accounts
- A prepared folder structure (use the "Google Drive Structure Setup" template)
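The renaming convention above (e.g. 250912 Vendor.pdf, i.e. the invoice date as YYMMDD followed by the vendor name) could be produced in a Code node like this. The function name and inputs are assumptions for illustration; the AI Agent's extracted fields would feed into it:

```javascript
// Build the target filename "YYMMDD Vendor.pdf" from extracted fields.
// Assumes an ISO-format invoice date string and a vendor name.
function invoiceFileName(invoiceDate, vendor) {
  const d = new Date(invoiceDate);
  const pad = n => String(n).padStart(2, "0");
  // Use UTC accessors so the result is timezone-independent.
  const yymmdd =
    pad(d.getUTCFullYear() % 100) +
    pad(d.getUTCMonth() + 1) +
    pad(d.getUTCDate());
  // Strip characters that are awkward in Drive filenames.
  const safeVendor = vendor.trim().replace(/[\\/:*?"<>|]/g, "");
  return `${yymmdd} ${safeVendor}.pdf`;
}
```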
by Austin Lee
## How it works

1. Provide your S3 bucket containing documents such as PDFs and MS Word files in the "Get Files from S3" node. You will need to provide AWS credentials that allow the node to access the bucket and download the files in the specified location.
2. Choose document processing options in the Aryn node. The main options are for text and table extraction. You can also provide a JSON schema for property extraction. Refer to https://docs.aryn.ai/docparse/processing_options for details on these options. You will also need an Aryn API key, which you can obtain at https://aryn.ai/signup. Please note that use of vision models for OCR and table extraction is restricted to paid tiers.
3. The resulting content of parsing and extraction is then chunked and ingested into Pinecone.
4. Once at least one document has been ingested into a Pinecone index, you can start asking questions in the chat box about anything that may be found in the ingested documents.

## Setup steps

- For data retrieval, you will need a "folder" in a bucket on AWS S3 as well as valid AWS credentials with permission to fetch those files.
- For document parsing, you will need to obtain an Aryn API key. You can sign up for free at https://aryn.ai/signup.
- For the Pinecone vector database, head over to https://pinecone.io, create an account, and create a sample index for free. You will also need to generate an API key.
- For the AI agent and RAG, you will also need an OpenAI API key from https://openai.com.
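The chunking in step 3 is handled inside the workflow, but conceptually it splits the parsed text into overlapping windows before embedding and ingestion into Pinecone. A minimal sketch (the chunk size and overlap values are illustrative assumptions, not the template's actual settings):

```javascript
// Split extracted text into overlapping chunks before vector ingestion.
// Overlap preserves context that would otherwise be cut at chunk borders.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Tune the size and overlap for your embedding model's context window; smaller chunks improve retrieval precision at the cost of more vectors in the index.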
by Meak
# Auto-Call Leads from Google Sheets with VAPI → Log Results + Book Calendar

This workflow calls new leads from a Google Sheet using VAPI, saves the call results, and (if there's a booking request) creates a Google Calendar event automatically.

## Benefits

- Auto-call each new lead from your call list
- Save full call outcomes back to Google Sheets
- Parse "today/tomorrow + time" into a real datetime (IST)
- Auto-create calendar events for bookings/deliveries
- Batch-friendly to avoid rate limits

## How It Works

1. **Trigger**: New row in Google Sheets (call_list).
2. **Prepare**: Normalize the phone number (adds +), then process in batches.
3. **Call**: Send the number to VAPI (/call) with your assistantId + phoneNumberId.
4. **Receive**: VAPI posts results to your Webhook.
5. **Store**: Append/update the Google Sheet with: name, role, company, phone, email, interest level, objections, next step, notes, etc.
6. **Parse Time**: Convert today/tomorrow + HH:MM AM/PM to start/end in IST (+1 hour).
7. **Book**: Create a Google Calendar event with the parsed times.
8. **Respond**: Send a response back to VAPI to complete the cycle.

## Who Is This For

- Real estate / local service teams running outbound calls
- Agencies doing voice outreach and appointment setting
- Ops teams that want call logs + auto-booking in one place

## Setup

- **Google Sheets Trigger**: select your spreadsheet Vapi_real-estate and tab call_list.
- **VAPI Call**: set assistantId, phoneNumberId, and add a Bearer token.
- **Webhook**: copy the n8n webhook URL into VAPI so results post back.
- **Google Calendar**: set the calendar ID (e.g., you@domain.com).
- **Timezone**: the booking parser formats times to **Asia/Kolkata (IST)**.
- **Batching**: adjust the SplitInBatches size to control pace.

## ROI & Monetization

- Save 2–4 hours/week on manual dialing + data entry
- Faster follow-ups with instant booking creation
- Package as an "AI Caller + Auto-Booking" service ($1k–$3k/month)

## Strategy Insights

In the full walkthrough, I show how to:

- Map VAPI tool call JSON safely into Sheets fields
- Handle missing/invalid times and default to safe slots
- Add no-answer / retry logic and opt-out handling
- Extend to send Slack/email alerts for hot leads

## Check Out My Channel

For more voice automation workflows that turn leads into booked calls, check out my YouTube channel, where I share the exact setups I use to win clients and scale to $20k+ monthly revenue.
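The "Parse Time" step above (today/tomorrow + HH:MM AM/PM → IST start/end, with a one-hour slot) can be sketched as a Code-node function. The one-hour duration and IST target follow the description; the function shape and parameter names are illustrative:

```javascript
// Turn ("tomorrow", "3:30 PM") into ISO start/end strings.
// Works in IST (UTC+5:30) regardless of the server's timezone.
// `now` is injectable for testing; defaults to the current time.
function parseBookingTime(dayWord, timeStr, now = new Date()) {
  const m = timeStr.match(/(\d{1,2}):(\d{2})\s*(AM|PM)/i);
  if (!m) return null; // invalid time; caller should fall back to a safe slot
  let hours = parseInt(m[1], 10) % 12;
  if (m[3].toUpperCase() === "PM") hours += 12;
  const minutes = parseInt(m[2], 10);
  const IST_OFFSET_MIN = 330; // UTC+5:30
  // Shift "now" into IST, read its calendar date, then build the slot.
  const istNow = new Date(now.getTime() + IST_OFFSET_MIN * 60000);
  const dayOffset = dayWord.toLowerCase() === "tomorrow" ? 1 : 0;
  const start = new Date(Date.UTC(
    istNow.getUTCFullYear(), istNow.getUTCMonth(),
    istNow.getUTCDate() + dayOffset, hours, minutes
  ) - IST_OFFSET_MIN * 60000);
  const end = new Date(start.getTime() + 60 * 60000); // +1 hour
  return { start: start.toISOString(), end: end.toISOString() };
}
```

Returning `null` on an unparseable time mirrors the "handle missing/invalid times and default to safe slots" idea from the Strategy Insights.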
by Jay Emp0
# 🐱 MemeCoin Art Generator - using Gemini Flash NanoBanana & upload to Twitter

Automatically generates memecoin art and posts it to Twitter (X), powered by Google Gemini, NanoBanana image generation, and n8n automation.

## 🧩 Overview

This workflow creates viral-style memecoin images (like Popcat) and posts them directly to Twitter with a witty, Gen Z-style tweet. It combines text-to-image AI, scheduled triggers, and social publishing, all in one seamless flow.

Workflow flow:

1. Define your memecoin mascot (name, description, and base image URL).
2. Generate an AI image prompt and a meme tweet.
3. Feed the base mascot image into the Gemini Image Generation API.
4. Render a futuristic memecoin artwork using NanoBanana.
5. Upload the final image and tweet automatically to Twitter.

## 🧠 Workflow Diagram

## ⚙️ Key Components

| Node | Function |
|------|----------|
| Schedule Trigger | Runs automatically at chosen intervals to start meme generation. |
| Define Memecoin | Defines mascot name, description, and base image URL. |
| AI Agent | Generates tweet text and a creative image prompt using Google Gemini. |
| Google Gemini Chat Model | Provides trending topic context and meme phrasing. |
| Get Source Image | Fetches the original mascot image (e.g., Popcat). |
| Convert Source Image to Base64 | Prepares the image for AI-based remixing. |
| Generate Image using NanoBanana | Sends the prompt and base image to the Gemini Image API for art generation. |
| Convert Base64 to PNG | Converts the AI output to an image file. |
| Upload to Twitter | Uploads the generated image to Twitter via the media upload API. |
| Create Tweet | Publishes the tweet with the attached image. |

## 🪄 How It Works

1️⃣ **Schedule Trigger** - starts the automation (e.g., hourly or daily).
2️⃣ **Define Memecoin** - stores your mascot metadata:

- memecoin_name: popcat
- mascot_description: cat with open mouth
- mascot_image: https://i.pinimg.com/736x/9d/05/6b/9d056b5b97c0513a4fc9d9cd93304a05.jpg

3️⃣ **AI Agent** - prompts Gemini to:

- Write a short 100-character tweet in Gen Z slang.
- Create an image generation prompt inspired by current meme trends.

4️⃣ **NanoBanana API** - applies your base image + AI prompt to create art.
5️⃣ **Upload & Tweet** - the final image gets uploaded and posted automatically.

## 🧠 Example Output

Base Source Image:

Generated Image (AI remix):

Published Tweet:

Example tweet text:

> Popcat's about to go absolutely wild, gonna moon harder than my last test score! 🚀📈 We up! #Popcat #Memecoin

## 🧩 Setup Tutorial

### 1️⃣ Prerequisites

| Tool | Purpose |
|------|----------|
| n8n (Cloud or Self-hosted) | Workflow automation platform |
| Google Gemini API Key | For generating tweet and image prompts |
| Twitter (X) API OAuth1 + OAuth2 | For uploading and posting tweets |

### 2️⃣ Import the Workflow

1. Download memecoin art generator.json.
2. In n8n, click Import Workflow → From File.
3. Set up and connect credentials:
   - Google Gemini API
   - Twitter OAuth
4. (Optional) Adjust the Schedule Trigger frequency to your desired posting interval.

### 3️⃣ Customize Your MemeCoin

In the Define Memecoin node, edit these fields to change your meme theme:

- memecoin_name: "doggo"
- mascot_description: "shiba inu in astronaut suit"
- mascot_image: "https://example.com/shiba.jpg"

That's it - the next cycle will generate your new meme and post it.

### 4️⃣ API Notes

- Gemini Image Generation API Docs: https://ai.google.dev/gemini-api/docs/image-generation#gemini-image-editing
- API Key Portal: https://aistudio.google.com/api-keys
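The two base64 conversion nodes amount to a round-trip between n8n binary data and the base64 string that Gemini's image API consumes and returns. A sketch using the Node.js Buffer API (the helper names are illustrative):

```javascript
// Encode a binary image buffer as base64 for the Gemini image request.
function imageToBase64(buffer) {
  return buffer.toString("base64");
}

// Decode Gemini's base64 response back into a binary PNG buffer.
function base64ToImage(b64) {
  return Buffer.from(b64, "base64");
}
```

In a real Code node, the decoded buffer would be attached as binary data on the item so the Twitter media-upload node can consume it.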
by Cleverton Ruppenthal
# 🤖 Generate images via Telegram using an AI bot with a credit system and S3 storage

A complete, production-ready Telegram bot for AI-powered image generation and editing, featuring a built-in credit system, payment integration, and cloud storage.

## ✨ Features

### 🎨 AI Image Generation

- **Text-to-Image**: Generate stunning images from text prompts using the Nano Banana Pro Ultra model via the WaveSpeed API
- **Image-to-Image Editing**: Edit existing images by sending a photo with a caption describing the desired changes
- **Multiple Resolutions**: Support for both 4K and 8K output quality
- **Flexible Aspect Ratios**: Choose from 10 different aspect ratios (1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)

### 💳 Credit Management System

- **Per-generation billing**: Different costs for 4K vs 8K resolution
- **Balance tracking**: Real-time credit balance displayed to users
- **Initial credits**: New users receive starter credits automatically
- **Insufficient balance handling**: Graceful messages when credits run out

### 💰 Payment Integration (Mercado Pago PIX)

> About PIX: PIX is Brazil's instant payment system, launched by the Central Bank of Brazil in 2020. It allows instant money transfers 24/7, 365 days a year, using QR codes or copy-paste codes. It has become the most popular payment method in Brazil due to its speed and zero fees for individuals.

This workflow includes a fully integrated PIX payment flow as a reference implementation. You can adapt it to your local payment provider.

Features:

- **Multiple deposit options**: Pre-configured credit packages (R$ 3, R$ 6, R$ 10)
- **QR Code generation**: Automatic PIX QR code sent directly to users via Telegram
- **Copy-paste code**: PIX code provided for manual payment
- **Webhook confirmation**: Real-time payment status updates via Mercado Pago webhooks
- **Auto credit top-up**: Credits added automatically upon payment approval
- **Payment status handling**: Supports approved, pending, and rejected states

> 💡 Tip: To adapt this for other regions, replace the Mercado Pago nodes with your preferred payment gateway (Stripe, PayPal, etc.) while keeping the same credit update logic.

### ⚙️ User Configuration

- **Resolution settings**: Users can set their preferred default resolution
- **Aspect ratio preferences**: Save a preferred aspect ratio for future generations
- **Custom default prompts**: Set a default prompt that's automatically appended to all generations
- **Persistent settings**: All preferences stored in n8n Data Tables

### 📦 Cloud Storage (S3/MinIO)

- **Automatic upload**: Generated images are automatically uploaded to S3-compatible storage
- **Persistent URLs**: Images remain accessible via permanent links
- **Edit from storage**: Reference previously uploaded images for editing

## 🛠️ Tech Stack

| Component | Technology |
|-----------|------------|
| Bot Platform | Telegram Bot API |
| AI Generation | WaveSpeed API (Nano Banana Pro) |
| Storage | S3-compatible (MinIO/AWS S3) |
| Database | n8n Data Tables |
| Payments | Mercado Pago PIX |
| Automation | n8n |

## 📋 Prerequisites

Before using this workflow, you'll need:

1. Telegram Bot Token - create a bot via @BotFather
2. WaveSpeed API Key - sign up at WaveSpeed
3. S3-compatible Storage - MinIO, AWS S3, or any S3-compatible service
4. Mercado Pago Account (optional) - for payment integration
5. n8n Data Table - create a table with the required schema

## 📊 Data Table Schema

Create a Data Table with the following columns:

| Column | Type | Description |
|--------|------|-------------|
| chat_id | String | Telegram chat ID (primary key) |
| username | String | Telegram username |
| status | String | Current user state in the flow |
| credits | String | User's credit balance |
| resolution | String | Preferred resolution (4k/8k) |
| aspect_ratio | String | Preferred aspect ratio |
| user_default_prompt | String | Custom default prompt |
| number_images | Number | Total images generated |
| number_videos | Number | Total videos generated |
| demo_sended | Boolean | Welcome demo sent flag |

## ⚡ Quick Setup

1. Import the workflow into your n8n instance
2. Configure the Global Environment node with your settings:
   - botName: your bot's display name
   - botToken: your Telegram bot token
   - dataTableId: your n8n Data Table ID
   - bucketName: your S3 bucket name
   - initialCredits: credits given to new users
   - generateImageCost4k: cost per 4K image
   - generateImageCost8k: cost per 8K image
3. Set up credentials:
   - Telegram API credentials
   - WaveSpeed API credentials
   - S3 credentials
   - Mercado Pago credentials (if using payments)
4. Activate the workflow

## 🎮 Bot Commands

| Command | Description |
|---------|-------------|
| /start | Initialize bot and receive welcome message |
| menu | Return to main menu |
| config | Open settings menu |
| Any text | Generate image from prompt (when in generation mode) |
| Photo + caption | Edit the photo based on the caption |

## 🔄 Workflow Flow

User Message → Telegram Trigger → Route by Status, which branches three ways:

- **New User**: Welcome Flow → Create User → Show Menu
- **Generate Image**: Check Credits → Submit to WaveSpeed → Poll for Result → Download Image → Send to User
- **Edit Image**: Check Credits → Upload to S3 → Submit Edit → Poll for Result → Send to User

## 📝 Notes

- The workflow uses polling to check generation status - WaveSpeed processes may take up to 1 minute
- Credits are deducted when the task is submitted and refunded if generation fails
- All user states are managed through the Data Table for persistence across restarts

## 📄 License

Free to use and modify. Attribution appreciated but not required.
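The deduct-on-submit / refund-on-failure behavior described in the Notes could be expressed in a Code node roughly like this. The cost values are illustrative and the function names are assumptions; the Data Table reads/writes themselves are handled by the surrounding nodes. Note that credits are stored as strings in the schema above, so they are parsed and re-serialized:

```javascript
// Credit accounting: charge when a job is submitted, refund on failure.
const COSTS = { "4k": 1, "8k": 2 }; // illustrative values

function chargeForGeneration(user, resolution) {
  const balance = parseInt(user.credits, 10);
  const cost = COSTS[resolution];
  if (balance < cost) {
    // Graceful "insufficient balance" path from the feature list.
    return { ok: false, message: "Insufficient credits. Use the deposit menu to top up." };
  }
  return { ok: true, user: { ...user, credits: String(balance - cost) }, cost };
}

function refundOnFailure(user, cost) {
  return { ...user, credits: String(parseInt(user.credits, 10) + cost) };
}
```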
by Ranjan Dailata
This workflow automates company research and intelligence extraction from Glassdoor, using the Decodo API for data retrieval and Google Gemini for AI-powered summarization.

## Who this is for

This workflow is ideal for:

- Recruiters, analysts, and market researchers looking for structured insights from company profiles.
- HR tech developers and AI research teams needing a reliable way to extract and summarize Glassdoor data automatically.
- Venture analysts or due diligence teams conducting company research combining structured and unstructured content.
- Anyone who wants instant summaries and insights from Glassdoor company pages without manual scraping.

## What problem this workflow solves

- **Manual Data Extraction**: Glassdoor company details and reviews are often scattered and inconsistent, requiring time-consuming copy-paste efforts.
- **Unstructured Insights**: Raw reviews contain valuable opinions but are not organized for analytical use.
- **Fragmented Company Data**: Key metrics like ratings, pros/cons, and FAQs are mixed with irrelevant data.
- **Need for AI Summarization**: Business users need a concise, executive-level summary that combines employee sentiment, culture, and overall performance metrics.

This workflow automates data mining, summarization, and structuring, transforming Glassdoor data into ready-to-use JSON and Markdown summaries.

## What this workflow does

The workflow automates the end-to-end pipeline for Glassdoor company research:

1. **Trigger**: Start manually by clicking "Execute Workflow."
2. **Set Input Fields**: Define company_url (e.g., a Glassdoor company profile link) and geo (country).
3. **Extract Raw Data from Glassdoor (Decodo Node)**: Uses the Decodo API to fetch company data, including overview, ratings, reviews, and frequently asked questions.
4. **Generate Structured Data (Google Gemini + Output Parser)**: The Structured Data Extractor node (powered by Gemini AI) processes raw data into well-defined fields:
   - Company overview (name, size, website, type)
   - Ratings breakdown
   - Review snippets (pros, cons, roles)
   - FAQs
   - Key takeaways
5. **Summarize the Insights (Gemini AI Summarizer)**: Produces a detailed summary highlighting:
   - Company reputation
   - Work culture
   - Employee sentiment trends
   - Strengths and weaknesses
   - Hiring recommendations
6. **Merge and Format**: Combines structured data and summary into a unified object for output.
7. **Export and Save**: Converts the final report into JSON and writes it to disk as C:\{{CompanyName}}.json.
8. **Binary Encoding for File Handling**: Prepares data in base64 for easy integration with APIs or downloadable reports.

## Setup

### Prerequisites

- n8n instance (cloud or self-hosted)
- Decodo API credentials (added as decodoApi)
- Google Gemini (PaLM) API credentials
- Access to the Glassdoor company URLs

Make sure to install the Decodo Community Node.

### Steps

1. Import this workflow JSON file into your n8n instance.
2. Configure your credentials for:
   - Decodo API
   - Google Gemini (PaLM) API
3. Open the Set the Input Fields node and replace:
   - company_url → with the Glassdoor URL
   - geo → with the region (e.g., India, US, etc.)
4. Execute the workflow.
5. Check your output folder (C:\) for the exported JSON report.

## How to Customize This Workflow

You can easily adapt this template to your needs:

- **Add Sentiment Analysis**: Include another Gemini or OpenAI node to rate sentiment (positive/negative/neutral) per review.
- **Export to Notion or Google Sheets**: Replace the file node with a Notion or Sheets integration for live dashboarding.
- **Multi-Company Batch Mode**: Convert the manual trigger to a spreadsheet or webhook trigger for bulk research automation.
- **Add Visualization Layer**: Connect the output to Looker Studio or Power BI for analytical dashboards.
- **Change Output Format**: Modify the final write node to generate Markdown or PDF summaries using the pypandoc or reportlab module.

## Summary

This n8n workflow combines Decodo web scraping with Google Gemini's reasoning and summarization power to build a fully automated Glassdoor Research Engine. With a single execution, it:

- Extracts structured company details
- Summarizes thousands of employee reviews
- Delivers insights in an easy-to-consume format

Ideal for:

- Recruitment intelligence
- Market research
- Employer branding
- Competitive HR analysis
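Since the report is written to C:\{{CompanyName}}.json, the company name should be sanitized into a safe Windows filename before the write. The template does not include such a step; this is a hypothetical helper you might add in a Code node before the write node:

```javascript
// Build a safe Windows path for the JSON report from a company name.
function reportPath(companyName) {
  const safe = companyName
    .trim()
    .replace(/[\\/:*?"<>|]/g, "") // characters Windows forbids in filenames
    .replace(/\s+/g, " ");        // collapse runs of whitespace
  return `C:\\${safe}.json`;
}
```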
by Omer Fayyaz
An automated PDF download and management system that collects PDFs from URLs, uploads them to Google Drive, extracts metadata, and maintains a searchable library with comprehensive error handling and status tracking. What Makes This Different: Intelligent URL Validation** - Validates PDF URLs before attempting download, extracting filenames from URLs and generating fallback names when needed, preventing wasted processing time Binary File Handling** - Properly handles PDF downloads as binary files with appropriate headers (User-Agent, Accept), ensuring compatibility with various PDF hosting services Comprehensive Error Handling** - Three-tier error handling: invalid URLs are marked immediately, failed downloads are logged with error messages, and all errors are tracked in a dedicated Error Log sheet Metadata Extraction** - Automatically extracts file ID, size, MIME type, Drive view links, and download URLs from Google Drive responses, creating a complete file record Multiple Trigger Options** - Supports manual execution, scheduled runs (every 12 hours), and workflow-to-workflow calls, making it flexible for different automation scenarios Status Tracking** - Updates source spreadsheet with processing status (Downloaded, Failed, Invalid), enabling easy monitoring and retry logic for failed downloads Key Benefits of Automated PDF Management: Centralized Storage** - All PDFs are automatically organized in a Google Drive folder, making them easy to find and share across your organization Searchable Library** - Metadata is stored in Google Sheets with file links, titles, sources, and download dates, enabling quick searches and filtering Error Recovery** - Failed downloads are logged with error messages, allowing you to identify and fix issues (broken links, access permissions, etc.) 
and retry Automated Processing** - Schedule-based execution keeps your PDF library updated without manual intervention, perfect for monitoring research sources Integration Ready** - Can be called by other workflows, enabling complex automation chains (e.g., scrape URLs → download PDFs → process content) Bulk Processing** - Processes multiple PDFs in sequence from a spreadsheet, handling large batches efficiently with proper error isolation Who's it for This template is designed for researchers, academic institutions, market research teams, legal professionals, compliance officers, and anyone who needs to systematically collect and organize PDF documents from multiple sources. It's perfect for organizations that need to build research libraries, archive regulatory documents, collect industry reports, maintain compliance documentation, or aggregate academic papers without manually downloading and organizing each file. How it works / What it does This workflow creates a PDF collection and management system that reads PDF URLs from Google Sheets, downloads the files, uploads them to Google Drive, extracts metadata, and maintains a searchable library. 
The system:

1. **Reads Pending PDF URLs** - Fetches PDF URLs from the Google Sheets "PDF URLs" sheet, processing entries that need to be downloaded
2. **Loops Through PDFs** - Processes PDFs one at a time using Split in Batches, ensuring proper error isolation and preventing batch failures
3. **Prepares Download Info** - Extracts the filename from the URL, decodes URL-encoded characters, validates the PDF URL format, and generates fallback filenames with timestamps if needed
4. **Validates URL** - Checks if the URL is valid before attempting download, skipping invalid entries immediately
5. **Downloads PDF** - Makes an HTTP request with proper browser headers, downloads the PDF as a binary file with a 60-second timeout, and handles download errors gracefully
6. **Verifies Download** - Checks if binary data was successfully received, routing to error handling if the download failed
7. **Uploads to Google Drive** - Uploads the PDF file to the specified Google Drive folder, preserving the original filename or using the generated name
8. **Extracts File Metadata** - Extracts the file ID, name, MIME type, file size, Drive view link, and download link from the Google Drive API response
9. **Saves to PDF Library** - Appends file metadata to the Google Sheets "PDF Library" sheet with title, source, file links, and download timestamp
10. **Updates Source Status** - Marks processed URLs as "Downloaded", "Failed", or "Invalid" in the source sheet for tracking
11. **Logs Errors** - Records failed downloads and invalid URLs in the "Error Log" sheet with error messages for troubleshooting
12. **Tracks Completion** - Generates a completion summary with processing statistics and a timestamp

Key Innovation: Error-Resilient Processing - Unlike simple download scripts that fail on the first error, this workflow isolates failures, continues processing the remaining PDFs, and provides detailed error logging. This ensures a maximum success rate and makes troubleshooting straightforward.

How to set up
1. Prepare Google Sheets

- Create a Google Sheet with three tabs: "PDF URLs", "PDF Library", and "Error Log"
- In the "PDF URLs" sheet, create columns: PDF_URL (or pdf_url), Title (optional), Source (optional), Status (optional - will be updated by the workflow)
- Add sample PDF URLs in the PDF_URL column (e.g., direct links to PDF files)
- The "PDF Library" sheet will be automatically populated with columns: pdfUrl, title, source, fileName, fileId, mimeType, fileSize, driveUrl, downloadUrl, downloadedAt, status
- The "Error Log" sheet will record: status, errorMessage, pdfUrl, title (for failed downloads)
- Verify your Google Sheets credentials are set up in n8n (OAuth2 recommended)

2. Configure Google Sheets Nodes

- Open the "Read Pending PDF URLs" node and select your spreadsheet from the document dropdown
- Set the sheet name to "PDF URLs"
- Configure the "Save to PDF Library" node: select the same spreadsheet, set the sheet name to "PDF Library", and set the operation to "Append or Update"
- Configure the "Update Source Status" node: same spreadsheet, "PDF URLs" sheet, operation "Update"
- Configure the "Log Error" node: same spreadsheet, "Error Log" sheet, operation "Append or Update"
- Test the connection by running the "Read Pending PDF URLs" node manually to verify it can access your sheet

3. Set Up Google Drive Folder

- Create a folder in Google Drive where you want PDFs stored (e.g., "PDF Reports" or "Research Library")
- Open the "Upload to Google Drive" node
- Select your Google Drive account (OAuth2 credentials)
- Choose the drive (usually "My Drive")
- Select the folder you created from the folder dropdown
- The filename will be automatically extracted from the URL or generated with a timestamp
- Verify folder permissions allow the service account to upload files
- Test by manually uploading a file to ensure access works
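The note that "the filename will be automatically extracted from the URL or generated with a timestamp" boils down to logic like the sketch below. This is illustrative Python, not the node's actual code (the workflow runs inside n8n); the function and field names here are assumptions.

```python
import re
from datetime import datetime, timezone
from urllib.parse import unquote, urlparse

def prepare_download_info(pdf_url: str) -> dict:
    """Validate a PDF URL and derive a filename, mirroring the
    'Prepares Download Info' step (names are illustrative)."""
    parsed = urlparse(pdf_url.strip())
    is_valid = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    # Take the last path segment and decode %-escapes (e.g. %20 -> space).
    candidate = unquote(parsed.path.rsplit("/", 1)[-1]) if is_valid else ""
    if not re.search(r"\.pdf$", candidate, re.IGNORECASE):
        # Fallback name with a timestamp when the URL gives no usable filename.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
        candidate = f"document-{stamp}.pdf"
    return {"isValid": is_valid, "fileName": candidate, "pdfUrl": pdf_url}
```

For example, `https://example.com/reports/Q1%20Report.pdf` yields the filename `Q1 Report.pdf`, while an invalid entry is flagged and given a timestamped fallback name.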
4. Configure Download Settings

- The "Download PDF" node is pre-configured with appropriate headers and a 60-second timeout
- If you encounter timeout issues with large PDFs, increase the timeout in the node options
- The User-Agent header is set to mimic a browser to avoid blocking
- The Accept header is set to application/pdf,application/octet-stream,*/* for maximum compatibility
- For sites requiring authentication, you may need to add additional headers or use cookies
- Test with a sample PDF URL to verify the download works correctly

5. Set Up Scheduling & Test

- The workflow includes a Manual Trigger (for testing), a Schedule Trigger (runs every 12 hours), and an Execute Workflow Trigger (for calling from other workflows)
- To customize the schedule: open the "Schedule (Every 12 Hours)" node and adjust the interval (e.g., daily, weekly)
- For initial testing: use the Manual Trigger and add 2-3 test PDF URLs to your "PDF URLs" sheet
- Verify execution: check that PDFs are downloaded, uploaded to Drive, and metadata saved to "PDF Library"
- Monitor execution logs: check for any download failures, timeout issues, or Drive upload errors
- Review the Error Log sheet: verify failed downloads are properly logged with error messages
- Common issues: invalid URLs (check URL format), access denied (check file permissions), timeout (increase the timeout for large files), Drive quota (check Google Drive storage)

Requirements

- **Google Sheets Account** - Active Google account with OAuth2 credentials configured in n8n for reading and writing spreadsheet data
- **Google Drive Account** - Same Google account with OAuth2 credentials and sufficient storage space for PDF files
- **Source Spreadsheet** - Google Sheet with "PDF URLs", "PDF Library", and "Error Log" tabs, properly formatted with the required columns
- **Valid PDF URLs** - Direct links to PDF files (not HTML pages that link to PDFs); URLs should end in .pdf or point directly to PDF content
- **n8n Instance** - Self-hosted or cloud n8n instance with access to external websites (the HTTP Request node needs internet connectivity to download PDFs)
by Parag Javale
The AI Blog Creator with Gemini, Replicate Image, Supabase Publishing & Slack is a fully automated content generation and publishing workflow designed for modern marketing and SaaS teams. It automatically fetches the latest industry trends, generates SEO-optimized blogs using AI, creates a relevant featured image, publishes the post to your CMS (e.g., Supabase or a custom API), and notifies your team via Slack, all on a daily schedule. This workflow connects multiple services (NewsAPI, Google Gemini, Replicate, Supabase, and Slack) into one intelligent content pipeline that runs hands-free once set up.

✨ Features

- 📰 Fetch Trending Topics — pulls the latest news or updates from your selected industry (via NewsAPI).
- 🤖 AI Topic Generation — Gemini suggests trending blog topics relevant to AI, SaaS, and Automation.
- 📝 AI Blog Authoring — Gemini then writes a full 1200-1500 word SEO-optimized article in Markdown.
- 🧹 Smart JSON Cleaner — A resilient code node parses Gemini's output and ensures clean, structured data.
- 🖼️ Auto-Generated Image — Replicate's Ideogram model creates a blog cover image based on the content prompt.
- 🌐 Automatic Publishing — Posts are automatically published to your Supabase or custom backend.
- 💬 Slack Notification — Notifies your team with blog details and the live URL.
- ⏰ Fully Scheduled — Runs automatically every day at your preferred time (default 10 AM IST).
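The image generation step is asynchronous: the workflow starts a Replicate prediction and then polls until it finishes. A minimal Python sketch of that poll loop follows; the endpoint and field names follow Replicate's predictions API, but treat the details as assumptions (the actual workflow does this with HTTP Request, Wait, and If nodes).

```python
import json
import time
from urllib.request import Request, urlopen

# Statuses after which polling should stop.
TERMINAL = {"succeeded", "failed", "canceled"}

def is_done(status: str) -> bool:
    """A prediction is finished once it reaches a terminal status."""
    return status in TERMINAL

def wait_for_image(prediction_id: str, token: str, poll_seconds: int = 5) -> str:
    """Poll a Replicate prediction until it finishes; return the image URL."""
    url = f"https://api.replicate.com/v1/predictions/{prediction_id}"
    while True:
        req = Request(url, headers={"Authorization": f"Bearer {token}"})
        with urlopen(req) as resp:
            pred = json.load(resp)
        if is_done(pred["status"]):
            if pred["status"] != "succeeded":
                raise RuntimeError(f"image generation {pred['status']}")
            return pred["output"][0]  # first generated image URL
        time.sleep(poll_seconds)
```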
⚙️ Workflow Structure

| Step | Node | Purpose |
| ---- | ---- | ------- |
| 1 | Schedule Trigger | Runs daily at 10 AM |
| 2 | Fetch Industry Trends (NewsAPI) | Retrieves trending articles |
| 3 | Message a model (Gemini) | Generates trending topic ideas |
| 4 | Message a model1 (Gemini) | Writes full SEO blog content |
| 5 | Code in JavaScript | Cleans, validates, and normalizes Gemini output |
| 6 | HTTP Request (Replicate) | Generates an image using Ideogram |
| 7 | HTTP Request1 | Retrieves generated image URL |
| 8 | Wait + If | Polls until image generation succeeds |
| 9 | Edit Fields | Assembles blog fields into final JSON |
| 10 | Publish to Supabase | Posts to your CMS |
| 11 | Slack Notification | Sends message to your Slack channel |

🔧 Setup Instructions

1. Import the workflow in n8n and enable it.
2. Create the following credentials:
   - NewsAPI (Query Auth) — from https://newsapi.org
   - Google Gemini (PaLM API) — use your Gemini API key
   - Replicate (Bearer Auth) — API key from https://replicate.com/account
   - Supabase (Header Auth) — endpoint to your /functions/v1/blog-api (set your key in the header)
   - Slack API — create a Slack App token with chat:write permission
3. Edit the NewsAPI URL query parameter to match your industry (e.g., q=AI automation SaaS).
4. Update the Supabase publish URL to your project endpoint if needed.
5. Adjust the Slack channel name under "Slack Notification".
6. (Optional) Change the Schedule Trigger time for your timezone.

💡 Notes & Tips

- The Code in JavaScript node is robust against malformed or extra text in Gemini output — it sanitizes Markdown and reconstructs clean JSON safely.
- You can replace Supabase with any CMS or webhook endpoint by editing the "Publish to Supabase" node.
- The Replicate model used is ideogram-ai/ideogram-v3-turbo — you can swap it for Stable Diffusion or another model for different aesthetics.
- Use the slug field in your blog URLs for SEO-friendly links.
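The "Code in JavaScript" cleaner's core idea — strip Markdown code fences and recover the JSON object from whatever text surrounds it — looks roughly like this. It is shown in Python for brevity; the real node is JavaScript and considerably more defensive, so take this as a sketch of the approach rather than the node's code.

```python
import json
import re

def clean_llm_json(raw: str) -> dict:
    """Pull the first JSON object out of an LLM reply that may be wrapped
    in ```json fences or surrounded by extra prose."""
    # Drop Markdown code fences if present.
    text = re.sub(r"```(?:json)?", "", raw)
    # Grab the outermost {...} span and parse it.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start : end + 1])
```

This tolerates replies like "Here is your blog: ```json {...} ``` Enjoy!" while still failing loudly when no JSON is present, so the workflow can surface the error instead of publishing garbage.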
- Test with one manual execution before activating scheduled runs.
- If the Slack notification fails, verify the token scopes and channel permissions.

🧩 Tags

#AI #Automation #ContentMarketing #BlogGenerator #n8n #Supabase #Gemini #Replicate #Slack #WorkflowAutomation
by Jaruphat J.
⚠️ Note: This template requires a community node and works only on self-hosted n8n installations. It uses the Typhoon OCR Python package, pdfseparate from poppler-utils, and custom command execution. Make sure to install all required dependencies locally.

Who is this for?

This template is designed for developers, back-office teams, and automation builders (especially in Thailand or Thai-speaking environments) who need to process multi-file, multi-page Thai PDFs and automatically export structured results to Google Sheets. It is ideal for:

- Government and enterprise document processing
- Thai-language invoices, memos, and official letters
- AI-powered automation pipelines that require Thai OCR

What problem does this solve?

Typhoon OCR is one of the most accurate OCR tools for Thai text, but integrating it into an end-to-end workflow usually requires manual scripting and handling of multi-page PDFs. This template solves that by:

- Splitting PDFs into individual pages
- Running Typhoon OCR on each page
- Aggregating text back into a single file
- Using AI to extract structured fields
- Automatically saving structured data into Google Sheets

What this workflow does

- **Trigger:** Manual execution or any n8n trigger node
- **Load Files:** Read PDFs from a local doc/multipage folder
- **Split PDF Pages:** Use pdfinfo and pdfseparate to break PDFs into pages
- **Typhoon OCR:** Run OCR on each page via Execute Command
- **Aggregate:** Combine per-page OCR text
- **LLM Extraction:** Use AI (e.g., GPT-4, OpenRouter) to extract fields into JSON
- **Parse JSON:** Convert structured JSON into a tabular format
- **Google Sheets:** Append one row per file into a Google Sheet
- **Cleanup:** Delete temp split pages and move processed PDFs into a Completed folder

Setup

1. Install requirements:
   - Python 3.10+
   - typhoon-ocr: pip install typhoon-ocr
   - poppler-utils: provides pdfinfo and pdfseparate
   - qpdf: backup page counting
2. Create folders:
   - /doc/multipage for incoming files
   - /doc/tmp for split pages
   - /doc/multipage/Completed for processed files
3. Google Sheet: Create a Google Sheet with column headers like: book_id | date | subject | to | attach | detail | signed_by | signed_by2 | contact_phone | contact_email | contact_fax | download_url
4. API keys: Export your TYPHOON_OCR_API_KEY and OPENAI_API_KEY (or use credentials in n8n)

How to customize this workflow

- Replace the LLM provider in the "Structure Text to JSON with LLM" node (supports OpenRouter, OpenAI, etc.)
- Adjust the JSON schema and parsing logic to match your documents
- Update the Google Sheets mapping to fit your desired fields
- Add trigger nodes (Dropbox, Google Drive, Webhook) to automate file ingestion

About Typhoon OCR

Typhoon is a multilingual LLM and NLP toolkit optimized for Thai. It includes typhoon-ocr, a Python OCR package designed for Thai-centric documents. It is open source, highly accurate, and works well in automation pipelines. Perfect for government paperwork, PDF reports, and multi-language documents in Southeast Asia.

Deployment Option

You can also deploy this workflow easily using the Docker image provided in my GitHub repository: https://github.com/Jaruphat/n8n-ffmpeg-typhoon-ollama This Docker setup already includes n8n, ffmpeg, Typhoon OCR, and Ollama combined, so you can run the whole environment without installing each dependency manually.
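The "Split PDF Pages" step combines pdfinfo (to count pages) and pdfseparate (to write one file per page). A rough Python equivalent of the Execute Command logic is sketched below; the paths follow the folders created in Setup, and the helper names are mine, not the workflow's.

```python
import subprocess
from pathlib import Path

def page_count(pdfinfo_output: str) -> int:
    """Parse the 'Pages:' line from `pdfinfo` output."""
    for line in pdfinfo_output.splitlines():
        if line.startswith("Pages:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no 'Pages:' line found in pdfinfo output")

def split_pdf(pdf_path: str, tmp_dir: str = "/doc/tmp") -> list[str]:
    """Split a PDF into per-page files with poppler-utils; return their paths."""
    info = subprocess.run(["pdfinfo", pdf_path], capture_output=True,
                          text=True, check=True)
    pages = page_count(info.stdout)
    stem = Path(pdf_path).stem
    # pdfseparate expands %d to the page number in the output pattern.
    subprocess.run(["pdfseparate", pdf_path, f"{tmp_dir}/{stem}-%d.pdf"],
                   check=True)
    return [f"{tmp_dir}/{stem}-{i}.pdf" for i in range(1, pages + 1)]
```

Each returned page file is then fed to Typhoon OCR, and the per-page text is aggregated before the LLM extraction step.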