by Ranjan Dailata
This workflow automates company research and intelligence extraction from Glassdoor, using the Decodo API for data retrieval and Google Gemini for AI-powered summarization.

## Who this is for

This workflow is ideal for:
- Recruiters, analysts, and market researchers looking for structured insights from company profiles.
- HR tech developers and AI research teams needing a reliable way to extract and summarize Glassdoor data automatically.
- Venture analysts or due diligence teams conducting company research that combines structured and unstructured content.
- Anyone who wants instant summaries and insights from Glassdoor company pages without manual scraping.

## What problem this workflow solves

- **Manual Data Extraction**: Glassdoor company details and reviews are often scattered and inconsistent, requiring time-consuming copy-paste efforts.
- **Unstructured Insights**: Raw reviews contain valuable opinions but are not organized for analytical use.
- **Fragmented Company Data**: Key metrics like ratings, pros/cons, and FAQs are mixed with irrelevant data.
- **Need for AI Summarization**: Business users need a concise, executive-level summary that combines employee sentiment, culture, and overall performance metrics.

This workflow automates data mining, summarization, and structuring, transforming Glassdoor data into ready-to-use JSON and Markdown summaries.

## What this workflow does

The workflow automates the end-to-end pipeline for Glassdoor company research:

1. **Trigger**: Start manually by clicking "Execute Workflow."
2. **Set Input Fields**: Define `company_url` (e.g., a Glassdoor company profile link) and `geo` (country).
3. **Extract Raw Data from Glassdoor (Decodo Node)**: Uses the Decodo API to fetch company data, including overview, ratings, reviews, and frequently asked questions.
4. **Generate Structured Data (Google Gemini + Output Parser)**: The Structured Data Extractor node (powered by Gemini AI) processes raw data into well-defined fields:
   - Company overview (name, size, website, type)
   - Ratings breakdown
   - Review snippets (pros, cons, roles)
   - FAQs
   - Key takeaways
5. **Summarize the Insights (Gemini AI Summarizer)**: Produces a detailed summary highlighting:
   - Company reputation
   - Work culture
   - Employee sentiment trends
   - Strengths and weaknesses
   - Hiring recommendations
6. **Merge and Format**: Combines structured data and summary into a unified object for output.
7. **Export and Save**: Converts the final report into JSON and writes it to disk as `C:\{{CompanyName}}.json`.
8. **Binary Encoding for File Handling**: Prepares data in base64 for easy integration with APIs or downloadable reports.

## Setup

### Prerequisites

- **n8n instance** (cloud or self-hosted)
- **Decodo API credentials** (added as `decodoApi`)
- **Google Gemini (PaLM) API credentials**
- Access to the Glassdoor company URLs
- Make sure to install the Decodo Community Node.

### Steps

1. Import this workflow JSON file into your n8n instance.
2. Configure your credentials for:
   - Decodo API
   - Google Gemini (PaLM) API
3. Open the Set the Input Fields node and replace:
   - `company_url` → the Glassdoor URL
   - `geo` → the region (e.g., India, US, etc.)
4. Execute the workflow.
5. Check your output folder (`C:\`) for the exported JSON report.

## How to Customize This Workflow

You can easily adapt this template to your needs:

- **Add Sentiment Analysis**: Include another Gemini or OpenAI node to rate sentiment (positive/negative/neutral) per review.
- **Export to Notion or Google Sheets**: Replace the file node with a Notion or Sheets integration for live dashboarding.
- **Multi-Company Batch Mode**: Convert the manual trigger to a spreadsheet or webhook trigger for bulk research automation.
- **Add Visualization Layer**: Connect the output to Looker Studio or Power BI for analytical dashboards.
- **Change Output Format**: Modify the final write node to generate Markdown or PDF summaries using a tool such as pypandoc or reportlab.

## Summary

This n8n workflow combines Decodo web scraping with Google Gemini's reasoning and summarization power to build a fully automated Glassdoor Research Engine. With a single execution, it:

- Extracts structured company details
- Summarizes thousands of employee reviews
- Delivers insights in an easy-to-consume format

Ideal for:
- Recruitment intelligence
- Market research
- Employer branding
- Competitive HR analysis
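The listing does not include the code of the binary-encoding step, but a minimal sketch of how such an n8n Code node might prepare the final report for the file-writing node could look like this (the field names `companyName` and `report` are assumptions, not the template's actual fields):

```javascript
// Minimal sketch: turn the merged report into a base64-encoded binary
// item so a downstream node can write it to disk or send it to an API.
const item = $input.first().json;

const fileName = `${item.companyName || 'company'}.json`;          // illustrative field
const content = JSON.stringify(item.report ?? item, null, 2);       // fall back to the whole item

return [
  {
    json: { fileName },
    binary: {
      data: {
        data: Buffer.from(content, 'utf-8').toString('base64'),
        mimeType: 'application/json',
        fileName,
      },
    },
  },
];
```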
by Aadarsh Jain
# Document Analyzer and Q&A Workflow

AI-powered document and web page analysis using n8n and a GPT model. Ask questions about any local file or web URL and get intelligent, formatted answers.

## Who's it for

Perfect for researchers, developers, content analysts, students, and anyone who needs quick insights from documents or web pages without uploading files to external services.

## What it does

- **Analyzes local files**: PDF, Markdown, Text, JSON, YAML, Word docs
- **Fetches web content**: Documentation sites, blogs, articles
- **Answers questions**: Using a GPT model with structured, well-formatted responses

Input format: `path_or_url | your_question`

Examples:
- `/Users/docs/readme.md | What are the installation steps?`
- `https://n8n.io | What is n8n?`

## Setup

1. Import the workflow into n8n
2. Add your OpenAI API key to credentials
3. Link the credential to the "OpenAI Document Analyzer" node
4. Activate the workflow
5. Start chatting!

## Customize

- Change AI model → Edit the "OpenAI Document Analyzer" node (switch to gpt-4o-mini for cost savings)
- Adjust content length → Modify `maxLength` in the "Process Document Content" node (default: 15000 chars)
- Add file types → Update the `supportedTypes` array in the "Parse Document & Question" node
- Increase timeout → Change the timeout value in the "Fetch Web Content" node (default: 30s)
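For orientation, here is a minimal sketch of the kind of parsing a node like "Parse Document & Question" performs on the `path_or_url | question` input. The field name `chatInput` and the output shape are assumptions; the template's actual node may differ.

```javascript
// Sketch: split "path_or_url | your_question" into a source and a question.
const raw = $input.first().json.chatInput ?? '';

// Split on the first pipe only, so questions may contain "|" themselves.
const separatorIndex = raw.indexOf('|');
if (separatorIndex === -1) {
  throw new Error('Expected input in the form: path_or_url | your_question');
}

const source = raw.slice(0, separatorIndex).trim();
const question = raw.slice(separatorIndex + 1).trim();
const isUrl = /^https?:\/\//i.test(source);   // decides file read vs. web fetch

return [{ json: { source, question, isUrl } }];
```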
by Parag Javale
The AI Blog Creator with Gemini, Replicate Image, Supabase Publishing & Slack is a fully automated content generation and publishing workflow designed for modern marketing and SaaS teams. It automatically fetches the latest industry trends, generates SEO-optimized blogs using AI, creates a relevant featured image, publishes the post to your CMS (e.g., Supabase or a custom API), and notifies your team via Slack, all on a daily schedule.

This workflow connects multiple services (NewsAPI, Google Gemini, Replicate, Supabase, and Slack) into one intelligent content pipeline that runs hands-free once set up.

## ✨ Features

- 📰 **Fetch Trending Topics** — pulls the latest news or updates from your selected industry (via NewsAPI).
- 🤖 **AI Topic Generation** — Gemini suggests trending blog topics relevant to AI, SaaS, and Automation.
- 📝 **AI Blog Authoring** — Gemini then writes a full 1200–1500 word SEO-optimized article in Markdown.
- 🧹 **Smart JSON Cleaner** — a resilient Code node parses Gemini's output and ensures clean, structured data.
- 🖼️ **Auto-Generated Image** — Replicate's Ideogram model creates a blog cover image based on the content prompt.
- 🌐 **Automatic Publishing** — posts are automatically published to your Supabase or custom backend.
- 💬 **Slack Notification** — notifies your team with blog details and the live URL.
- ⏰ **Fully Scheduled** — runs automatically every day at your preferred time (default 10 AM IST).

## ⚙️ Workflow Structure

| Step | Node | Purpose |
| ---- | ----------------------------------- | ----------------------------------------------- |
| 1 | Schedule Trigger | Runs daily at 10 AM |
| 2 | Fetch Industry Trends (NewsAPI) | Retrieves trending articles |
| 3 | Message a model (Gemini) | Generates trending topic ideas |
| 4 | Message a model1 (Gemini) | Writes full SEO blog content |
| 5 | Code in JavaScript | Cleans, validates, and normalizes Gemini output |
| 6 | HTTP Request (Replicate) | Generates an image using Ideogram |
| 7 | HTTP Request1 | Retrieves generated image URL |
| 8 | Wait + If | Polls until image generation succeeds |
| 9 | Edit Fields | Assembles blog fields into final JSON |
| 10 | Publish to Supabase | Posts to your CMS |
| 11 | Slack Notification | Sends message to your Slack channel |

## 🔧 Setup Instructions

1. Import the workflow in n8n and enable it.
2. Create the following credentials:
   - NewsAPI (Query Auth) — from https://newsapi.org
   - Google Gemini (PaLM API) — use your Gemini API key
   - Replicate (Bearer Auth) — API key from https://replicate.com/account
   - Supabase (Header Auth) — endpoint to your /functions/v1/blog-api (set your key in the header)
   - Slack API — create a Slack app token with chat:write permission
3. Edit the NewsAPI URL query parameter to match your industry (e.g., q=AI automation SaaS).
4. Update the Supabase publish URL to your project endpoint if needed.
5. Adjust the Slack channel name under "Slack Notification".
6. (Optional) Change the Schedule Trigger time to match your timezone.

## 💡 Notes & Tips

- The Code in JavaScript node is robust against malformed or extra text in Gemini output — it sanitizes Markdown and reconstructs clean JSON safely.
- You can replace Supabase with any CMS or webhook endpoint by editing the "Publish to Supabase" node.
- The Replicate model used is ideogram-ai/ideogram-v3-turbo — you can swap it with Stable Diffusion or another model for different aesthetics.
- Use the slug field in your blog URLs for SEO-friendly links.
- Test with one manual execution before activating scheduled runs.
- If the Slack notification fails, verify the token scopes and channel permissions.
🧩 Tags #AI #Automation #ContentMarketing #BlogGenerator #n8n #Supabase #Gemini #Replicate #Slack #WorkflowAutomation
by Jaruphat J.
⚠️ Note: This template requires a community node and works only on self-hosted n8n installations. It uses the Typhoon OCR Python package, pdfseparate from poppler-utils, and custom command execution. Make sure to install all required dependencies locally.

## Who is this for?

This template is designed for developers, back-office teams, and automation builders (especially in Thailand or Thai-speaking environments) who need to process multi-file, multi-page Thai PDFs and automatically export structured results to Google Sheets.

It is ideal for:
- Government and enterprise document processing
- Thai-language invoices, memos, and official letters
- AI-powered automation pipelines that require Thai OCR

## What problem does this solve?

Typhoon OCR is one of the most accurate OCR tools for Thai text, but integrating it into an end-to-end workflow usually requires manual scripting and handling of multi-page PDFs. This template solves that by:
- Splitting PDFs into individual pages
- Running Typhoon OCR on each page
- Aggregating text back into a single file
- Using AI to extract structured fields
- Automatically saving structured data into Google Sheets

## What this workflow does

- **Trigger:** Manual execution or any n8n trigger node
- **Load Files:** Read PDFs from a local doc/multipage folder
- **Split PDF Pages:** Use pdfinfo and pdfseparate to break PDFs into pages
- **Typhoon OCR:** Run OCR on each page via Execute Command
- **Aggregate:** Combine per-page OCR text
- **LLM Extraction:** Use AI (e.g., GPT-4, OpenRouter) to extract fields into JSON
- **Parse JSON:** Convert structured JSON into a tabular format
- **Google Sheets:** Append one row per file into a Google Sheet
- **Cleanup:** Delete temp split pages and move processed PDFs into a Completed folder

## Setup

1. Install requirements:
   - Python 3.10+
   - typhoon-ocr: `pip install typhoon-ocr`
   - poppler-utils: provides pdfinfo and pdfseparate
   - qpdf: backup page counting
2. Create folders:
   - /doc/multipage for incoming files
   - /doc/tmp for split pages
   - /doc/multipage/Completed for processed files
3. Google Sheet: create a Google Sheet with column headers like:
   book_id | date | subject | to | attach | detail | signed_by | signed_by2 | contact_phone | contact_email | contact_fax | download_url
4. API keys: export your TYPHOON_OCR_API_KEY and OPENAI_API_KEY (or use credentials in n8n)

## How to customize this workflow

- Replace the LLM provider in the "Structure Text to JSON with LLM" node (supports OpenRouter, OpenAI, etc.)
- Adjust the JSON schema and parsing logic to match your documents
- Update the Google Sheets mapping to fit your desired fields
- Add trigger nodes (Dropbox, Google Drive, Webhook) to automate file ingestion

## About Typhoon OCR

Typhoon is a multilingual LLM and NLP toolkit optimized for Thai. It includes typhoon-ocr, a Python OCR package designed for Thai-centric documents. It is open-source, highly accurate, and works well in automation pipelines. Perfect for government paperwork, PDF reports, and multi-language documents in Southeast Asia.

## Deployment Option

You can also deploy this workflow easily using the Docker image provided in my GitHub repository:
https://github.com/Jaruphat/n8n-ffmpeg-typhoon-ollama

This Docker setup already includes n8n, ffmpeg, Typhoon OCR, and Ollama, so you can run the whole environment without installing each dependency manually.
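To illustrate the page-splitting step, here is a hedged sketch of a Code node that turns a `pdfinfo` result into one item per page, each carrying the `pdfseparate` command a later Execute Command node could run. The template itself configures Execute Command nodes directly; the field names (`filePath`, `stdout`) and paths here are assumptions.

```javascript
// Sketch: derive the page count from `pdfinfo` output and emit one
// work item per page with a ready-to-run pdfseparate command.
const pdfPath = $input.first().json.filePath;          // e.g. /doc/multipage/letter.pdf (illustrative)
const pdfinfoOutput = $input.first().json.stdout ?? ''; // stdout of `pdfinfo <file>`

const match = pdfinfoOutput.match(/^Pages:\s+(\d+)/m);
const pageCount = match ? parseInt(match[1], 10) : 0;

return Array.from({ length: pageCount }, (_, i) => {
  const page = i + 1;
  const pagePath = `/doc/tmp/page-${page}.pdf`;
  return {
    json: {
      page,
      pagePath,
      // poppler-utils: extract exactly one page into the tmp folder;
      // the Typhoon OCR call itself is handled by the next Execute Command node.
      splitCommand: `pdfseparate -f ${page} -l ${page} "${pdfPath}" "${pagePath}"`,
    },
  };
});
```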
by Gegenfeld
# AI Background Removal Workflow

This workflow automatically removes backgrounds from images stored in Airtable using the APImage API 🡥, then downloads and saves the processed images to Google Drive. Perfect for batch processing product photos, portraits, or any images that need clean, transparent backgrounds. The source (Airtable) and the storage (Google Drive) can be changed to any service or database you want/use.

## 🧩 Nodes Overview

### 1. Remove Background (Manual Trigger)

This manual trigger starts the background removal process when clicked.

Customization options:
- Replace with a Schedule Trigger for automatic daily/weekly processing
- Replace with a Webhook Trigger to start via API calls
- Replace with a File Trigger to process when new files are added

### 2. Get a Record (Airtable)

Retrieves media files from your Airtable "Creatives Library" database.
- Connects to the "Media Files" table in your Airtable base
- Fetches records containing image thumbnails for processing
- Returns all matching records with their thumbnail URLs and metadata

Required Airtable structure:
- Table with an image/attachment field (currently expects a "Thumbnail" field)
- Optional fields: File Name, Media Type, Upload Date, File Size

Customization options:
- Replace with Google Sheets, Notion, or any database node
- Add filters to process only specific records
- Change to different tables with image URLs

### 3. Code (JavaScript Processing)

Processes Airtable records and prepares thumbnail data for background removal.
- Extracts thumbnail URLs from each record
- Chooses the best-quality thumbnail (large > full > original)
- Creates clean filenames by removing special characters
- Adds processing metadata and timestamps

Key features:

```javascript
// Selects best thumbnail quality
if (thumbnail.thumbnails?.large?.url) {
  thumbnailUrl = thumbnail.thumbnails.large.url;
}

// Creates clean filename
cleanFileName: (record.fields['File Name'] || 'unknown')
  .replace(/[^a-z0-9]+/gi, '_')
  .toLowerCase()
```

Easy customization for different databases:
- **Product database**: Change field mappings to 'Product Name', 'SKU', 'Category'
- **Portfolio database**: Use 'Project Name', 'Client', 'Tags'
- **Employee database**: Use 'Full Name', 'Department', 'Position'

### 4. Split Out

Converts the array of thumbnails into individual items for parallel processing.
- Enables processing multiple images simultaneously
- Each item contains all thumbnail metadata for downstream nodes

### 5. APImage API (HTTP Request)

Calls the APImage service to remove backgrounds from images.

API endpoint: `POST https://apimage.org/api/ai-remove-background`

Request configuration:
- **Header**: `Authorization: Bearer YOUR_API_KEY`
- **Body**: `image_url: {{ $json.originalThumbnailUrl }}`

✅ Setup required:
- Replace YOUR_API_KEY with your actual API key
- Get your key from the APImage Dashboard 🡥

### 6. Download (HTTP Request)

Downloads the processed image from APImage's servers using the returned URL.
- Fetches the background-removed image file
- Prepares image data for upload to storage

### 7. Upload File (Google Drive)

Saves processed images to your Google Drive in a "bg_removal" folder.

Customization options:
- Replace with Dropbox, OneDrive, AWS S3, or FTP upload
- Create date-based folder structures
- Use dynamic filenames with metadata
- Upload to multiple destinations simultaneously

## ✨ How To Get Started

1. Set up the APImage API:
   - Double-click the APImage API node
   - Replace YOUR_API_KEY with your actual API key
   - Keep the Bearer prefix
2. Configure Airtable:
   - Ensure your Airtable has a table with image attachments
   - Update field names in the Code node if they differ from the defaults
3. Test the workflow:
   - Click the Remove Background trigger node
   - Verify images are processed and uploaded successfully

🔗 Get your API Key 🡥

## 🔧 How to Customize

### Input customization (left section)

Replace the Airtable integration with any data source containing image URLs:
- **Google Sheets** with product catalogs
- **Notion** databases with image galleries
- **Webhooks** from external systems
- **File system** monitoring for new uploads
- **Database** queries for image records

### Output customization (right section)

Modify where processed images are stored:
- **Multiple storage**: Upload to Google Drive + Dropbox simultaneously
- **Database updates**: Update original records with processed image URLs
- **Email/Slack**: Send processed images via communication tools
- **Website integration**: Upload directly to WordPress, Shopify, etc.

### Processing customization

- **Batch processing**: Limit concurrent API calls
- **Quality control**: Add image validation before/after processing
- **Format conversion**: Use a Sharp node for resizing or format changes
- **Metadata preservation**: Extract and maintain EXIF data

## 📋 Workflow Connections

Remove Background → Get a Record → Code → Split Out → APImage API → Download → Upload File

## 🎯 Perfect For

- **E-commerce**: Batch process product photos for clean, professional listings
- **Marketing teams**: Remove backgrounds from brand assets and imagery
- **Photographers**: Automate background removal for portrait sessions
- **Content creators**: Prepare images for presentations and social media
- **Design agencies**: Streamline asset preparation workflows

## 📚 Resources

- APImage API Documentation 🡥
- Airtable API Reference 🡥
- n8n Documentation 🡥

⚡ Processing speed: handles multiple images in parallel for fast batch processing
🔒 Secure: API keys stored safely in n8n credentials
🔄 Reliable: built-in error handling and retry mechanisms
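For clarity, here is a hedged sketch of the call the APImage HTTP Request node makes, written as plain JavaScript. The endpoint, header, and `image_url` field come from the node description above; the name of the property in the response (the URL the Download node fetches) is an assumption.

```javascript
// Sketch of the background-removal request performed by the HTTP Request node.
const API_KEY = 'YOUR_API_KEY'; // keep the Bearer prefix in the header, not in the key

async function removeBackground(imageUrl) {
  const response = await fetch('https://apimage.org/api/ai-remove-background', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ image_url: imageUrl }),
  });

  if (!response.ok) {
    throw new Error(`APImage request failed: ${response.status}`);
  }

  // The Download node then fetches the processed image from the URL
  // returned here; the exact property name may differ in the real response.
  const result = await response.json();
  return result.image_url ?? result.url;
}
```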
by Neeraj Chouhan
Good to know: This workflow creates a WhatsApp chatbot that answers questions using your own PDFs through RAG (Retrieval-Augmented Generation). Every time you upload a document to Google Drive, it is processed into embeddings and stored in Pinecone—allowing the bot to respond with accurate, context-aware answers directly on WhatsApp. Who is this for? Anyone building a custom WhatsApp chatbot. Businesses wanting a private knowledge based assistant Teams that want their documents to be searchable via chat Creators/coaches who want automated Q&A from their PDFs Developers who want a no-code RAG pipeline using n8n What problem is this workflow solving? What this workflow does: ✅ Monitors a Google Drive folder for new PDFs ✅ Extracts and splits text into chunks ✅ Generates embeddings using OpenAI/Gemini ✅ Stores embeddings in a Pinecone vector index ✅ Receives user questions via WhatsApp ✅ Retrieves the most relevant info using vector search ✅ Generates a natural response using an AI Agent ✅ Sends the answer back to the user on WhatsApp How it works: 1️⃣ Google Drive Trigger detects a new or updated PDF 2️⃣ File is downloaded and its text is split into chunks 3️⃣ Embeddings are generated and stored in Pinecone 4️⃣ WhatsApp Trigger receives a user’s question 5️⃣ The question is embedded and matched with Pinecone 6️⃣ AI Agent uses retrieved context to generate a response 7️⃣ The message is delivered back to the user on WhatsApp How to use: Connect your Google Drive account Add your Pinecone API key and index name Add your OpenAI/Gemini API key Connect your WhatsApp trigger + sender nodes Upload a sample PDF to your Drive folder Send a test WhatsApp message to see the bot reply Requirements: ✅ n8n cloud or self-hosted ✅ Google Drive account ✅ Pinecone vector database ✅ OpenAI or Gemini API key ✅ WhatsApp integration (Cloud API or provider) Customizing this workflow: 🟢 Change the Drive folder or add file-type filters 🟢 Adjust chunk size or embedding model 🟢 Modify the AI prompt for tone, style, or restrictions 🟢 Add memory, logging, or analytics 🟢 Add multiple documents or delete old vector entries 🟢 Swap the AI model (OpenAI ↔ Gemini ↔ Groq, etc.)
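As an illustration of the "split text into chunks" step, here is a minimal sketch of a fixed-size splitter with overlap written for an n8n Code node. The template may instead use a built-in text-splitter node; the chunk size, overlap, and field names (`text`, `fileName`) are assumptions.

```javascript
// Sketch: split extracted PDF text into overlapping chunks, one item per chunk.
function splitIntoChunks(text, chunkSize = 1000, overlap = 200) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break;
    start = end - overlap; // step back so neighbouring chunks share context
  }
  return chunks;
}

const documentText = $input.first().json.text ?? '';
const fileName = $input.first().json.fileName;

return splitIntoChunks(documentText).map((chunk, index) => ({
  json: { chunk, index, source: fileName },
}));
```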
by Toshiki Hirao
Managing invoices manually can be time-consuming and error-prone. This workflow automates the process by extracting key invoice details from PDFs shared in Slack, structuring the information with AI, saving it to Google Sheets, and sending a confirmation back to Slack. It's a seamless way to keep your financial records organized without manual data entry.

## How it works

1. **Receive invoice in Slack**: When a PDF invoice is uploaded to a designated Slack channel, the workflow is triggered.
2. **Fetch the PDF**: The file is downloaded automatically for processing.
3. **Extract data from PDF**: Basic text extraction is performed to capture the invoice content.
4. **AI-powered invoice parsing**: An AI model interprets the extracted text and structures essential fields such as company name, invoice number, total amount, invoice date, and due date.
5. **Save to Google Sheets**: The structured invoice data is appended as a new row in a Google Sheet for easy tracking and reporting.
6. **Slack confirmation**: A summary of the saved invoice details is sent back to Slack to notify the team.

## How to use

1. Import the workflow into your n8n instance.
2. **Connect Slack**: Authenticate your Slack account and set up the trigger channel where invoices will be uploaded.
3. **Connect Google Sheets**: Authenticate with Google Sheets and specify the target spreadsheet and sheet name.
4. **Configure the AI extraction**: Adjust the parsing prompt or output structure to fit your preferred data fields (e.g., vendor name, invoice ID, amount, dates); see the sketch below for one possible output structure.
5. **Test the workflow**: Upload a sample invoice PDF in Slack and verify that the data is correctly extracted and saved to Google Sheets.

## Requirements

- An n8n instance (cloud)
- Slack account with permission to read uploaded files and post messages
- Google account with access to the spreadsheet you want to update
- AI integration (e.g., OpenAI GPT or another LLM with PDF parsing capabilities)
- A designated Slack channel for receiving invoice PDFs
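For reference, a minimal sketch of the structured output the AI parsing step might be asked to produce. The field names and sample values are illustrative assumptions; map them to your Google Sheet columns.

```javascript
// Illustrative target schema for the AI invoice-parsing step,
// returned as an n8n item for the Google Sheets append node.
const exampleParsedInvoice = {
  companyName: 'Acme Corp',
  invoiceNumber: 'INV-2024-0042',
  totalAmount: 1250.0,
  currency: 'USD',
  invoiceDate: '2024-05-01',
  dueDate: '2024-05-31',
};

return [{ json: exampleParsedInvoice }];
```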
by Avkash Kakdiya
## How it works

This workflow starts whenever a new domain is added to a Google Sheet. It cleans the domain, fetches traffic insights from SimilarWeb, extracts the most relevant metrics, and updates the sheet with enriched data. Optionally, it can also send this information to Airtable for further tracking or analysis.

## Step-by-step

1. **Trigger on New Domain**
   - The workflow starts when a new row is added in the Google Sheet.
   - Captures the raw URL/domain entered by the user.
2. **Clean Domain URL**
   - Strips unnecessary parts like http://, https://, www., and trailing slashes.
   - Stores a clean domain format (e.g., example.com) along with the row number.
3. **Fetch Website Analysis**
   - Uses the SimilarWeb API to pull traffic and engagement insights for the domain.
   - Data includes global rank, country rank, category rank, total visits, bounce rate, and more.
4. **Extract Key Metrics**
   - Processes raw SimilarWeb data into a simplified structure. Extracted insights include:
     - Ranks: Global, Country, and Category
     - Traffic overview: Total Visits, Bounce Rate, Pages per Visit, Avg Visit Duration
     - Top traffic sources: Direct, Search, Social
     - Top countries (top 3): with traffic share percentages
     - Device split: Mobile vs Desktop
5. **Update Google Sheet**
   - Writes the cleaned and enriched domain data back into the same (or another) Google Sheet.
   - Ensures each row is updated with the new traffic insights.
6. **Export to Airtable (Optional)**
   - Creates a new record in Airtable with the enriched traffic metrics.
   - Useful if you want to manage or visualize company/domain data outside of Google Sheets.

## Why use this?

- Automatically enriches domain lists with live traffic data from SimilarWeb.
- Cleans messy URLs into a standard format.
- Saves hours of manual research on company traffic insights.
- Provides structured, comparable metrics for better decision-making.
- Flexible: update sheets, export to Airtable, or both.
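To make the "Clean Domain URL" step concrete, here is a minimal sketch of what that logic looks like in an n8n Code node. The field names (`website`, `rowNumber`) are assumptions; the template's actual node may use different ones.

```javascript
// Sketch: normalize a raw URL into a bare domain (e.g. "example.com").
const item = $input.first().json;
const raw = String(item.website ?? '').trim();

const cleanDomain = raw
  .replace(/^https?:\/\//i, '') // drop the protocol
  .replace(/^www\./i, '')       // drop the www. prefix
  .replace(/\/.*$/, '')         // drop any path or trailing slash
  .toLowerCase();

return [{ json: { cleanDomain, rowNumber: item.rowNumber } }];
```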
by Takuya Ojima
## Who's it for

Teams that monitor multiple news sources and want an automated, tagged, and prioritized briefing: PMM, PR/Comms, Sales/CS, founders, and research ops.

## What it does / How it works

Each morning the workflow reads your RSS feeds, summarizes articles with an LLM, assigns tags from a maintained dictionary, saves structured records to Notion, and posts a concise Slack digest of the top items.

Core steps: Daily Morning Trigger → Workflow Configuration (Set) → Read RSS Feeds → Get Tag Dictionary → AI Summarizer and Tagger → Parse AI Output → Write to Notion Database → Sort by Priority → Top 3 Headlines → Format Slack Message → Post to Slack.

## How to set up

1. Open Workflow Configuration (Set) and edit: rssFeeds (array of URLs), notionDatabaseId, slackChannel.
2. Connect your own credentials in n8n for Notion, Slack, Google Sheets (if used for the tag dictionary), and your LLM provider.
3. Adjust the trigger time in Daily Morning Trigger (e.g., weekdays at 09:00).

## Requirements

- n8n (Cloud or self-hosted)
- Slack app with chat:write to the target channel
- Notion database with properties: summary (rich_text), tags (multi_select), priority (number), url (url), publishedDate (date)
- Optional Google Sheet for the tag dictionary (or replace with another source)

## How to customize the workflow

- **Scoring & selection**: Change the priority rules, increase "Top N" items, or sort by recency (see the sketch after this list).
- **Taxonomy**: Extend the tag dictionary; refine the AI prompt for stricter tagging.
- **Outputs**: Post per-tag Slack threads, send DMs, or create Notion relations to initiatives.
- **Sources**: Add more feeds or mix in APIs/newsletters.
- **Security**: Do not hardcode API keys in HTTP nodes; keep credentials in n8n.
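A minimal sketch of the sort-and-select logic behind "Sort by Priority" and "Top 3 Headlines", assuming each item carries the numeric `priority` property defined in the Notion schema above (the template's actual nodes may implement this differently):

```javascript
// Sketch: rank items by priority (highest first) and keep the top N.
const topCount = 3; // increase this for a longer digest

const ranked = $input
  .all()
  .sort((a, b) => (b.json.priority ?? 0) - (a.json.priority ?? 0));

return ranked.slice(0, topCount);
```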
by Paolo Ronco
# Amazon Luna Prime Games Catalog Tracker (Auto-Sync to Google Sheets)

Automatically fetch, organize, and maintain an updated catalog of Amazon Luna – Included with Prime games. This workflow regularly queries Amazon's official Luna endpoint, extracts complete metadata, and syncs everything into Google Sheets without duplicates.

Ideal for:
- tracking monthly Prime Luna rotations
- keeping a personal archive of games
- monitoring new games appearing on Amazon Games / Prime Gaming, so you can instantly play titles you're interested in
- building dashboards or gaming databases
- powering notification systems (Discord, Telegram, email, etc.)

## Overview

Amazon Luna's "Included with Prime" lineup changes frequently, with new games added and old ones removed. Instead of checking manually, this n8n template fully automates the process:
- Fetches the latest list from Amazon's backend
- Extracts detailed metadata from the response
- Syncs the data into Google Sheets
- Avoids duplicates by updating existing rows
- Supports all major Amazon regions

Once configured, it runs automatically, keeping your game catalog correct, clean, and always up to date.

## 🛠️ How the workflow works

### 1. Scheduled Trigger

Starts the workflow on a set schedule (default: every 5 days at 3:00 PM). You can change both the frequency and the time freely.

### 2. HTTP Request to Amazon Luna

Calls Amazon Luna's regional endpoint and retrieves the full "Included with Prime" catalog.

### 3. JavaScript Code Node – Data Extraction

Parses the JSON response and extracts structured fields:
- Title
- Genres
- Release Year
- ASIN
- Image URLs
- Additional metadata

The result is a clean, ready-to-use dataset.

### 4. Google Sheets – Insert or Update Rows

Each game is written into the selected Google Sheet:
- Existing games get updated
- New games are appended

The Title acts as the unique identifier to prevent duplicates.

## ⚙️ Configuration Parameters

| Parameter | Description | Recommended values |
| --- | --- | --- |
| x-amz-locale | Language + region | it_IT 🇮🇹 · en_US 🇺🇸 · de_DE 🇩🇪 · fr_FR 🇫🇷 · es_ES 🇪🇸 · en_GB 🇬🇧 · ja_JP 🇯🇵 · en_CA 🇨🇦 |
| x-amz-marketplace-id | Marketplace backend ID | APJ6JRA9NG5V4 🇮🇹 · ATVPDKIKX0DER 🇺🇸 · A1PA6795UKMFR9 🇩🇪 · A13V1IB3VIYZZH 🇫🇷 · A1RKKUPIHCS9HS 🇪🇸 · A1F83G8C2ARO7P 🇬🇧 · A1VC38T7YXB528 🇯🇵 · A2EUQ1WTGCTBG2 🇨🇦 |
| Accept-Language | Response language | Example: it-IT,it;q=0.9,en;q=0.8 |
| User-Agent | Browser-like request | Default or updated UA |
| Trigger interval | Refresh frequency | Every 5 days at 3:00 PM (modifiable) |
| Google Sheet | Storage output | Select your file + sheet |

You can adapt these headers to fetch data from any supported country.

## 💡 Tips & Customization

### 🌍 Regional catalogs

Duplicate the HTTP Request + Code + Sheet block to track multiple countries (US, DE, JP, UK…).

### 🧹 No duplicates

The workflow updates rows intelligently, ensuring a clean catalog even after many runs.

### 🗂️ Move data anywhere

Send the output to:
- Airtable
- Databases (MySQL, Postgres, MongoDB…)
- Notion
- CSV
- REST APIs
- BI dashboards

### 🔔 Add notifications (Discord, Telegram, Email, etc.)

You can pair this template with a notification workflow. When used with Discord, the notification message can include:
- the game title
- a description or metadata
- **the game's image**, automatically downloaded and attached

This makes notifications visually informative and perfect for tracking new Prime titles.

## 🔒 Important Notes

All retrieved data belongs to Amazon. The workflow is intended for personal, testing, or educational use only. Do not republish or redistribute collected data without permission.
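The exact shape of Amazon's response is not documented in this listing, so the snippet below is only a hedged illustration of what the extraction Code node does, assuming the catalog arrives as a list of item objects with title/genre/ASIN/image fields; the real field paths will differ and must be adjusted to the actual payload.

```javascript
// Hedged illustration of the data-extraction Code node.
// `body.items` and the per-item property names are assumptions,
// not the actual Amazon Luna response schema.
const body = $input.first().json;
const rawItems = body.items ?? [];

return rawItems.map((item) => ({
  json: {
    Title: item.title ?? '',
    Genres: Array.isArray(item.genres) ? item.genres.join(', ') : '',
    ReleaseYear: item.releaseYear ?? '',
    ASIN: item.asin ?? '',
    ImageUrl: item.imageUrl ?? '',
  },
}));
```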
by Fei Wu
# Reddit Post Saver & Summarizer with AI-Powered Filtering

## Who This Is For

Perfect for content curators, researchers, developers, and community managers who want to build a structured database of valuable Reddit content without manual data entry. If you're tracking industry trends, gathering user feedback, or building a knowledge base from Reddit discussions, this workflow automates the entire process.

## The Problem It Solves

Reddit has incredible discussions, but manually copying posts, extracting insights, and organizing them into a database is time-consuming. This workflow automatically transforms your saved Reddit posts into structured, searchable data, complete with AI-generated summaries of both the post and its comment section.

## How It Works

### 1. Save Posts Manually

Simply use Reddit's built-in save feature on any post you find valuable.

### 2. Automated Daily Processing

The workflow triggers once per day and:
- Fetches all your saved Reddit posts via the Reddit API
- Filters posts by subreddit and custom conditions (e.g., "only posts about JavaScript frameworks" or "posts with more than 100 upvotes")
- Uses an LLM (Google Gemini) to verify posts match your natural language criteria
- Generates comprehensive summaries of both the original post and top comments

### 3. Structured Database Storage

Filtered and summarized posts are automatically saved to your Supabase database with this structure:

```json
{
  "reddit_id": "unique post identifier",
  "title": "post title",
  "url": "direct link to Reddit post",
  "summary": "AI-generated summary of post and comments",
  "tags": ["array", "of", "relevant", "tags"],
  "post_date": "original post creation date",
  "upvotes": "number of upvotes",
  "num_comments": "total comment count"
}
```

## Setup Requirements

- **Reddit API credentials** (client ID and secret)
- **Supabase account** with a database table
- **Google Gemini API key** (or an alternative LLM provider)
- Basic configuration of filter conditions (subreddit names and natural language criteria)

## Use Cases

- **Product Research**: Track competitor mentions and feature requests
- **Content Creation**: Build a library of trending topics in your niche
- **Community Management**: Monitor feedback across multiple subreddits
- **Academic Research**: Collect and analyze discussions on specific topics
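As an illustration of the rule-based filter applied before the LLM check, a minimal sketch for an n8n Code node follows. The subreddit list and upvote threshold are illustrative assumptions; the template itself drives these conditions from your configuration.

```javascript
// Sketch: keep only saved posts from chosen subreddits above an upvote threshold.
const allowedSubreddits = ['javascript', 'webdev']; // illustrative
const minUpvotes = 100;                              // illustrative

return $input.all().filter((item) => {
  const post = item.json;
  return (
    allowedSubreddits.includes((post.subreddit || '').toLowerCase()) &&
    (post.ups ?? 0) >= minUpvotes
  );
});
```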
by DIGITAL BIZ TECH
# AI Product Catalog Chatbot with Google Drive Ingestion & Supabase RAG

## Overview

This workflow builds a dual system that connects automated document ingestion with a live product catalog chatbot powered by Mistral AI and Supabase. It includes:

- **Ingestion Pipeline:** Automatically fetches JSON files from Google Drive, processes their content, and stores vector embeddings in Supabase.
- **Chatbot:** An AI agent that queries the Supabase vector store (RAG) to answer user questions about the product catalog.

It uses Mistral AI for chat intelligence and embeddings, and Supabase for vector storage and semantic product search.

## Chatbot Flow

- **Trigger:** When chat message received, or Webhook (from a live website)
- **Model:** Mistral Cloud Chat Model (mistral-medium-latest)
- **Memory:** Simple Memory (Buffer Window) — keeps the last 15 messages for conversational context
- **Vector Search Tool:** Supabase Vector Store
- **Embeddings:** Mistral Cloud
- **Agent:** product catalog agent
  - Responds to user queries using the products table in Supabase.
  - Searches vectors for relevant items and returns structured product details (name, specs, images, and links).
  - Maintains chat session history for natural follow-up questions.

## Document → Knowledge Base Pipeline

Triggered manually (Execute workflow) to populate or refresh the Supabase vector store.

Steps:

1. **Google Drive (List Files)** → Fetch all files from the configured Google Drive folder.
2. **Loop Over Items** → For each file:
   - **Google Drive (Get File)** → Download the JSON document.
   - **Extract from File** → Parse and read the raw JSON content.
   - **Map Data into Fields (Set node)** → Clean and normalize JSON keys (e.g., page_title, comprehensive_summary, key_topics).
   - **Convert Data into Chunks (Code node)** → Merge text fields like summary and markdown, split content into overlapping 2,000-character chunks, and add metadata such as title, URL, and chunk index (a hedged sketch of this step appears at the end of this listing).
   - **Embeddings (Mistral Cloud)** → Generate vector embeddings for each text chunk.
   - **Insert into Supabase Vectorstore** → Save chunks + embeddings into the website_mark table.
   - **Wait** → Pause for 30 seconds before the next file to respect rate limits.

## Integrations Used

| Service | Purpose | Credential |
|----------|----------|------------|
| Google Drive | File source for catalog JSON documents | Google Drive account dbt |
| Mistral AI | Chat model & embeddings | Mistral Cloud account dbt |
| Supabase | Vector storage & RAG search | Supabase DB account dbt |
| Webhook / Chat | User-facing interface for chatbot | Website or Webhook |

## Sample JSON Data Format (for Ingestion)

The ingestion pipeline expects structured JSON product files, which can include different categories such as Apparel or Tools.

Apparel example (T-shirts):

```json
[
  {
    "Name": "Classic Crewneck T-Shirt",
    "Item Number": "A-TSH-NVY-M",
    "Image URL": "https://www.example.com/images/tshirt-navy.jpg",
    "Image Markdown": "",
    "Size Chart URL": "https://www.example.com/charts/tshirt-sizing",
    "Materials": "100% Pima Cotton",
    "Color": "Navy Blue",
    "Size": "M",
    "Fit": "Regular Fit",
    "Collection": "Core Essentials"
  }
]
```

Tools example (drill bits):

```json
[
  {
    "Name": "Titanium Drill Bit, 1/4\"",
    "Item Number": "T-DB-TIN-250",
    "Image URL": "https://www.example.com/images/drill-bit-1-4.jpg",
    "Image Markdown": "",
    "Spec Sheet URL": "https://www.example.com/specs/T-DB-TIN-250",
    "Materials": "HSS with Titanium Coating",
    "Type": "Twist Drill Bit",
    "Size (in)": "1/4",
    "Shank Type": "Hex",
    "Application": "Metal, Wood, Plastic"
  }
]
```

## Agent System Prompt Summary

> "You are an AI product catalog assistant. Use only the Supabase vector database as your knowledge base. Provide accurate, structured responses with clear formatting — including product names, attributes, and URLs. If data is unavailable, reply politely: 'I couldn't find that product in the catalog.'"

## Key Features

- Automated JSON ingestion from Google Drive → Supabase
- Intelligent text chunking and metadata mapping
- Dual-workflow architecture (Ingestion + Chatbot)
- Live conversational product search via RAG
- Supports both embedded chat and webhook channels

## Summary

> A powerful end-to-end workflow that transforms your product data into a searchable, AI-ready knowledge base, enabling real-time product Q&A through a Mistral-powered chatbot. Perfect for eCommerce teams, distributors, or B2B companies managing large product catalogs.

## Need Help or More Workflows?

Want to customize this workflow for your business or integrate it with your tools? Our team at Digital Biz Tech can tailor it precisely to your use case — from automation pipelines to AI-powered product discovery.

💡 We can help you set it up for free — from connecting credentials to deploying it live.

- Contact: shilpa.raju@digitalbiz.tech
- Website: https://www.digitalbiz.tech
- LinkedIn: https://www.linkedin.com/company/digital-biz-tech/

You can also DM us on LinkedIn for any help.
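For reference, here is the hedged sketch of the "Convert Data into Chunks" Code node mentioned in the pipeline steps above. The 2,000-character chunk size comes from the description; the overlap value and the field names (comprehensive_summary, markdown, page_title, url) are assumptions and may differ from the template's actual node.

```javascript
// Hedged sketch of the chunking step: merge text fields, split into
// overlapping 2,000-character chunks, and attach metadata per chunk.
const doc = $input.first().json;
const fullText = [doc.comprehensive_summary, doc.markdown].filter(Boolean).join('\n\n');

const CHUNK_SIZE = 2000;
const OVERLAP = 200; // illustrative overlap, not confirmed by the listing

const items = [];
for (let start = 0, index = 0; start < fullText.length; index++) {
  const chunk = fullText.slice(start, start + CHUNK_SIZE);
  items.push({
    json: {
      text: chunk,
      metadata: {
        title: doc.page_title,
        url: doc.url,
        chunkIndex: index,
      },
    },
  });
  if (start + CHUNK_SIZE >= fullText.length) break;
  start += CHUNK_SIZE - OVERLAP;
}

return items;
```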