by Paolo Ronco
**Amazon Luna Prime Games Catalog Tracker (Auto-Sync to Google Sheets)**

Automatically fetch, organize, and maintain an updated catalog of Amazon Luna – Included with Prime games. This workflow regularly queries Amazon’s official Luna endpoint, extracts complete metadata, and syncs everything into Google Sheets without duplicates.

Ideal for:

- tracking monthly Prime Luna rotations
- keeping a personal archive of games
- monitoring new games appearing on Amazon Games / Prime Gaming, so you can instantly play titles you’re interested in
- building dashboards or gaming databases
- powering notification systems (Discord, Telegram, email, etc.)

## Overview

Amazon Luna’s “Included with Prime” lineup changes frequently, with new games added and old ones removed. Instead of checking manually, this n8n template fully automates the process:

- Fetches the latest list from Amazon’s backend
- Extracts detailed metadata from the response
- Syncs the data into Google Sheets
- Avoids duplicates by updating existing rows
- Supports all major Amazon regions

Once configured, it runs automatically, keeping your game catalog correct, clean, and always up to date.

## 🛠️ How the workflow works

1. **Scheduled Trigger**: Starts the workflow on a set schedule (default: every 5 days at 3:00 PM). You can change both frequency and time freely.
2. **HTTP Request to Amazon Luna**: Calls Amazon Luna’s regional endpoint and retrieves the full “Included with Prime” catalog.
3. **JavaScript Code Node – Data Extraction**: Parses the JSON response and extracts structured fields: Title, Genres, Release Year, ASIN, Image URLs, and additional metadata. The result is a clean, ready-to-use dataset.
4. **Google Sheets – Insert or Update Rows**: Each game is written into the selected Google Sheet. Existing games get updated and new games are appended. The Title acts as the unique identifier to prevent duplicates.
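The insert-or-update behaviour of step 4 can be pictured with a small in-memory sketch. This is only an illustration of upserting by title, not the Google Sheets node's internals; `upsertByTitle` and the row fields other than `title` are hypothetical names chosen for the example.

```javascript
// Minimal sketch of "update existing rows, append new ones", keyed on title.
// ASSUMPTION: rows are plain objects with a `title` field; all other fields
// are illustrative and not taken from the actual workflow.
function upsertByTitle(existingRows, incomingGames) {
  // Map keeps insertion order, so existing rows stay in place when updated.
  const byTitle = new Map(existingRows.map((row) => [row.title, row]));
  let updated = 0;
  let appended = 0;

  for (const game of incomingGames) {
    if (byTitle.has(game.title)) {
      updated += 1;
    } else {
      appended += 1;
    }
    // Merge new values over any previous row with the same title.
    byTitle.set(game.title, { ...byTitle.get(game.title), ...game });
  }

  return { rows: [...byTitle.values()], updated, appended };
}

const result = upsertByTitle(
  [{ title: "Game A", releaseYear: 2020 }],
  [
    { title: "Game A", releaseYear: 2021 },
    { title: "Game B", releaseYear: 2022 },
  ]
);
console.log(result.updated, result.appended, result.rows.length);
```

Because the title is the key, two different games sharing a title would overwrite each other. The ASIN captured in step 3 might be a sturdier key if that ever becomes a problem.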
## ⚙️ Configuration Parameters

| Parameter | Description | Recommended values |
| --- | --- | --- |
| x-amz-locale | Language + region | it_IT 🇮🇹 · en_US 🇺🇸 · de_DE 🇩🇪 · fr_FR 🇫🇷 · es_ES 🇪🇸 · en_GB 🇬🇧 · ja_JP 🇯🇵 · en_CA 🇨🇦 |
| x-amz-marketplace-id | Marketplace backend ID | APJ6JRA9NG5V4 🇮🇹 · ATVPDKIKX0DER 🇺🇸 · A1PA6795UKMFR9 🇩🇪 · A13V1IB3VIYZZH 🇫🇷 · A1RKKUPIHCS9HS 🇪🇸 · A1F83G8C2ARO7P 🇬🇧 · A1VC38T7YXB528 🇯🇵 · A2EUQ1WTGCTBG2 🇨🇦 |
| Accept-Language | Response language | Example: it-IT,it;q=0.9,en;q=0.8 |
| User-Agent | Browser-like request | Default or updated UA |
| Trigger interval | Refresh frequency | Every 5 days at 3:00 PM (modifiable) |
| Google Sheet | Storage output | Select your file + sheet |

You can adapt these headers to fetch data from any supported country.

## 💡 Tips & Customization

- **🌍 Regional catalogs**: Duplicate the HTTP Request + Code + Sheet block to track multiple countries (US, DE, JP, UK…).
- **🧹 No duplicates**: The workflow updates rows intelligently, ensuring a clean catalog even after many runs.
- **🗂️ Move data anywhere**: Send the output to Airtable, databases (MySQL, Postgres, MongoDB…), Notion, CSV, REST APIs, or BI dashboards.
- **🔔 Add notifications (Discord, Telegram, Email, etc.)**: You can pair this template with a notification workflow. When used with Discord, the notification message can include the game title, a description or metadata, and **the game’s image**, automatically downloaded and attached. This makes notifications visually informative and well suited to tracking new Prime titles.

## 🔒 Important Notes

- All retrieved data belongs to Amazon.
- The workflow is intended for personal, testing, or educational use only.
- Do not republish or redistribute collected data without permission.
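When duplicating the request block per country, the locale and marketplace pairs from the configuration table can live in one lookup. The sketch below only builds the header object. `REGIONS` and `buildLunaHeaders` are hypothetical names, and deriving `Accept-Language` from the locale is an assumption modelled on the table's Italian example. The `User-Agent` is left to the HTTP Request node.

```javascript
// Locale / marketplace pairs copied from the configuration table.
const REGIONS = {
  IT: { locale: "it_IT", marketplaceId: "APJ6JRA9NG5V4" },
  US: { locale: "en_US", marketplaceId: "ATVPDKIKX0DER" },
  DE: { locale: "de_DE", marketplaceId: "A1PA6795UKMFR9" },
  FR: { locale: "fr_FR", marketplaceId: "A13V1IB3VIYZZH" },
  ES: { locale: "es_ES", marketplaceId: "A1RKKUPIHCS9HS" },
  GB: { locale: "en_GB", marketplaceId: "A1F83G8C2ARO7P" },
  JP: { locale: "ja_JP", marketplaceId: "A1VC38T7YXB528" },
  CA: { locale: "en_CA", marketplaceId: "A2EUQ1WTGCTBG2" },
};

// Hypothetical helper: returns the three region-dependent headers.
function buildLunaHeaders(country) {
  const region = REGIONS[country];
  if (!region) {
    throw new Error(`Unsupported country: ${country}`);
  }
  const [lang] = region.locale.split("_");
  const tag = region.locale.replace("_", "-");
  // ASSUMPTION: prefer the regional tag, then the bare language, then English.
  const preferences = [tag, `${lang};q=0.9`];
  if (lang !== "en") {
    preferences.push("en;q=0.8");
  }
  return {
    "x-amz-locale": region.locale,
    "x-amz-marketplace-id": region.marketplaceId,
    "Accept-Language": preferences.join(","),
  };
}

console.log(buildLunaHeaders("IT"));
```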
by Mychel Garzon
Find businesses with strong reputations but weak digital presence

You know the businesses that need your services, but finding them is the hard part. They have 150+ five-star reviews, customers raving about specific services, and zero way to book online. They exist, they're profitable, and they don't know they're leaving money on the table.

This workflow hunts them down automatically. Every Monday morning, it searches directory sites, scrapes websites, analyzes competitors, and generates a ranked list of high-opportunity leads with custom sales pitches already written. You get businesses to contact, not raw data to interpret.

This is not a "scrape Google Maps for emails" template. It finds the mismatch between reputation and digital capability, calculates exactly why each business is an opportunity, and writes the pitch hook based on what their own customers are saying. Your outreach becomes specific, not generic.

## How the Workflow Works

The workflow runs in eleven stages:

1. **Category signal generation**: AI analyzes your target category and location to determine seasonal peak months and localized conversion keywords (e.g., "varaa aika" for Finnish bookings). Language detection adapts the search terms automatically.
2. **Directory discovery**: Firecrawl searches Yelp, Google Maps, and TripAdvisor for businesses matching your criteria. It returns the top 8 directory profiles with ratings, reviews, and basic info.
3. **Business extraction**: AI parses the search results and selects exactly 3 high-potential leads based on review count, ratings, and signals of digital weakness (no website listed, basic directory presence only).
4. **Deduplication check**: Queries the n8n database to filter out businesses already processed in previous runs. Only new leads proceed, which avoids duplicate API costs.
5. **Website scraping (with rate limiting)**: Each new lead enters a processing loop with a 2-second delay. Firecrawl attempts to scrape their actual website. If the site exists, it captures the full markdown content plus a screenshot. If no site is found, the lead is marked as "missing website" and processing continues.
6. **Competitor discovery and analysis**: For each lead, searches for 3 nearby competitors in the same category, excluding directory sites. Scrapes competitor websites to compare digital sophistication (booking systems, mobile optimization, conversion paths).
7. **Review mining**: Searches for recent customer reviews mentioning the business name. Extracts snippets for sentiment analysis.
8. **AI service extraction**: Analyzes review text to identify specific services customers are praising (e.g., "best balayage," "amazing brunch," "professional brake repair"). These become the foundation for revenue leak detection.
9. **Mismatch calculation engine**: Scores each business on two axes. The reputation score (0-100) is based on review count and rating average. The digital score (0-100) is based on HTTPS presence, mobile viewport, copyright freshness, conversion path detection, and CMS quality. The mismatch score is the gap between the two: the higher the gap, the bigger the opportunity.
10. **Revenue leak detection and pitch generation**: Compares services praised in reviews against services mentioned on the website. If customers rave about Service X but the website never mentions it, the workflow flags a "revenue leak." It then generates a custom sales pitch based on the primary gap detected (revenue leak, missing conversion, seasonal timing, or competitor pressure).
11. **Aggregation and reporting**: Ranks all processed leads by opportunity score and saves the top leads to the n8n database. Generates a professional HTML report with dark theme design, gradient headers, progress bar visualizations, and embedded website screenshots. The report includes observed facts, heuristic assessments, and exact pitch hooks for each lead.

## Benefits

- **Finds businesses ready to buy, not just browsing**: The dual-scoring system means you're not contacting everyone with an old website. You're finding businesses where the reputation-to-digital gap creates urgency.
- **Revenue leak detection writes your pitch**: When the workflow finds "customers praise Service X but your website doesn't mention it," that becomes your opening line. Specific beats generic every time.
- **Stops wasted API calls with deduplication**: Every business name is checked against the existing database before processing. You never pay to analyze the same lead twice.
- **Professional client-ready reports**: The HTML output looks like something a market research firm would charge $500 for. Dark gradient design, progress bars, screenshot embeds, and color-coded risk levels.
- **Seasonal timing creates urgency**: The workflow calculates weeks until peak season for each category. "Your busy season starts in 8 weeks" hits different than "you should improve your website."
- **Competitor pressure backs up your claim**: Shows exactly how many nearby competitors have booking systems this business lacks. Social proof without naming names.

## Target Audience

- Web design agencies prospecting local businesses
- Digital marketing consultants finding clients with conversion gaps
- SEO specialists identifying visibility opportunities
- Business brokers locating digitally underperforming companies
- SaaS companies selling booking/scheduling tools to local services
- Freelance developers building websites for brick-and-mortar businesses

## Required APIs

- **Firecrawl API**: web scraping and directory search
- **Groq API**: LLM analysis
- **n8n Database**: built-in, no external service needed

## Easy Customization

- **Change target markets**: Open the Configuration node and modify two fields: location (any city/region) and category (any local business type). The AI adapts seasonal analysis and search terms automatically.
- **Adjust lead quantity**: The "Business extraction" LLM prompt requests exactly 3 leads. Change the number in the prompt to get 1-5 leads per run. More leads = higher API costs but faster database growth.
- **Modify scoring weights**: Open "The Mismatch Engine" code node and adjust the scoring formula. Current weights: HTTPS (20 points), mobile viewport (20), conversion path (30), CMS quality (15), copyright freshness (15). Change these to prioritize different factors.
- **Customize HTML report design**: The "Build HTML Report" node contains the full template. Change the gradient colors, adjust card layouts, swap fonts, or add your agency logo to the header.
- **Slack notifications**: Connect your Slack workspace, choose a channel, and new leads will post automatically with opportunity scores and pitch hooks. Perfect for sales teams who live in Slack instead of email.
- **Extend document limits**: The workflow truncates website text at reasonable limits. Modify the extraction code if you need fuller content analysis for enterprise-level businesses.
- **Add more competitor depth**: The current setup scrapes 3 competitors per lead. Increase this in the "Filter Competitors" code node, but watch your Firecrawl usage costs.
- **Create custom pitch templates**: The pitch generation logic in "The Mismatch Engine" node has five angle types. Add more templates for industry-specific approaches (e.g., healthcare compliance, restaurant food safety).
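To make the two-axis scoring concrete, here is a rough sketch of how a mismatch score could be computed. The digital weights are the ones quoted above. The reputation formula is not given in the description, so the one used here (half review volume capped at 150 reviews, half rating above 3.0) is purely an assumption, as are all function and field names.

```javascript
// Digital weights as listed under "Modify scoring weights" (sum = 100).
const DIGITAL_WEIGHTS = {
  https: 20,
  mobileViewport: 20,
  conversionPath: 30,
  cmsQuality: 15,
  copyrightFresh: 15,
};

// Adds up the weight of every signal that is present (truthy).
function digitalScore(signals) {
  return Object.entries(DIGITAL_WEIGHTS).reduce(
    (sum, [name, weight]) => sum + (signals[name] ? weight : 0),
    0
  );
}

// ASSUMPTION: illustrative formula only; the real node may weigh things differently.
function reputationScore(reviewCount, rating) {
  const volume = Math.min(reviewCount / 150, 1) * 50; // 150+ reviews earn the full 50
  const quality = Math.max(0, Math.min((rating - 3) / 2, 1)) * 50; // 3.0 -> 0, 5.0 -> 50
  return Math.round(volume + quality);
}

// Mismatch = how far reputation outruns digital capability (never negative here).
function mismatchScore(lead) {
  const reputation = reputationScore(lead.reviewCount, lead.rating);
  const digital = digitalScore(lead.signals);
  return { reputation, digital, mismatch: Math.max(0, reputation - digital) };
}

console.log(
  mismatchScore({ reviewCount: 150, rating: 5, signals: { https: true } })
);
```

Clamping the gap at zero is a design choice for the sketch: a business whose site is stronger than its reputation simply scores as "no opportunity" instead of producing a negative number.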
by Welat Eren
## What this workflow does

This workflow turns your Spotify listening history into vocabulary flashcards for language learning. When you listen to music in your target language, you hear words hundreds of times without trying; they're already in your subconscious. This workflow extracts those words and turns them into flashcards, so you're connecting meaning to sounds you already know.

Every Sunday, the workflow:

- Fetches your recently played songs from Spotify
- Finds lyrics via lrclib.net
- Extracts 40-60 useful words (B1-B2 level) using Google Gemini
- Deduplicates against all previously learned words
- Writes new vocabulary to Google Sheets (master + weekly tab)

Review with the free Flashcard Lab app (iOS + Android), which reads directly from Google Sheets with spaced repetition. Works for any language; just change the AI prompt.

## Setup

Setup time: ~15 minutes

1. Google Cloud Console: Create a project, enable the Sheets API + Drive API, create OAuth2 credentials, and set the app to "In Production" (tokens expire after 7 days in Testing mode)
2. Gemini API: Get a free key from Google AI Studio
3. Spotify Developer Dashboard: Create an app and note the Client ID + Secret
4. Google Sheet: Create a sheet with a tab named "All Vocabularies" and the headers Word + Translation
5. Import the workflow, connect credentials, and select your Google Sheet in all Sheets nodes
6. Click "Execute Workflow" to test
7. Enable the schedule trigger for weekly runs

## Changing the language

Edit the prompt in "Prepare all Lyrics into Pairs"; that's the only place you need to change. All other nodes use generic Word and Translation columns.

> 🎧 Tip: Listen to music in the language you're learning. The whole point is that your brain already absorbed these words passively; the flashcards connect meaning to sounds you already know.

Full documentation 👉 GitHub Repository
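As a rough illustration of the weekly deduplication step, the sketch below keeps only words that are not yet in the master list. The `filterNewWords` name, the `{ word, translation }` shape, and the case-insensitive matching are assumptions for the example and are not taken from the workflow itself.

```javascript
// Keeps only entries whose word is not already known.
// ASSUMPTION: matching ignores case and surrounding whitespace.
function filterNewWords(extracted, knownWords) {
  const normalize = (word) => word.trim().toLowerCase();
  const seen = new Set(knownWords.map(normalize));
  const fresh = [];

  for (const entry of extracted) {
    const key = normalize(entry.word);
    if (key === "" || seen.has(key)) {
      continue; // skip blanks, known words, and repeats within this batch
    }
    seen.add(key);
    fresh.push(entry);
  }
  return fresh;
}

const newWords = filterNewWords(
  [
    { word: "Sehnsucht", translation: "longing" },
    { word: "herz", translation: "heart" },
    { word: "sehnsucht ", translation: "longing" },
  ],
  ["Herz"]
);
console.log(newWords.map((entry) => entry.word));
```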
by Fei Wu
Reddit Post Saver & Summarizer with AI-Powered Filtering

## Who This Is For

Perfect for content curators, researchers, developers, and community managers who want to build a structured database of valuable Reddit content without manual data entry. If you're tracking industry trends, gathering user feedback, or building a knowledge base from Reddit discussions, this workflow automates the entire process.

## The Problem It Solves

Reddit has incredible discussions, but manually copying posts, extracting insights, and organizing them into a database is time-consuming. This workflow automatically transforms your saved Reddit posts into structured, searchable data, complete with AI-generated summaries of both the post and its comment section.

## How It Works

### 1. Save Posts Manually

Simply use Reddit's built-in save feature on any post you find valuable.

### 2. Automated Daily Processing

The workflow triggers once per day and:

- Fetches all your saved Reddit posts via the Reddit API
- Filters posts by subreddit and custom conditions (e.g., "only posts about JavaScript frameworks" or "posts with more than 100 upvotes")
- Uses an LLM (Google Gemini) to verify posts match your natural language criteria
- Generates comprehensive summaries of both the original post and top comments

### 3. Structured Database Storage

Filtered and summarized posts are automatically saved to your Supabase database with this structure:

    {
      "reddit_id": "unique post identifier",
      "title": "post title",
      "url": "direct link to Reddit post",
      "summary": "AI-generated summary of post and comments",
      "tags": ["array", "of", "relevant", "tags"],
      "post_date": "original post creation date",
      "upvotes": "number of upvotes",
      "num_comments": "total comment count"
    }

## Setup Requirements

- **Reddit API credentials** (client ID and secret)
- **Supabase account** with a database table
- **Google Gemini API key** (or alternative LLM provider)
- Basic configuration of filter conditions (subreddit names and natural language criteria)

## Use Cases

- **Product Research**: Track competitor mentions and feature requests
- **Content Creation**: Build a library of trending topics in your niche
- **Community Management**: Monitor feedback across multiple subreddits
- **Academic Research**: Collect and analyze discussions on specific topics
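The storage structure described under "Structured Database Storage" can be produced with a small mapping function. The sketch below assumes the post object carries the usual Reddit listing fields (`id`, `title`, `permalink`, `created_utc`, `ups`, `num_comments`, `subreddit`). `passesFilter` and `toSupabaseRow` are hypothetical helpers, and the summary and tags are taken as already produced by the LLM step.

```javascript
// Cheap pre-filter before the LLM check: subreddit allow-list plus an upvote floor.
function passesFilter(post, { subreddits, minUpvotes }) {
  return (
    subreddits.includes(post.subreddit.toLowerCase()) && post.ups >= minUpvotes
  );
}

// Maps a Reddit post plus AI output onto the documented row structure.
function toSupabaseRow(post, summary, tags) {
  return {
    reddit_id: post.id,
    title: post.title,
    url: `https://www.reddit.com${post.permalink}`,
    summary,
    tags,
    // created_utc is a Unix timestamp in seconds.
    post_date: new Date(post.created_utc * 1000).toISOString(),
    upvotes: post.ups,
    num_comments: post.num_comments,
  };
}

const post = {
  id: "abc123",
  title: "Which framework in 2024?",
  permalink: "/r/javascript/comments/abc123/which_framework/",
  created_utc: 1700000000,
  ups: 250,
  num_comments: 42,
  subreddit: "javascript",
};

if (passesFilter(post, { subreddits: ["javascript"], minUpvotes: 100 })) {
  console.log(toSupabaseRow(post, "Short summary", ["frameworks"]));
}
```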
by Rahul Shah
## Who is this for?

Natural gas traders, energy analysts, LNG desk professionals, utility planners, industrial gas buyers, power generation schedulers, pipeline operations teams, commodity research desks, and macro researchers tracking the NYMEX Henry Hub benchmark. If you start your trading day asking "where is Henry Hub spot and what does the forward curve look like?", this workflow was built for you. Anyone who wants a hands-free, multi-timeframe view of the U.S. natural gas market delivered to Telegram without paying for a data subscription will find immediate value here.

## What problem does this solve?

Henry Hub is the pricing reference for virtually every natural gas contract traded in North America and a key driver of LNG export pricing globally. Keeping track of both the live spot price and the 12-month forward curve throughout the trading day normally requires a professional terminal or a paid commodity data API, neither of which is cheap. Bloomberg costs thousands per month. Refinitiv, CME DataMine, and similar platforms are priced for institutional desks.

This workflow eliminates that cost entirely. It silently scrapes the public Henry Hub futures page on oilprice.com, extracts the live spot price plus the next 12 monthly forward contracts along with multi-timeframe price change data (5-day, 30-day, 90-day, 1-year, and YTD), formats everything into a clean tabular Telegram message, and delivers it on a configurable weekday schedule. You get institutional-grade forward curve visibility without spending a single rupee on a data feed.

## Zero ongoing cost to run

This workflow has no API keys, no subscriptions, no metered usage, and zero ongoing maintenance requirements. It relies entirely on:

- Your own n8n instance (self-hosted = free, Cloud = your existing plan)
- A free Telegram bot (created via BotFather in under 3 minutes)
- Public HTML scraping of oilprice.com (no authentication, no rate limits you will realistically hit in normal use)

Compare this to competing data sources: premium natural gas data APIs and terminal subscriptions routinely run from 50 to several thousand dollars per month. This workflow costs exactly 0 dollars per month to run indefinitely. The only investment is the 8-minute one-time setup.

## How it works

1. A schedule trigger fires 6 times on weekdays at 10 AM, 12 PM, 2 PM, 4 PM, 6:19 PM, and 7:20 PM (IST)
2. An HTTP request fetches the Henry Hub natural gas futures page from oilprice.com
3. A JavaScript extractor pulls two datasets in a single pass: the live spot price with intraday change, and all available forward futures contracts
4. For each contract it captures last price, price change, change direction, open, high, low, date, and historical reference prices across 5-day, 30-day, 90-day, 1-year, and YTD windows, computing percentage changes for each timeframe
5. A message builder separates the spot price from the futures array, formats the first 12 monthly contracts into a clean HTML-formatted Telegram table, and calculates strip range and average
6. The Telegram node delivers the full update to your chat with source attribution and an IST timestamp

## Setup steps

Setup takes around 8 to 10 minutes. You will need:

- A Telegram bot (free, created via @BotFather)
- Your numeric Telegram chat ID (captured via @userinfobot)

Complete step-by-step setup instructions, schedule customization guidance, and timezone adjustment tips are included as sticky notes directly on the workflow canvas. Import the workflow, read the yellow setup sticky note, and follow along. No guesswork required.

## Customization ideas

- **Different schedule:** Edit the Schedule Trigger to match your trading hours. Pre-market, market open, mid-session, market close, and after-hours are common slots to cover.
- **Different timezone:** Update the workflow settings timezone from Asia/Kolkata to your own. The schedule recalibrates automatically and the IST timestamp in the Telegram message updates accordingly.
- **Track more or fewer contracts:** The parser extracts all available forward months from the page. The message builder uses the first 12. Change slice(0, 12) in the Message Builder node to include more or fewer contracts.
- **Add price alerts:** Insert an IF node between the extractor and message builder to send a Telegram message only when the spot price moves more than a defined threshold intraday. Ideal for traders who want to be notified only on significant moves, not routine updates.
- **Multi-timeframe drill-down:** The extractor already captures 5-day, 30-day, 90-day, 1-year, and YTD price references for every contract. Customize the message builder to display whichever timeframe combination is most relevant to your analysis style.
- **Dual delivery:** Add a second branch after the message builder to simultaneously push the same data to email, Slack, Discord, or WhatsApp.
- **Log to Sheets:** Add a Google Sheets node to append every scrape to a spreadsheet for building your own historical Henry Hub forward curve dataset.

## Why oilprice.com?

The natural gas futures page on oilprice.com renders all spot and futures data server-side in the HTML, which means reliable parsing without needing a headless browser or JavaScript execution environment. The data structure includes historical price references embedded directly in the HTML as data attributes, making it possible to calculate multi-timeframe percentage changes without any external API calls. The page structure has been stable for years. If the site structure does change, the parser code inside the Data Extraction node is thoroughly commented and straightforward to adapt.
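The arithmetic behind the multi-timeframe changes and the strip summary is simple enough to sketch. The snippet below is a guess at the shape of that logic and not the code inside the workflow's nodes. `percentChanges`, `stripStats`, and the field names (`last`, `d5`, `ytd`, and so on) are hypothetical.

```javascript
// Percentage change of the last price against each historical reference price.
// Missing or zero references yield null instead of a misleading number.
function percentChanges(lastPrice, references) {
  const result = {};
  for (const [window, refPrice] of Object.entries(references)) {
    result[window] =
      Number.isFinite(refPrice) && refPrice !== 0
        ? Number((((lastPrice - refPrice) / refPrice) * 100).toFixed(2))
        : null;
  }
  return result;
}

// Range and average over the first `months` contracts (the "strip").
function stripStats(contracts, months = 12) {
  const prices = contracts.slice(0, months).map((contract) => contract.last);
  if (prices.length === 0) {
    return null;
  }
  const low = Math.min(...prices);
  const high = Math.max(...prices);
  const average = prices.reduce((sum, price) => sum + price, 0) / prices.length;
  return { low, high, range: high - low, average };
}

console.log(percentChanges(3.3, { d5: 3.0, d30: 3.3, ytd: 0 }));
console.log(stripStats([{ last: 3 }, { last: 4 }, { last: 5 }]));
```

Returning `null` for a missing reference keeps the message builder free to print a placeholder for that column, so a gap in the page data never shows up as a bogus percentage.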
## Important notes

This workflow scrapes publicly available data from oilprice.com. Please be respectful in how frequently you run it. The default schedule of 6 runs per weekday is well within reasonable use. Avoid increasing this to minute-level frequency, as it is both unnecessary for most trading and analysis use cases and unfair to the source website.
by Rahul Shah
## Who is this for?

Oil traders, energy analysts, commodity research desks, shipping operations teams, refinery planners, equity investors in oil stocks, macro researchers, and anyone whose day starts with "where is Brent trading?" If you track the international oil benchmark and want hands-free price updates delivered to Telegram multiple times a day without paying for a Bloomberg terminal or a commodity data API, this workflow is built for you.

## What problem does this solve?

Most professional oil price data sources charge steep monthly fees. Bloomberg terminals run into thousands per month. Refinitiv, CQG, and similar platforms are priced for institutions. Free alternatives exist but require manually checking websites throughout the day, which breaks focus and wastes time.

This workflow silently scrapes the public Brent Crude futures page on oilprice.com, extracts the 10-month forward curve, formats the data into a clean tabular Telegram message, and delivers it on a configurable schedule. You get institutional-grade visibility into the Brent forward curve without spending a rupee on data feeds.

## Zero ongoing cost to run

This workflow has no API keys, no subscriptions, and no metered usage. It relies entirely on:

- Your own n8n instance (self-hosted = free, Cloud = your existing plan)
- A free Telegram bot (created via BotFather in under 3 minutes)
- Public HTML scraping of oilprice.com (no authentication, no rate limit you will realistically hit)

Compare this to competing data sources: premium oil data APIs routinely cost 50 to 500 dollars per month. This workflow costs exactly 0 dollars per month to run indefinitely. The only time investment is the 8-minute initial setup.

## How it works

1. A schedule trigger fires 6 times on weekdays at 10 AM, 12 PM, 2 PM, 4 PM, 6:19 PM, and 7:20 PM (IST)
2. An HTTP request fetches the Brent Crude futures page from oilprice.com
3. A JavaScript parser extracts the 10 forward month contracts (CBH26 through CBZ26, mapping to March through December)
4. For each contract it captures last price, price change, and the website update timestamp
5. An aggregate node bundles all contracts into a single array
6. A builder formats the data into an HTML-formatted Telegram message with a pre-formatted price table, total range, and average change
7. The Telegram node delivers the update to your chat

## Setup steps

Setup takes around 8 to 10 minutes. You will need:

- A Telegram bot (free, created via @BotFather)
- Your numeric Telegram chat ID (captured via @userinfobot)

Complete step-by-step setup instructions, schedule customization guidance, and timezone adjustment tips are included as sticky notes inside the workflow canvas. Import the workflow, read the yellow setup sticky note, and follow along. No guesswork required.

## Customization ideas

- **Different schedule:** Edit the Schedule Trigger to match your trading hours. Pre-market, market open, mid-day, market close, and after-hours are common slots to cover.
- **Different timezone:** Update the workflow settings timezone from Asia/Kolkata to your own. The schedule will recalibrate automatically.
- **Track different contracts:** The symbol list is defined in the parser code. Add or remove months, or extend to longer-dated contracts available on the page.
- **Add price alerts:** Insert an IF node between the parser and Telegram to only send a message when price moves more than a threshold percentage. Perfect for traders who only want to be notified on meaningful moves.
- **Dual delivery:** Add a second branch after the Message Builder to also send the same data to email, Slack, Discord, or WhatsApp.
- **Log to Sheets:** Add a Google Sheets node to append every scrape to a spreadsheet for historical price tracking.
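The contract symbols named in the parser step (CBH26 through CBZ26) follow the standard futures month-letter convention, which is why H maps to March and Z to December. When adding or removing months from the symbol list, a small decoder like the sketch below can help. `decodeContract` is a hypothetical helper, and treating everything before the month letter as the product root is an assumption based on the "CB" prefix in the description.

```javascript
// Standard futures delivery-month codes.
const MONTH_CODES = {
  F: "January", G: "February", H: "March", J: "April",
  K: "May", M: "June", N: "July", Q: "August",
  U: "September", V: "October", X: "November", Z: "December",
};

// Splits a symbol such as "CBH26" into root, delivery month, and year.
// Returns null when the symbol does not match the expected pattern.
function decodeContract(symbol) {
  const match = /^([A-Z]+)([FGHJKMNQUVXZ])(\d{2})$/.exec(symbol);
  if (!match) {
    return null;
  }
  const [, root, code, yy] = match;
  return { root, month: MONTH_CODES[code], year: 2000 + Number(yy) };
}

console.log(decodeContract("CBH26"));
console.log(decodeContract("CBZ26"));
```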
## Why oilprice.com?

The source page renders all futures data server-side in the HTML, which means reliable parsing without needing a headless browser or JavaScript execution. The page structure has been stable for years, making the parser robust against minor site changes. If the site structure does change, the parser code is commented and easy to adapt.

## Important notes

This workflow scrapes publicly available data from oilprice.com. Please be respectful in how frequently you run it. The default schedule of 6 runs per weekday is well within reasonable use. Avoid increasing this to minute-level frequency, as it is both unnecessary for most use cases and unfair to the source.
by vinci-king-01
Job Posting Aggregator with Email and GitHub

⚠️ COMMUNITY TEMPLATE DISCLAIMER: This is a community-contributed template that uses ScrapeGraphAI (a community node). Please ensure you have the ScrapeGraphAI community node installed in your n8n instance before using this template.

This workflow automatically aggregates certification-related job-posting requirements from multiple industry sources, compares them against last year’s data stored in GitHub, and emails a concise change log to subscribed professionals. It streamlines annual requirement checks and renewal reminders, ensuring users never miss an update.

## Pre-conditions/Requirements

### Prerequisites

- n8n instance (self-hosted or n8n cloud)
- ScrapeGraphAI community node installed
- Git installed (for optional local testing of the repo)
- Working SMTP server or other Email credential supported by n8n

### Required Credentials

- **ScrapeGraphAI API Key** – Enables web scraping of certification pages
- **GitHub Personal Access Token** – Allows the workflow to read/write files in the repo
- **Email / SMTP Credentials** – Sends the summary email to end-users

### Specific Setup Requirements

| Resource | Purpose | Example |
|----------|---------|---------|
| GitHub Repository | Stores certification_requirements.json versioned annually | https://github.com/<you>/cert-requirements.git |
| Watch List File | List of page URLs & selectors to scrape | Saved in the repo under /config/watchList.json |
| Email List | Semicolon-separated list of recipients | me@company.com;team@company.com |

## How it works

Key Steps:

- **Manual Trigger**: Starts the workflow on demand or via scheduled cron.
- **Load Watch List (Code Node)**: Reads the list of certification URLs and CSS selectors.
- **Split In Batches**: Iterates through each URL to avoid rate limits.
- **ScrapeGraphAI**: Scrapes requirement details from each page.
- **Merge (Wait)**: Reassembles individual scrape results into a single JSON array.
- **GitHub (Read File)**: Retrieves last year’s certification_requirements.json.
- **IF (Change Detector)**: Compares current vs. previous JSON and decides whether changes exist.
- **Email Send**: Composes and sends a formatted summary of changes.
- **GitHub (Upsert File)**: Commits the new JSON file back to the repo for future comparisons.

## Set up steps

Setup Time: 15-25 minutes

1. Install Community Node: From n8n UI → Settings → Community Nodes → search and install “ScrapeGraphAI”.
2. Create/Clone GitHub Repo: Add an empty certification_requirements.json ( {} ) and a config/watchList.json with an array of objects like: `[ { "url": "https://cert-body.org/requirements", "selector": "#requirements" } ]`
3. Generate GitHub PAT: Scope repo, store in n8n Credentials as “GitHub API”.
4. Add ScrapeGraphAI Credential: Paste your API key into n8n Credentials.
5. Configure Email Credentials: E.g., SMTP with username/password or OAuth2.
6. Open Workflow: Import the template JSON into n8n.
7. Update Environment Variables (in the Code node or via n8n variables): GITHUB_REPO (e.g., user/cert-requirements) and EMAIL_RECIPIENTS.
8. Test Run: Trigger manually. Verify email content and GitHub commit.
9. Schedule: Add a Cron node (optional) for yearly or quarterly automatic runs.

## Node Descriptions

Core Workflow Nodes:

- **Manual Trigger** – Initiates the workflow manually or via external schedule.
- **Code (Load Watch List)** – Reads and parses watchList.json from GitHub or static input.
- **SplitInBatches** – Controls request concurrency to avoid scraping bans.
- **ScrapeGraphAI** – Extracts requirement text using provided CSS selectors or XPath.
- **Merge (Combine)** – Waits for all batches and merges them into one dataset.
- **GitHub (Read/Write File)** – Handles version-controlled storage of JSON data.
- **IF (Change Detector)** – Compares hashes/JSON diff to detect updates.
- **EmailSend** – Sends change log, including renewal reminders and diff summary.
- **Sticky Note** – Provides in-workflow documentation for future editors.

Data Flow:

1. Manual Trigger → Code (Load Watch List) → SplitInBatches
2. SplitInBatches → ScrapeGraphAI → Merge
3. Merge → GitHub (Read File) → IF (Change Detector)
4. IF (True) → Email Send → GitHub (Upsert File)

## Customization Examples

### Adjusting Scraper Configuration

    // Inside the Watch List JSON object
    {
      "url": "https://new-association.com/cert-update",
      "selector": ".content article:nth-of-type(1) ul"
    }

### Custom Email Template

    // In Email Send node → HTML Content
    📋 Certification Updates – {{ $json.date }}
    The following certifications have new requirements:
    {{ $json.diffHtml }}
    For full details visit our GitHub repo.

## Data Output Format

The workflow outputs structured JSON data:

    {
      "timestamp": "2024-09-01T12:00:00Z",
      "source": "watchList.json",
      "current": {
        "AWS-SAA": "Version 3.0, requires renewed proctored exam",
        "PMP": "60 PDUs every 3 years"
      },
      "previous": {
        "AWS-SAA": "Version 2.0",
        "PMP": "60 PDUs every 3 years"
      },
      "changes": {
        "AWS-SAA": "Updated to Version 3.0; exam format changed."
      }
    }

## Troubleshooting

### Common Issues

- ScrapeGraphAI returns empty data – Check CSS/XPath selectors and ensure the page is publicly accessible.
- GitHub authentication fails – Verify the PAT scope includes repo and that the credential is linked in both GitHub nodes.

### Performance Tips

- Limit SplitInBatches size to 3-5 URLs when sources are heavy to avoid timeouts.
- Enable n8n execution mode “Queue” for long-running scrapes.

### Pro Tips

- Store selector samples in comments next to each watch list entry for future maintenance.
- Use a Cron node set to “0 0 1 1 *” for an annual run exactly on Jan 1st.
- Add a Telegram node after Email Send for instant mobile notifications.
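The comparison performed by the IF (Change Detector) step can be approximated with a key-by-key diff of the `current` and `previous` objects shown in the output format. This is a sketch under the assumption that both snapshots are flat objects of strings. `detectChanges` and the `added` / `removed` / `updated` labels are hypothetical and are not what the workflow necessarily emits.

```javascript
// Compares two flat snapshots and reports what was added, removed, or updated.
function detectChanges(current, previous) {
  const changes = {};
  const keys = new Set([...Object.keys(previous), ...Object.keys(current)]);

  for (const key of keys) {
    if (!(key in previous)) {
      changes[key] = { type: "added", now: current[key] };
    } else if (!(key in current)) {
      changes[key] = { type: "removed", was: previous[key] };
    } else if (current[key] !== previous[key]) {
      changes[key] = { type: "updated", was: previous[key], now: current[key] };
    }
  }

  return { hasChanges: Object.keys(changes).length > 0, changes };
}

const diff = detectChanges(
  {
    "AWS-SAA": "Version 3.0, requires renewed proctored exam",
    PMP: "60 PDUs every 3 years",
  },
  { "AWS-SAA": "Version 2.0", PMP: "60 PDUs every 3 years" }
);
console.log(diff.hasChanges, Object.keys(diff.changes));
```

The `hasChanges` flag is what an IF node would branch on, while `changes` is the raw material for the email's diff summary.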
by vinci-king-01
Certification Requirement Tracker with Rocket.Chat and GitLab ⚠️ COMMUNITY TEMPLATE DISCLAIMER: This is a community-contributed template that uses ScrapeGraphAI (a community node). Please ensure you have the ScrapeGraphAI community node installed in your n8n instance before using this template. This workflow automatically monitors websites of certification bodies and industry associations, detects changes in certification requirements, commits the updated information to a GitLab repository, and notifies a Rocket.Chat channel. Ideal for professionals and compliance teams who must stay ahead of annual updates and renewal deadlines. Pre-conditions/Requirements Prerequisites Running n8n instance (self-hosted or n8n.cloud) ScrapeGraphAI community node installed and active Rocket.Chat workspace (self-hosted or cloud) GitLab account and repository for documentation Publicly reachable URL for incoming webhooks (use n8n tunnel, Ngrok, or a reverse proxy) Required Credentials ScrapeGraphAI API Key** – Enables scraping of certification pages Rocket.Chat Access Token & Server URL** – To post update messages GitLab Personal Access Token** – With api and write_repository scopes Specific Setup Requirements | Item | Example Value | Notes | | ------------------------------ | ------------------------------------------ | ----- | | GitLab Repo | gitlab.com/company/cert-tracker | Markdown files will be committed here | | Rocket.Chat Channel | #certification-updates | Receives update alerts | | Certification Source URLs file | /data/sourceList.json in the repository | List of URLs to scrape | How it works This workflow automatically monitors websites of certification bodies and industry associations, detects changes in certification requirements, commits the updated information to a GitLab repository, and notifies a Rocket.Chat channel. Ideal for professionals and compliance teams who must stay ahead of annual updates and renewal deadlines. 
Key Steps:
- **Webhook Trigger**: Fires on a scheduled HTTP call (e.g., via cron) or manual trigger.
- **Code (Prepare Source List)**: Reads/constructs a list of certification URLs to scrape.
- **ScrapeGraphAI**: Fetches HTML content and extracts requirement sections.
- **Merge**: Combines newly scraped data with the last committed snapshot.
- **IF Node**: Determines if a change occurred (hash/length comparison).
- **GitLab**: Creates a branch, commits updated Markdown/JSON files, and opens an MR (optional).
- **Rocket.Chat**: Posts a message summarizing changes and linking to the GitLab diff.
- **Respond to Webhook**: Returns a JSON summary to the requester (useful for monitoring or chained automations).

Set up steps

Setup Time: 20-30 minutes

1. Install Community Node: In n8n UI, go to Settings → Community Nodes and install n8n-nodes-scrapegraphai.
2. Create Credentials:
   a. ScrapeGraphAI – paste your API key.
   b. Rocket.Chat – create a personal access token (Personal Access Tokens → New Token) and configure credentials.
   c. GitLab – create PAT with api + write_repository scopes and add to n8n.
3. Clone the Template: Import this workflow JSON into your n8n instance.
4. Edit StickyNote: Replace placeholder URLs with actual certification-source URLs or point to a repo file.
5. Configure GitLab Node: Set your repository, default branch, and commit message template.
6. Configure Rocket.Chat Node: Select credential, channel, and message template (markdown supported).
7. Expose Webhook: If self-hosting, enable n8n tunnel or configure reverse proxy to make the webhook public.
8. Test Run: Trigger the workflow manually; verify GitLab commit/MR and Rocket.Chat notification.
9. Automate: Schedule an external cron (or n8n Cron node) to POST to the webhook yearly, quarterly, or monthly as needed.

Node Descriptions

Core Workflow Nodes:
- **stickyNote** – Human-readable instructions/documentation embedded in the flow.
- **webhook** – Entry point; accepts POST /cert-tracker requests.
- **code (Prepare Source List)** – Generates an array of URLs; can pull from GitLab or an environment variable.
- **scrapegraphAi** – Scrapes each URL and extracts certification requirement sections using CSS/XPath selectors.
- **merge (by key)** – Joins new data with previous snapshot for change detection.
- **if (Changes?)** – Branches logic based on whether differences exist.
- **gitlab** – Creates/updates files and opens merge requests containing new requirements.
- **rocketchat** – Sends formatted update to designated channel.
- **respondToWebhook** – Returns 200 OK with a JSON summary.

Data Flow:
- webhook → code → scrapegraphAi → merge → if
- if (true) → gitlab → rocketchat
- if (false) → respondToWebhook

Customization Examples

Change Scraping Frequency

// Replace external cron with n8n Cron node
{ "nodes": [ { "name": "Cron", "type": "n8n-nodes-base.cron", "parameters": { "schedule": { "hour": "0", "minute": "0", "dayOfMonth": "1" } } } ] }

Extend Notification Message

// Rocket.Chat node → Message field
const diffUrl = $json["gitlab_diff_url"];
const count = $json["changes_count"];
return `:bell: ${count} Certification Requirement Update(s)\n\nView diff: ${diffUrl}`;

Data Output Format

The workflow outputs structured JSON data:

{ "timestamp": "2024-05-15T12:00:00Z", "changesDetected": true, "changesCount": 3, "gitlab_commit_sha": "a1b2c3d4", "gitlab_diff_url": "https://gitlab.com/company/cert-tracker/-/merge_requests/42", "notifiedChannel": "#certification-updates" }

Troubleshooting

Common Issues
- ScrapeGraphAI returns empty results – Verify your CSS/XPath selectors and API key quota.
- GitLab commit fails (401 Unauthorized) – Ensure PAT has api and write_repository scopes and is not expired.

Performance Tips
- Limit the number of pages scraped per run to avoid API rate limits.
- Cache last-scraped HTML in an S3 bucket or database to reduce redundant requests.

Pro Tips:
- Use GitLab CI to auto-deploy documentation site whenever new certification files are merged.
- Enable Rocket.Chat threading to keep discussions organized per update.
- Tag stakeholders in Rocket.Chat messages with @cert-team for instant visibility.
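As a rough illustration of how the two branches of the IF node could produce their outputs, the sketch below builds both the webhook response summary and the Rocket.Chat message from a list of detected changes. The helper name `buildSummary` and its parameters are hypothetical; the field names follow the output format shown above, and the code is plain JavaScript rather than the template's actual node configuration.

```javascript
// Build the JSON summary and the chat message from a list of detected changes.
function buildSummary(changes, diffUrl, channel, now = new Date()) {
  const changesDetected = changes.length > 0;
  const summary = {
    timestamp: now.toISOString(),
    changesDetected,
    changesCount: changes.length,
    // Only the "true" branch commits to GitLab and notifies the channel.
    gitlab_diff_url: changesDetected ? diffUrl : null,
    notifiedChannel: changesDetected ? channel : null,
  };
  const message = changesDetected
    ? `:bell: ${changes.length} Certification Requirement Update(s)\n\nView diff: ${diffUrl}`
    : null;
  return { summary, message };
}

const result = buildSummary(
  ['AWS-SAA', 'PMP', 'CISSP'],
  'https://gitlab.com/company/cert-tracker/-/merge_requests/42',
  '#certification-updates',
  new Date('2024-05-15T12:00:00Z')
);
console.log(result.summary, result.message);
```

Returning `null` for the message on the no-change path keeps the caller's branching simple: a message exists only when there is something to post.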
by Onur
🏠 Extract Zillow Property Data to Google Sheets with Scrape.do This template requires a self-hosted n8n instance to run. A complete n8n automation that extracts property listing data from Zillow URLs using Scrape.do web scraping API, parses key property information, and saves structured results into Google Sheets for real estate analysis, market research, and property tracking. 📋 Overview This workflow provides a lightweight real estate data extraction solution that pulls property details from Zillow listings and organizes them into a structured spreadsheet. Ideal for real estate professionals, investors, market analysts, and property managers who need automated property data collection without manual effort. Who is this for? Real estate investors tracking properties Market analysts conducting property research Real estate agents monitoring listings Property managers organizing data Data analysts building real estate databases What problem does this workflow solve? Eliminates manual copy-paste from Zillow Processes multiple property URLs in bulk Extracts structured data (price, address, zestimate, etc.) Automates saving results into Google Sheets Ensures repeatable & consistent data collection ⚙️ What this workflow does Manual Trigger → Starts the workflow manually Read Zillow URLs from Google Sheets → Reads property URLs from a Google Sheet Scrape Zillow URL via Scrape.do → Fetches full HTML from Zillow (bypasses PerimeterX protection) Parse Zillow Data → Extracts structured property information from HTML Write Results to Google Sheets → Saves parsed data into a results sheet 📊 Output Data Points | Field | Description | Example | |-------|-------------|---------| | URL | Original Zillow listing URL | https://www.zillow.com/homedetails/... 
| | Price | Property listing price | $300,000 | | Address | Street address | 8926 Silver City | | City | City name | San Antonio | | State | State abbreviation | TX | | Days on Zillow | How long listed | 5 | | Zestimate | Zillow's estimated value | $297,800 | | Scraped At | Timestamp of extraction | 2025-01-29T12:00:00.000Z | ⚙️ Setup Prerequisites n8n instance (self-hosted) Google account with Sheets access Scrape.do account with API token (Get 1000 free credits/month) Google Sheet Structure This workflow uses one Google Sheet with two tabs: Input Tab: "Sheet1" | Column | Type | Description | Example | |--------|------|-------------|---------| | URLs | URL | Zillow listing URL | https://www.zillow.com/homedetails/123... | Output Tab: "Results" | Column | Type | Description | Example | |--------|------|-------------|---------| | URL | URL | Original listing URL | https://www.zillow.com/homedetails/... | | Price | Text | Property price | $300,000 | | Address | Text | Street address | 8926 Silver City | | City | Text | City name | San Antonio | | State | Text | State code | TX | | Days on Zillow | Number | Days listed | 5 | | Zestimate | Text | Estimated value | $297,800 | | Scraped At | Timestamp | When scraped | 2025-01-29T12:00:00.000Z | 🛠 Step-by-Step Setup Import Workflow: Copy the JSON → n8n → Workflows → + Add → Import from JSON Configure Scrape.do API: Sign up at Scrape.do Dashboard Get your API token In HTTP Request node, replace YOUR_SCRAPE_DO_TOKEN with your actual token The workflow uses super=true for premium residential proxies (10 credits per request) Configure Google Sheets: Create a new Google Sheet Add two tabs: "Sheet1" (input) and "Results" (output) In Sheet1, add header "URLs" in cell A1 Add Zillow URLs starting from A2 Set up Google Sheets OAuth2 credentials in n8n Replace YOUR_SPREADSHEET_ID with your actual Google Sheet ID Replace YOUR_GOOGLE_SHEETS_CREDENTIAL_ID with your credential ID Run & Test: Add 1-2 test Zillow URLs in Sheet1 Click 
"Execute workflow" Check results in Results tab 🧰 How to Customize Add more fields**: Extend parsing logic in "Parse Zillow Data" node to capture additional data (bedrooms, bathrooms, square footage) Filtering**: Add conditions to skip certain properties or price ranges Rate Limiting**: Insert a Wait node between requests if processing many URLs Error Handling**: Add error branches to handle failed scrapes gracefully Scheduling**: Replace Manual Trigger with Schedule Trigger for automated daily/weekly runs 📊 Use Cases Investment Analysis**: Track property prices and zestimates over time Market Research**: Analyze listing trends in specific neighborhoods Portfolio Management**: Monitor properties for sale in target areas Competitive Analysis**: Compare similar properties across locations Lead Generation**: Build databases of properties matching specific criteria 📈 Performance & Limits Single Property**: ~5-10 seconds per URL Batch of 10**: 1-2 minutes typical Large Sets (50+)**: 5-10 minutes depending on Scrape.do credits API Calls**: 1 Scrape.do request per URL (10 credits with super=true) Reliability**: 95%+ success rate with premium proxies 🧩 Troubleshooting | Problem | Solution | |---------|----------| | API error 400 | Check your Scrape.do token and credits | | URL showing "undefined" | Verify Google Sheet column name is "URLs" (capital U) | | No data parsed | Check if Zillow changed their HTML structure | | Permission denied | Re-authenticate Google Sheets OAuth2 in n8n | | 50000 character error | Verify Parse Zillow Data code is extracting fields, not returning raw HTML | | Price shows HTML/CSS | Update price extraction regex in Parse Zillow Data node | 🤝 Support & Community Scrape.do Documentation Scrape.do Dashboard Scrape.do Zillow Scraping Guide n8n Forum n8n Docs 🎯 Final Notes This workflow provides a repeatable foundation for extracting Zillow property data with Scrape.do and saving to Google Sheets. 
You can extend it with: Historical tracking (append timestamps) Price change alerts (compare with previous scrapes) Multi-platform scraping (Redfin, Realtor.com) Integration with CRM or reporting dashboards Important: Scrape.do handles all anti-bot bypassing (PerimeterX, CAPTCHAs) automatically with rotating residential proxies, so you only pay for successful requests. Always use super=true parameter for Zillow to ensure high success rates.
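To make the request and parsing steps more concrete, here is a minimal sketch of how the Scrape.do request URL might be assembled and how a price could be pulled out of the returned HTML. The `token` and `super=true` parameters come from the setup notes above; the `url` parameter name, the helper names `buildScrapeDoUrl` and `extractPrice`, and the sample HTML snippet are assumptions for illustration only. Zillow's real markup is not shown in this template and may differ.

```javascript
// Assemble a Scrape.do request URL for one listing (parameter `url` is assumed).
function buildScrapeDoUrl(token, targetUrl) {
  const params = new URLSearchParams({ token, url: targetUrl, super: 'true' });
  return `https://api.scrape.do/?${params.toString()}`;
}

// Return the first dollar amount such as "$300,000" found in the HTML, or null.
function extractPrice(html) {
  const match = html.match(/\$\d{1,3}(?:,\d{3})*/);
  return match ? match[0] : null;
}

const requestUrl = buildScrapeDoUrl(
  'YOUR_SCRAPE_DO_TOKEN',
  'https://www.zillow.com/homedetails/123'
);

// Hypothetical fragment standing in for a scraped page.
const sampleHtml = '<span data-testid="price">$300,000</span>';
const row = {
  URL: 'https://www.zillow.com/homedetails/123',
  Price: extractPrice(sampleHtml),
  'Scraped At': new Date().toISOString(),
};
console.log(requestUrl, row);
```

A regex that matches only the dollar amount, rather than a whole element, is one way to avoid the "Price shows HTML/CSS" problem listed in the troubleshooting table.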
by Intuz
This n8n template from Intuz provides a complete solution to automate on-demand lead generation. It acts as a powerful scraping agent that takes a simple chat query, scours both Google Search and Google Maps for relevant businesses, scrapes their websites for contact details, and compiles an enriched lead list directly in Google Sheets. Who's this workflow for? Sales Development Representatives (SDRs) Local Marketing Agencies Business Development Teams Freelancers & Consultants Market Researchers How it works 1. Start with a Chat Query: The user initiates the workflow by typing a search query (e.g., "dentists in New York") into a chat interface. 2. Multi-Source Search: The workflow queries both the Google Custom Search API (for web results across multiple pages) and scrapes Google Maps (for local businesses) to gather a broad list of potential leads. 3. Deep Dive Website Scraping: For each unique business website found, the workflow visits the URL to scrape the raw HTML content of the page. 4. Intelligent Contact Extraction: Using custom code, it then parses the scraped website content to find and extract valuable contact information like email addresses, phone numbers, and social media links. 5. Deduplicate and Log to Sheets: Before saving, the workflow checks your Google Sheet to ensure the lead doesn't already exist. All unique, newly enriched leads are then appended as clean rows to your sheet, along with the original search query for tracking. Key Requirements to Use This Template 1. n8n Instance & Required Nodes: An active n8n account (Cloud or self-hosted). This workflow uses the official n8n LangChain integration (@n8n/n8n-nodes-langchain) for the chat trigger. If you are using a self-hosted version of n8n, please ensure this package is installed. 2. Google Custom Search API: A Google Cloud Project with the "Custom Search API" enabled. You will need an API Key for this service. 
You must also create a Programmable Search Engine and get its Search engine ID (cx). This tells Google what to search (e.g., the whole web).
3. Google Sheets Account: A Google account and a pre-made Google Sheet with columns for Business Name, Primary Email, Contact Number, URL, Description, Socials, and Search Query.

Setup Instructions
1. Configure the Chat Trigger: In the "When chat message received" node, you can find the Direct URL or Embed code to use the chat interface.
2. Set Up Google Custom Search API (Crucial Step): Go to the "Custom Google Search API" (HTTP Request) node. Under "Query Parameters", you must replace the placeholder values for key (with your API Key) and cx (with your Search Engine ID).
3. Configure Google Sheets: In all Google Sheets nodes (Append row in sheet, Get row(s) in sheet, etc.), connect your Google Sheets credentials. Select your target spreadsheet (Document ID) and the specific sheet (Sheet Name) where you want to store the leads.
4. Activate the Workflow: Save the workflow and toggle the "Active" switch to ON. Open the chat URL and enter a search query to start generating leads.

Connect with us
Website: https://www.intuz.com/services
Email: getstarted@intuz.com
LinkedIn: https://www.linkedin.com/company/intuz
Get Started: https://n8n.partnerlinks.io/intuz
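The contact-extraction and deduplication steps described in this template could look roughly like the sketch below. It is an illustrative approximation, not the template's actual Code node: the function names, the regular expressions, and the sample HTML are all assumptions, and phone-number extraction is left out because formats vary widely by country.

```javascript
// Pull email addresses and social profile links out of raw HTML.
function extractContacts(html) {
  const emails = html.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g) || [];
  const socials =
    html.match(/https?:\/\/(?:www\.)?(?:linkedin|facebook|instagram|twitter)\.com\/[^\s"'<>]+/g) || [];
  return {
    // Lowercase and de-duplicate, since the same address often appears twice.
    emails: [...new Set(emails.map((email) => email.toLowerCase()))],
    socials: [...new Set(socials)],
  };
}

// Keep only leads whose URL is not already present in the sheet.
function filterNewLeads(leads, existingUrls) {
  const seen = new Set(existingUrls);
  return leads.filter((lead) => !seen.has(lead.url));
}

// Hypothetical page fragment used only to exercise the functions.
const html =
  '<a href="mailto:Info@Example.com">Info@Example.com</a> ' +
  '<a href="https://www.linkedin.com/company/example">LinkedIn</a>';
const contacts = extractContacts(html);
const fresh = filterNewLeads(
  [{ url: 'https://example.com' }, { url: 'https://old.example.org' }],
  ['https://old.example.org']
);
console.log(contacts, fresh);
```

Deduplicating against the URLs already in the sheet before appending is what keeps repeated searches from creating duplicate rows.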
by Onur
🔍 Extract Competitor SERP Rankings from Google Search to Sheets with Scrape.do This template requires a self-hosted n8n instance to run. A complete n8n automation that extracts competitor data from Google search results for specific keywords and target countries using Scrape.do SERP API, and saves structured results into Google Sheets for SEO, competitive analysis, and market research. 📋 Overview This workflow provides a lightweight competitor analysis solution that identifies ranking websites for chosen keywords across different countries. Ideal for SEO specialists, content strategists, and digital marketers who need structured SERP insights without manual effort. Who is this for? SEO professionals tracking keyword competitors Digital marketers conducting market analysis Content strategists planning based on SERP insights Business analysts researching competitor positioning Agencies automating SEO reporting What problem does this workflow solve? Eliminates manual SERP scraping Processes multiple keywords across countries Extracts structured data (position, title, URL, description) Automates saving results into Google Sheets Ensures repeatable & consistent methodology ⚙️ What this workflow does Manual Trigger → Starts the workflow manually Get Keywords from Sheet → Reads keywords + target countries from a Google Sheet URL Encode Keywords → Converts keywords into URL-safe format Process Keywords in Batches → Handles multiple keywords sequentially to avoid rate limits Fetch Google Search Results → Calls Scrape.do SERP API to retrieve raw HTML of Google SERPs Extract Competitor Data from HTML → Parses HTML into structured competitor data (top 10 results) Append Results to Sheet → Writes structured SERP results into a Google Sheet 📊 Output Data Points | Field | Description | Example | |--------------------|------------------------------------------|-------------------------------------------| | Keyword | Original search term | digital marketing services | | Target 
Country | 2-letter ISO code of target region | US | | position | Ranking position in search results | 1 | | websiteTitle | Page title from SERP result | Digital Marketing Software & Tools | | websiteUrl | Extracted website URL | https://www.hubspot.com/marketing | | websiteDescription | Snippet/description from search results | Grow your business with HubSpot’s tools… | ⚙️ Setup Prerequisites n8n instance (self-hosted) Google account with Sheets access Scrape.do* account with *SERP API token** Google Sheet Structure This workflow uses one Google Sheet with two tabs: Input Tab: "Keywords" | Column | Type | Description | Example | |----------|------|-------------|---------| | Keyword | Text | Search query | digital marketing | | Target Country | Text | 2-letter ISO code | US | Output Tab: "Results" | Column | Type | Description | Example | |--------------------|-------|-------------|---------| | Keyword | Text | Original search term | digital marketing | | position | Number| SERP ranking | 1 | | websiteTitle | Text | Title of the page | Digital Marketing Software & Tools | | websiteUrl | URL | Website/page URL | https://www.hubspot.com/marketing | | websiteDescription | Text | Snippet text | Grow your business with HubSpot’s tools | 🛠 Step-by-Step Setup Import Workflow: Copy the JSON → n8n → Workflows → + Add → Import from JSON Configure **Scrape.do API**: Endpoint: https://api.scrape.do/ Parameter: token=YOUR_SCRAPEDO_TOKEN Add render=true for full HTML rendering Configure Google Sheets: Create a sheet with two tabs: Keywords (input), Results (output) Set up Google Sheets OAuth2 credentials in n8n Replace placeholders: YOUR_GOOGLE_SHEET_ID and YOUR_GOOGLE_SHEETS_CREDENTIAL_ID Run & Test: Add test data in Keywords tab Execute workflow → Check results in Results tab 🧰 How to Customize Add more fields**: Extend HTML parsing logic in the “Extract Competitor Data” node to capture extra data (e.g., domain, sitelinks). 
- **Filtering**: Exclude domains or results with custom rules.
- **Batch Size**: Adjust “Process Keywords in Batches” for speed vs. rate-limits.
- **Rate Limiting**: Insert a Wait node (e.g., 10–30 seconds) if API rate limits apply.
- **Multi-Sheet Output**: Save per-country or per-keyword results into separate tabs.

📊 Use Cases
- **SEO Competitor Analysis**: Identify top-ranking sites for target keywords
- **Market Research**: See how SERPs differ by region
- **Content Strategy**: Analyze titles & descriptions of competitor pages
- **Agency Reporting**: Automate competitor SERP snapshots for clients

📈 Performance & Limits
- **Single Keyword**: ~10–20 seconds (depends on Scrape.do response)
- **Batch of 10**: 3–5 minutes typical
- **Large Sets (50+)**: 20–40 minutes depending on API credits & batching
- **API Calls**: 1 Scrape.do request per keyword
- **Reliability**: 95%+ extraction success, 98%+ data accuracy

🧩 Troubleshooting
- **API error** → Check YOUR_SCRAPEDO_TOKEN and API credits
- **No keywords loaded** → Verify Google Sheet ID & tab name = Keywords
- **Permission denied** → Re-authenticate Google Sheets OAuth2 in n8n
- **Empty results** → Check parsing logic and verify search term validity
- **Workflow stops early** → Ensure batching loop (SplitInBatches) is properly connected

🤝 Support & Community
- n8n Forum: https://community.n8n.io
- n8n Docs: https://docs.n8n.io
- Scrape.do Dashboard: https://dashboard.scrape.do

🎯 Final Notes

This workflow provides a repeatable foundation for extracting competitor SERP rankings with Scrape.do and saving them to Google Sheets. You can extend it with filtering, richer parsing, or integration with reporting dashboards to create a fully automated SEO intelligence pipeline.
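The "URL Encode Keywords" step and the shape of the rows written to the Results tab can be sketched as follows. This is a hedged illustration only: the helper names are hypothetical, the use of Google's `gl` country parameter is an assumption about how the target country is applied, and the parsed results are sample data rather than real SERP output.

```javascript
// Turn a keyword and a 2-letter country code into a Google search URL.
function buildGoogleSearchUrl(keyword, countryCode) {
  const query = encodeURIComponent(keyword.trim());
  return `https://www.google.com/search?q=${query}&gl=${countryCode.toLowerCase()}`;
}

// Attach keyword, country, and a 1-based position to the top 10 parsed results.
function toResultRows(keyword, countryCode, results) {
  return results.slice(0, 10).map((result, index) => ({
    Keyword: keyword,
    'Target Country': countryCode,
    position: index + 1,
    websiteTitle: result.title,
    websiteUrl: result.url,
    websiteDescription: result.description,
  }));
}

const searchUrl = buildGoogleSearchUrl('digital marketing services', 'US');
const rows = toResultRows('digital marketing services', 'US', [
  {
    title: 'Digital Marketing Software & Tools',
    url: 'https://www.hubspot.com/marketing',
    description: 'Grow your business with HubSpot’s tools…',
  },
]);
console.log(searchUrl, rows);
```

The resulting `searchUrl` would then be passed, URL-encoded again, as the target of the Scrape.do request.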
by Alejandro Scuncia
An extendable RAG template to build powerful, explainable AI assistants — with query understanding, semantic metadata, and support for free-tier tools like Gemini, Gemma and Supabase. Description This workflow helps you build smart, production-ready RAG agents that go far beyond basic document Q&A. It includes: ✅ File ingestion and chunking ✅ Asynchronous LLM-powered enrichment ✅ Filterable metadata-based search ✅ Gemma-based query understanding and generation ✅ Cohere re-ranking ✅ Memory persistence via Postgres Everything is modular, low-cost, and designed to run even with free-tier LLMs and vector databases. Whether you want to build a chatbot, internal knowledge assistant, documentation search engine, or a filtered content explorer — this is your foundation. ⚙️ How It Works This workflow is divided into 3 pipelines: 📥 Ingestion Upload a PDF via form Extract text and chunk it for embedding Store in Supabase vector store using Google Gemini embeddings 🧠 Enrichment (Async) Scheduled task fetches new chunks Each chunk is enriched with LLM metadata (topics, use_case, risks, audience level, summary, etc.) 
- Metadata is added to the vector DB for improved retrieval and filtering

🤖 Agent Chat
- A user question triggers the RAG agent
- Query Builder transforms it into keywords and filters
- Vector DB is queried and reranked
- The final answer is generated using only retrieved evidence, with references
- Chat memory is managed via Postgres

🌟 Key Features
- **Asynchronous enrichment** → Save tokens, batch process with free-tier LLMs like Gemma
- **Metadata-aware** → Improved filtering and reranking
- **Explainable answers** → Agent cites sources and sections
- **Chat memory** → Persistent context with Postgres
- **Modular design** → Swap LLMs, rerankers, vector DBs, and even enrichment schema
- **Free to run** → Built with Gemini, Gemma, Cohere, Supabase (free tier-compatible)

🔐 Required Credentials

| Tool | Use |
| --- | --- |
| Supabase w/ PostgreSQL | Vector DB + storage |
| Google Gemini/Gemma | Embeddings & LLM |
| Cohere API | Re-ranking |
| PostgreSQL | Chat memory |

🧰 Customization Tips
- Swap extractFromFile with Notion/Google Drive integrations
- Extend the Metadata Obtention prompt to fit your domain (e.g., financial, legal)
- Replace LLMs with OpenAI, Mistral, or Ollama
- Replace Postgres Chat Memory with Simple Memory or any other memory node
- Use a webhook instead of a form to automate ingestion
- Connect to Telegram/Slack UI with a few extra nodes

💡 Use Cases
- Company knowledge base bot (internal docs, SOPs)
- Educational assistant with smart filtering (by topic or level)
- Legal or policy assistant that cites source sections
- Product documentation Q&A with multi-language support
- Training material assistant that highlights risks/examples
- Content Generation

🧠 Who It’s For
- Indie developers building smart chatbots
- AI consultants prototyping Q&A assistants
- Teams looking for an internal knowledge agent
- Anyone building affordable, explainable AI tools

🚀 Try It Out!

Deploy a modular RAG assistant using n8n, Supabase, and Gemini, fully customizable and almost free to run.

1. 📁 Prepare Your PDFs
Use any internal documents, manuals, or reports in PDF format.
Optional: Add Google Drive integration to automate ingestion.

2. 🧩 Set Up Supabase
Create a free Supabase project. Use the table creation queries included in the workflow to set up your schema. Add your supabaseUrl and supabaseKey in your n8n credentials.

> 💡 Pro Tip: Make sure you match the embedding dimensions to your model. This workflow uses Gemini text-embedding-004 (768-dim); if switching to OpenAI, change your table vector size to 1536.

3. 🧠 Connect Gemini & Gemma
Use Gemini/Gemma for embeddings and optional metadata enrichment. Or deploy locally for lightweight async LLM processing (via Ollama/HuggingFace).

4. ⚙️ Import the Workflow in n8n
Open n8n (self-hosted or cloud). Import the workflow file and paste your credentials. You’re ready to ingest, enrich, and query your document base.

💬 Have Feedback or Ideas? I’d Love to Hear

This project is open, modular, and evolving, just like great workflows should be :). If you’ve tried it, built on top of it, or have suggestions for improvement, I’d genuinely love to hear from you. Let’s share ideas, collaborate, or just connect as part of the n8n builder community.

📧 ascuncia.es@gmail.com
🔗 LinkedIn
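For readers unfamiliar with the "extract text and chunk it for embedding" step in the ingestion pipeline, the sketch below shows one simple way to split text into overlapping chunks. It is an illustrative stand-in, not the splitter the workflow actually uses; the function name `chunkText` and the default sizes are assumptions.

```javascript
// Split text into fixed-size character chunks that overlap with their neighbours.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize');
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current chunk already reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const demoChunks = chunkText('abcdefghij', 4, 2);
console.log(demoChunks);
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which tends to help retrieval at the cost of storing some text twice.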