by Avkash Kakdiya
How it works

This workflow starts whenever a new domain is added to a Google Sheet. It cleans the domain, fetches traffic insights from SimilarWeb, extracts the most relevant metrics, and updates the sheet with enriched data. Optionally, it can also send this information to Airtable for further tracking or analysis.

Step-by-step

1. Trigger on New Domain: The workflow starts when a new row is added to the Google Sheet and captures the raw URL/domain entered by the user.
2. Clean Domain URL: Strips unnecessary parts like http://, https://, www., and trailing slashes, then stores a clean domain format (e.g., example.com) along with the row number.
3. Fetch Website Analysis: Uses the SimilarWeb API to pull traffic and engagement insights for the domain, including global rank, country rank, category rank, total visits, bounce rate, and more.
4. Extract Key Metrics: Processes raw SimilarWeb data into a simplified structure. Extracted insights include:
   - Ranks: Global, Country, and Category
   - Traffic Overview: Total Visits, Bounce Rate, Pages per Visit, Avg Visit Duration
   - Top Traffic Sources: Direct, Search, Social
   - Top Countries (Top 3): With traffic share percentages
   - Device Split: Mobile vs Desktop
5. Update Google Sheet: Writes the cleaned and enriched domain data back into the same (or another) Google Sheet, ensuring each row is updated with the new traffic insights.
6. Export to Airtable (Optional): Creates a new record in Airtable with the enriched traffic metrics. Useful if you want to manage or visualize company/domain data outside of Google Sheets.

Why use this?

- Automatically enriches domain lists with live traffic data from SimilarWeb
- Cleans messy URLs into a standard format
- Saves hours of manual research on company traffic insights
- Provides structured, comparable metrics for better decision-making
- Flexible: update sheets, export to Airtable, or both
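The "Clean Domain URL" step described above can be sketched as a small function in an n8n Code node. The function name and exact rules are illustrative assumptions; the template's own node may differ in detail.

```javascript
// Hypothetical sketch of the "Clean Domain URL" step: strips the protocol,
// a leading "www.", any path or query string, and trailing slashes,
// leaving a bare lowercase domain such as "example.com".
function cleanDomain(rawUrl) {
  return rawUrl
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\//, '') // drop http:// or https://
    .replace(/^www\./, '')       // drop leading www.
    .split('/')[0]               // drop any path (also removes trailing slash)
    .split('?')[0];              // drop query strings
}

// Example: "https://www.Example.com/about/" becomes "example.com"
```

In the workflow this would run once per new row, with the cleaned value passed on alongside the row number for the later sheet update.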
by Welat Eren
What this workflow does

This workflow turns your Spotify listening history into vocabulary flashcards for language learning. When you listen to music in your target language, you hear words hundreds of times without trying; they're already in your subconscious. This workflow extracts those words and turns them into flashcards, so you're connecting meaning to sounds you already know.

Every Sunday, the workflow:

1. Fetches your recently played songs from Spotify
2. Finds lyrics via lrclib.net
3. Extracts 40-60 useful words (B1-B2 level) using Google Gemini
4. Deduplicates against all previously learned words
5. Writes new vocabulary to Google Sheets (master + weekly tab)

Review with the free Flashcard Lab app (iOS + Android), which reads directly from Google Sheets with spaced repetition. Works for any language; just change the AI prompt.

Setup

Setup time: ~15 minutes

1. Google Cloud Console: Create a project, enable the Sheets API + Drive API, create OAuth2 credentials, and set the app to "In Production" (tokens expire after 7 days in Testing mode)
2. Gemini API: Get a free key from Google AI Studio
3. Spotify Developer Dashboard: Create an app and note the Client ID + Secret
4. Google Sheet: Create a sheet with a tab named "All Vocabularies" and headers Word + Translation
5. Import the workflow, connect credentials, and select your Google Sheet in all Sheets nodes
6. Click "Execute Workflow" to test
7. Enable the schedule trigger for weekly runs

Changing the language

Edit the prompt in "Prepare all Lyrics into Pairs"; that's the only place you need to change. All other nodes use generic Word and Translation columns.

> 🧠 Tip: Listen to music in the language you're learning. The whole point is that your brain already absorbed these words passively; the flashcards connect meaning to sounds you already know.

Full documentation

📖 GitHub Repository
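The deduplication step above (comparing new words against the "All Vocabularies" master tab) could look like this sketch. The field names `word`/`translation` are assumptions for illustration; the template uses generic Word and Translation columns.

```javascript
// Hedged sketch of the dedup step: keep only word/translation pairs whose
// word is not already in the master list (case-insensitive comparison).
function dedupeVocabulary(newPairs, knownWords) {
  const known = new Set(knownWords.map((w) => w.toLowerCase()));
  return newPairs.filter((pair) => !known.has(pair.word.toLowerCase()));
}

const fresh = dedupeVocabulary(
  [
    { word: 'Haus', translation: 'house' },
    { word: 'Baum', translation: 'tree' },
  ],
  ['haus'] // words already learned, as read from "All Vocabularies"
);
// fresh keeps only { word: 'Baum', translation: 'tree' }
```

Only the surviving pairs would then be appended to the master and weekly tabs.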
by Fei Wu
Reddit Post Saver & Summarizer with AI-Powered Filtering

Who This Is For

Perfect for content curators, researchers, developers, and community managers who want to build a structured database of valuable Reddit content without manual data entry. If you're tracking industry trends, gathering user feedback, or building a knowledge base from Reddit discussions, this workflow automates the entire process.

The Problem It Solves

Reddit has incredible discussions, but manually copying posts, extracting insights, and organizing them into a database is time-consuming. This workflow automatically transforms your saved Reddit posts into structured, searchable data, complete with AI-generated summaries of both the post and its comment section.

How It Works

1. Save Posts Manually: Simply use Reddit's built-in save feature on any post you find valuable.
2. Automated Daily Processing: The workflow triggers once per day and:
   - Fetches all your saved Reddit posts via the Reddit API
   - Filters posts by subreddit and custom conditions (e.g., "only posts about JavaScript frameworks" or "posts with more than 100 upvotes")
   - Uses an LLM (Google Gemini) to verify posts match your natural language criteria
   - Generates comprehensive summaries of both the original post and top comments
3. Structured Database Storage: Filtered and summarized posts are automatically saved to your Supabase database with this structure:

```json
{
  "reddit_id": "unique post identifier",
  "title": "post title",
  "url": "direct link to Reddit post",
  "summary": "AI-generated summary of post and comments",
  "tags": ["array", "of", "relevant", "tags"],
  "post_date": "original post creation date",
  "upvotes": "number of upvotes",
  "num_comments": "total comment count"
}
```

Setup Requirements

- **Reddit API credentials** (client ID and secret)
- **Supabase account** with a database table
- **Google Gemini API key** (or alternative LLM provider)
- Basic configuration of filter conditions (subreddit names and natural language criteria)

Use Cases

- **Product Research**: Track competitor mentions and feature requests
- **Content Creation**: Build a library of trending topics in your niche
- **Community Management**: Monitor feedback across multiple subreddits
- **Academic Research**: Collect and analyze discussions on specific topics
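The non-LLM part of the filter step (subreddit + upvote conditions, before posts are handed to Gemini for the natural-language check) might be sketched like this. All names here are illustrative assumptions, not the template's actual code.

```javascript
// Illustrative sketch: pre-filter saved posts by subreddit allowlist and
// upvote threshold, so only promising candidates reach the LLM check.
function filterSavedPosts(posts, { subreddits, minUpvotes }) {
  const allowed = new Set(subreddits.map((s) => s.toLowerCase()));
  return posts.filter(
    (p) => allowed.has(p.subreddit.toLowerCase()) && p.upvotes >= minUpvotes
  );
}

const kept = filterSavedPosts(
  [
    { subreddit: 'javascript', upvotes: 250, title: 'Framework roundup' },
    { subreddit: 'aww', upvotes: 9000, title: 'Cat picture' },
  ],
  { subreddits: ['javascript'], minUpvotes: 100 }
);
// kept contains only the r/javascript post
```

Doing the cheap structural filtering first keeps LLM token usage down, since Gemini only sees posts that already pass the hard conditions.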
by vinci-king-01
Job Posting Aggregator with Email and GitHub

⚠️ COMMUNITY TEMPLATE DISCLAIMER: This is a community-contributed template that uses ScrapeGraphAI (a community node). Please ensure you have the ScrapeGraphAI community node installed in your n8n instance before using this template.

This workflow automatically aggregates certification-related job-posting requirements from multiple industry sources, compares them against last year's data stored in GitHub, and emails a concise change log to subscribed professionals. It streamlines annual requirement checks and renewal reminders, ensuring users never miss an update.

Pre-conditions/Requirements

Prerequisites

- n8n instance (self-hosted or n8n Cloud)
- ScrapeGraphAI community node installed
- Git installed (for optional local testing of the repo)
- Working SMTP server or another email credential supported by n8n

Required Credentials

- **ScrapeGraphAI API Key**: Enables web scraping of certification pages
- **GitHub Personal Access Token**: Allows the workflow to read/write files in the repo
- **Email / SMTP Credentials**: Sends the summary email to end users

Specific Setup Requirements

| Resource | Purpose | Example |
|----------|---------|---------|
| GitHub Repository | Stores certification_requirements.json versioned annually | https://github.com/<you>/cert-requirements.git |
| Watch List File | List of page URLs & selectors to scrape | Saved in the repo under /config/watchList.json |
| Email List | Semicolon-separated list of recipients | me@company.com;team@company.com |

How it works

Key Steps:

1. **Manual Trigger**: Starts the workflow on demand or via a scheduled cron.
2. **Load Watch List (Code Node)**: Reads the list of certification URLs and CSS selectors.
3. **Split In Batches**: Iterates through each URL to avoid rate limits.
4. **ScrapeGraphAI**: Scrapes requirement details from each page.
5. **Merge (Wait)**: Reassembles individual scrape results into a single JSON array.
6. **GitHub (Read File)**: Retrieves last year's certification_requirements.json.
7. **IF (Change Detector)**: Compares current vs. previous JSON and decides whether changes exist.
8. **Email Send**: Composes and sends a formatted summary of changes.
9. **GitHub (Upsert File)**: Commits the new JSON file back to the repo for future comparisons.

Set up steps

Setup Time: 15-25 minutes

1. Install Community Node: From the n8n UI, go to Settings → Community Nodes, then search for and install "ScrapeGraphAI".
2. Create/Clone GitHub Repo: Add an empty certification_requirements.json ({}) and a config/watchList.json with an array of objects like:

```json
[
  { "url": "https://cert-body.org/requirements", "selector": "#requirements" }
]
```

3. Generate GitHub PAT: Scope it to repo and store it in n8n Credentials as "GitHub API".
4. Add ScrapeGraphAI Credential: Paste your API key into n8n Credentials.
5. Configure Email Credentials: E.g., SMTP with username/password or OAuth2.
6. Open Workflow: Import the template JSON into n8n.
7. Update Environment Variables (in the Code node or via n8n variables): GITHUB_REPO (e.g., user/cert-requirements) and EMAIL_RECIPIENTS.
8. Test Run: Trigger manually. Verify the email content and GitHub commit.
9. Schedule: Add a Cron node (optional) for yearly or quarterly automatic runs.

Node Descriptions

Core Workflow Nodes:

- **Manual Trigger**: Initiates the workflow manually or via an external schedule.
- **Code (Load Watch List)**: Reads and parses watchList.json from GitHub or static input.
- **SplitInBatches**: Controls request concurrency to avoid scraping bans.
- **ScrapeGraphAI**: Extracts requirement text using provided CSS selectors or XPath.
- **Merge (Combine)**: Waits for all batches and merges them into one dataset.
- **GitHub (Read/Write File)**: Handles version-controlled storage of JSON data.
- **IF (Change Detector)**: Compares hashes/JSON diffs to detect updates.
- **EmailSend**: Sends the change log, including renewal reminders and a diff summary.
- **Sticky Note**: Provides in-workflow documentation for future editors.

Data Flow:

Manual Trigger → Code (Load Watch List) → SplitInBatches
SplitInBatches → ScrapeGraphAI → Merge
Merge → GitHub (Read File) → IF (Change Detector)
IF (True) → Email Send → GitHub (Upsert File)

Customization Examples

Adjusting Scraper Configuration (inside the Watch List JSON object):

```json
{
  "url": "https://new-association.com/cert-update",
  "selector": ".content article:nth-of-type(1) ul"
}
```

Custom Email Template (in Email Send node → HTML Content):

```
📋 Certification Updates: {{ $json.date }}

The following certifications have new requirements:

{{ $json.diffHtml }}

For full details visit our GitHub repo.
```

Data Output Format

The workflow outputs structured JSON data:

```json
{
  "timestamp": "2024-09-01T12:00:00Z",
  "source": "watchList.json",
  "current": {
    "AWS-SAA": "Version 3.0, requires renewed proctored exam",
    "PMP": "60 PDUs every 3 years"
  },
  "previous": {
    "AWS-SAA": "Version 2.0",
    "PMP": "60 PDUs every 3 years"
  },
  "changes": {
    "AWS-SAA": "Updated to Version 3.0; exam format changed."
  }
}
```

Troubleshooting

Common Issues

- ScrapeGraphAI returns empty data → Check CSS/XPath selectors and ensure the page is publicly accessible.
- GitHub authentication fails → Verify the PAT scope includes repo and that the credential is linked in both GitHub nodes.

Performance Tips

- Limit the SplitInBatches size to 3-5 URLs when sources are heavy to avoid timeouts.
- Enable n8n execution mode "Queue" for long-running scrapes.

Pro Tips:

- Store selector samples in comments next to each watch list entry for future maintenance.
- Use a Cron node set to "0 0 1 1 *" for an annual run exactly on Jan 1st.
- Add a Telegram node after Email Send for instant mobile notifications.
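The IF (Change Detector) comparison between the current scrape and last year's certification_requirements.json could be sketched like this. The object shape mirrors the sample output above; the template's actual comparison logic may differ.

```javascript
// Sketch of the Change Detector: compare this run's requirements object
// against last year's snapshot and collect entries that changed or are new.
function detectChanges(current, previous) {
  const changes = {};
  for (const [cert, requirement] of Object.entries(current)) {
    if (previous[cert] !== requirement) {
      changes[cert] = requirement;
    }
  }
  return changes;
}

const diff = detectChanges(
  { 'AWS-SAA': 'Version 3.0', PMP: '60 PDUs every 3 years' },
  { 'AWS-SAA': 'Version 2.0', PMP: '60 PDUs every 3 years' }
);
// diff -> { 'AWS-SAA': 'Version 3.0' }
```

An empty `changes` object would route the IF node to its false branch, skipping the email and the GitHub upsert.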
by vinci-king-01
Certification Requirement Tracker with Rocket.Chat and GitLab

⚠️ COMMUNITY TEMPLATE DISCLAIMER: This is a community-contributed template that uses ScrapeGraphAI (a community node). Please ensure you have the ScrapeGraphAI community node installed in your n8n instance before using this template.

This workflow automatically monitors websites of certification bodies and industry associations, detects changes in certification requirements, commits the updated information to a GitLab repository, and notifies a Rocket.Chat channel. Ideal for professionals and compliance teams who must stay ahead of annual updates and renewal deadlines.

Pre-conditions/Requirements

Prerequisites

- Running n8n instance (self-hosted or n8n.cloud)
- ScrapeGraphAI community node installed and active
- Rocket.Chat workspace (self-hosted or cloud)
- GitLab account and repository for documentation
- Publicly reachable URL for incoming webhooks (use the n8n tunnel, Ngrok, or a reverse proxy)

Required Credentials

- **ScrapeGraphAI API Key**: Enables scraping of certification pages
- **Rocket.Chat Access Token & Server URL**: To post update messages
- **GitLab Personal Access Token**: With api and write_repository scopes

Specific Setup Requirements

| Item | Example Value | Notes |
|------|---------------|-------|
| GitLab Repo | gitlab.com/company/cert-tracker | Markdown files will be committed here |
| Rocket.Chat Channel | #certification-updates | Receives update alerts |
| Certification Source URLs file | /data/sourceList.json in the repository | List of URLs to scrape |

How it works
Key Steps:

1. **Webhook Trigger**: Fires on a scheduled HTTP call (e.g., via cron) or a manual trigger.
2. **Code (Prepare Source List)**: Reads/constructs a list of certification URLs to scrape.
3. **ScrapeGraphAI**: Fetches HTML content and extracts requirement sections.
4. **Merge**: Combines newly scraped data with the last committed snapshot.
5. **IF Node**: Determines if a change occurred (hash/length comparison).
6. **GitLab**: Creates a branch, commits updated Markdown/JSON files, and opens an MR (optional).
7. **Rocket.Chat**: Posts a message summarizing changes and linking to the GitLab diff.
8. **Respond to Webhook**: Returns a JSON summary to the requester (useful for monitoring or chained automations).

Set up steps

Setup Time: 20-30 minutes

1. Install Community Node: In the n8n UI, go to Settings → Community Nodes and install @n8n/community-node-scrapegraphai.
2. Create Credentials:
   a. ScrapeGraphAI: paste your API key.
   b. Rocket.Chat: create a personal access token (Personal Access Tokens → New Token) and configure credentials.
   c. GitLab: create a PAT with api + write_repository scopes and add it to n8n.
3. Clone the Template: Import this workflow JSON into your n8n instance.
4. Edit StickyNote: Replace placeholder URLs with actual certification-source URLs or point to a repo file.
5. Configure GitLab Node: Set your repository, default branch, and commit message template.
6. Configure Rocket.Chat Node: Select the credential, channel, and message template (Markdown supported).
7. Expose Webhook: If self-hosting, enable the n8n tunnel or configure a reverse proxy to make the webhook public.
8. Test Run: Trigger the workflow manually; verify the GitLab commit/MR and the Rocket.Chat notification.
9. Automate: Schedule an external cron (or n8n Cron node) to POST to the webhook yearly, quarterly, or monthly as needed.

Node Descriptions

Core Workflow Nodes:

- **stickyNote**: Human-readable instructions/documentation embedded in the flow.
- **webhook**: Entry point; accepts POST /cert-tracker requests.
- **code (Prepare Source List)**: Generates an array of URLs; can pull from GitLab or an environment variable.
- **scrapegraphAi**: Scrapes each URL and extracts certification requirement sections using CSS/XPath selectors.
- **merge (by key)**: Joins new data with the previous snapshot for change detection.
- **if (Changes?)**: Branches logic based on whether differences exist.
- **gitlab**: Creates/updates files and opens merge requests containing new requirements.
- **rocketchat**: Sends a formatted update to the designated channel.
- **respondToWebhook**: Returns 200 OK with a JSON summary.

Data Flow:

webhook → code → scrapegraphAi → merge → if
if (true) → gitlab → rocketchat
if (false) → respondToWebhook

Customization Examples

Change Scraping Frequency (replace the external cron with an n8n Cron node):

```json
{
  "nodes": [
    {
      "name": "Cron",
      "type": "n8n-nodes-base.cron",
      "parameters": {
        "schedule": { "hour": "0", "minute": "0", "dayOfMonth": "1" }
      }
    }
  ]
}
```

Extend Notification Message (Rocket.Chat node → Message field):

```javascript
const diffUrl = $json["gitlab_diff_url"];
const count = $json["changes_count"];
return `:bell: ${count} Certification Requirement Update(s)\n\nView diff: ${diffUrl}`;
```

Data Output Format

The workflow outputs structured JSON data:

```json
{
  "timestamp": "2024-05-15T12:00:00Z",
  "changesDetected": true,
  "changesCount": 3,
  "gitlab_commit_sha": "a1b2c3d4",
  "gitlab_diff_url": "https://gitlab.com/company/cert-tracker/-/merge_requests/42",
  "notifiedChannel": "#certification-updates"
}
```

Troubleshooting

Common Issues

- ScrapeGraphAI returns empty results → Verify your CSS/XPath selectors and API key quota.
- GitLab commit fails (401 Unauthorized) → Ensure the PAT has api and write_repository scopes and is not expired.

Performance Tips

- Limit the number of pages scraped per run to avoid API rate limits.
- Cache last-scraped HTML in an S3 bucket or database to reduce redundant requests.

Pro Tips:

- Use GitLab CI to auto-deploy the documentation site whenever new certification files are merged.
- Enable Rocket.Chat threading to keep discussions organized per update.
- Tag stakeholders in Rocket.Chat messages with @cert-team for instant visibility.
by Onur
🏠 Extract Zillow Property Data to Google Sheets with Scrape.do

This template requires a self-hosted n8n instance to run.

A complete n8n automation that extracts property listing data from Zillow URLs using the Scrape.do web scraping API, parses key property information, and saves structured results into Google Sheets for real estate analysis, market research, and property tracking.

📋 Overview

This workflow provides a lightweight real estate data extraction solution that pulls property details from Zillow listings and organizes them into a structured spreadsheet. Ideal for real estate professionals, investors, market analysts, and property managers who need automated property data collection without manual effort.

Who is this for?

- Real estate investors tracking properties
- Market analysts conducting property research
- Real estate agents monitoring listings
- Property managers organizing data
- Data analysts building real estate databases

What problem does this workflow solve?

- Eliminates manual copy-paste from Zillow
- Processes multiple property URLs in bulk
- Extracts structured data (price, address, zestimate, etc.)
- Automates saving results into Google Sheets
- Ensures repeatable & consistent data collection

⚙️ What this workflow does

1. Manual Trigger: Starts the workflow manually
2. Read Zillow URLs from Google Sheets: Reads property URLs from a Google Sheet
3. Scrape Zillow URL via Scrape.do: Fetches the full HTML from Zillow (bypasses PerimeterX protection)
4. Parse Zillow Data: Extracts structured property information from the HTML
5. Write Results to Google Sheets: Saves parsed data into a results sheet

📊 Output Data Points

| Field | Description | Example |
|-------|-------------|---------|
| URL | Original Zillow listing URL | https://www.zillow.com/homedetails/... |
| Price | Property listing price | $300,000 |
| Address | Street address | 8926 Silver City |
| City | City name | San Antonio |
| State | State abbreviation | TX |
| Days on Zillow | How long listed | 5 |
| Zestimate | Zillow's estimated value | $297,800 |
| Scraped At | Timestamp of extraction | 2025-01-29T12:00:00.000Z |

⚙️ Setup

Prerequisites

- n8n instance (self-hosted)
- Google account with Sheets access
- Scrape.do account with API token (get 1000 free credits/month)

Google Sheet Structure

This workflow uses one Google Sheet with two tabs:

Input Tab: "Sheet1"

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| URLs | URL | Zillow listing URL | https://www.zillow.com/homedetails/123... |

Output Tab: "Results"

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| URL | URL | Original listing URL | https://www.zillow.com/homedetails/... |
| Price | Text | Property price | $300,000 |
| Address | Text | Street address | 8926 Silver City |
| City | Text | City name | San Antonio |
| State | Text | State code | TX |
| Days on Zillow | Number | Days listed | 5 |
| Zestimate | Text | Estimated value | $297,800 |
| Scraped At | Timestamp | When scraped | 2025-01-29T12:00:00.000Z |

🚀 Step-by-Step Setup

1. Import Workflow: Copy the JSON → n8n → Workflows → + Add → Import from JSON
2. Configure Scrape.do API:
   - Sign up at the Scrape.do Dashboard and get your API token
   - In the HTTP Request node, replace YOUR_SCRAPE_DO_TOKEN with your actual token
   - The workflow uses super=true for premium residential proxies (10 credits per request)
3. Configure Google Sheets:
   - Create a new Google Sheet with two tabs: "Sheet1" (input) and "Results" (output)
   - In Sheet1, add the header "URLs" in cell A1 and Zillow URLs starting from A2
   - Set up Google Sheets OAuth2 credentials in n8n
   - Replace YOUR_SPREADSHEET_ID with your actual Google Sheet ID
   - Replace YOUR_GOOGLE_SHEETS_CREDENTIAL_ID with your credential ID
4. Run & Test:
   - Add 1-2 test Zillow URLs in Sheet1
   - Click "Execute workflow"
   - Check results in the Results tab

🧰 How to Customize

- **Add more fields**: Extend the parsing logic in the "Parse Zillow Data" node to capture additional data (bedrooms, bathrooms, square footage)
- **Filtering**: Add conditions to skip certain properties or price ranges
- **Rate Limiting**: Insert a Wait node between requests if processing many URLs
- **Error Handling**: Add error branches to handle failed scrapes gracefully
- **Scheduling**: Replace the Manual Trigger with a Schedule Trigger for automated daily/weekly runs

🎯 Use Cases

- **Investment Analysis**: Track property prices and zestimates over time
- **Market Research**: Analyze listing trends in specific neighborhoods
- **Portfolio Management**: Monitor properties for sale in target areas
- **Competitive Analysis**: Compare similar properties across locations
- **Lead Generation**: Build databases of properties matching specific criteria

📈 Performance & Limits

- **Single Property**: ~5-10 seconds per URL
- **Batch of 10**: 1-2 minutes typical
- **Large Sets (50+)**: 5-10 minutes depending on Scrape.do credits
- **API Calls**: 1 Scrape.do request per URL (10 credits with super=true)
- **Reliability**: 95%+ success rate with premium proxies

🔧 Troubleshooting

| Problem | Solution |
|---------|----------|
| API error 400 | Check your Scrape.do token and credits |
| URL showing "undefined" | Verify the Google Sheet column name is "URLs" (capital U) |
| No data parsed | Check if Zillow changed their HTML structure |
| Permission denied | Re-authenticate Google Sheets OAuth2 in n8n |
| 50000 character error | Verify the Parse Zillow Data code is extracting fields, not returning raw HTML |
| Price shows HTML/CSS | Update the price extraction regex in the Parse Zillow Data node |

🤝 Support & Community

- Scrape.do Documentation
- Scrape.do Dashboard
- Scrape.do Zillow Scraping Guide
- n8n Forum
- n8n Docs

🎯 Final Notes

This workflow provides a repeatable foundation for extracting Zillow property data with Scrape.do and saving it to Google Sheets. You can extend it with:

- Historical tracking (append timestamps)
- Price change alerts (compare with previous scrapes)
- Multi-platform scraping (Redfin, Realtor.com)
- Integration with CRM or reporting dashboards

Important: Scrape.do handles all anti-bot bypassing (PerimeterX, CAPTCHAs) automatically with rotating residential proxies, so you only pay for successful requests. Always use the super=true parameter for Zillow to ensure high success rates.
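As a taste of what the "Parse Zillow Data" node does for one field, a price extraction could be sketched like this. The sample markup and regex are illustrative assumptions; Zillow's real HTML differs and changes over time, which is exactly why the troubleshooting table above mentions updating the regex.

```javascript
// Hypothetical sketch of one field in "Parse Zillow Data": pull the price
// out of raw listing HTML with a simple regex.
function extractPrice(html) {
  const match = html.match(/\$[\d,]+/); // first dollar amount, e.g. "$300,000"
  return match ? match[0] : null;       // null when no price is found
}

const html = '<span class="price">$300,000</span>'; // illustrative markup
// extractPrice(html) -> "$300,000"
```

Returning only extracted fields (rather than raw HTML) also avoids the 50,000-character Google Sheets cell limit noted in the troubleshooting table.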
by Intuz
This n8n template from Intuz provides a complete solution to automate on-demand lead generation. It acts as a powerful scraping agent that takes a simple chat query, scours both Google Search and Google Maps for relevant businesses, scrapes their websites for contact details, and compiles an enriched lead list directly in Google Sheets.

Who's this workflow for?

- Sales Development Representatives (SDRs)
- Local Marketing Agencies
- Business Development Teams
- Freelancers & Consultants
- Market Researchers

How it works

1. Start with a Chat Query: The user initiates the workflow by typing a search query (e.g., "dentists in New York") into a chat interface.
2. Multi-Source Search: The workflow queries both the Google Custom Search API (for web results across multiple pages) and scrapes Google Maps (for local businesses) to gather a broad list of potential leads.
3. Deep Dive Website Scraping: For each unique business website found, the workflow visits the URL to scrape the raw HTML content of the page.
4. Intelligent Contact Extraction: Using custom code, it then parses the scraped website content to find and extract valuable contact information like email addresses, phone numbers, and social media links.
5. Deduplicate and Log to Sheets: Before saving, the workflow checks your Google Sheet to ensure the lead doesn't already exist. All unique, newly enriched leads are then appended as clean rows to your sheet, along with the original search query for tracking.

Key Requirements to Use This Template

1. n8n Instance & Required Nodes: An active n8n account (Cloud or self-hosted). This workflow uses the official n8n LangChain integration (@n8n/n8n-nodes-langchain) for the chat trigger. If you are using a self-hosted version of n8n, please ensure this package is installed.
2. Google Custom Search API: A Google Cloud Project with the "Custom Search API" enabled. You will need an API Key for this service.
You must also create a Programmable Search Engine and get its Search engine ID (cx). This tells Google what to search (e.g., the whole web).

3. Google Sheets Account: A Google account and a pre-made Google Sheet with columns for Business Name, Primary Email, Contact Number, URL, Description, Socials, and Search Query.

Setup Instructions

1. Configure the Chat Trigger: In the "When chat message received" node, you can find the Direct URL or Embed code to use the chat interface.
2. Set Up Google Custom Search API (Crucial Step): Go to the "Custom Google Search API" (HTTP Request) node. Under "Query Parameters", you must replace the placeholder values for key (with your API Key) and cx (with your Search Engine ID).
3. Configure Google Sheets: In all Google Sheets nodes (Append row in sheet, Get row(s) in sheet, etc.), connect your Google Sheets credentials. Select your target spreadsheet (Document ID) and the specific sheet (Sheet Name) where you want to store the leads.
4. Activate the Workflow: Save the workflow and toggle the "Active" switch to ON. Open the chat URL and enter a search query to start generating leads.

Connect with us

- Website: https://www.intuz.com/services
- Email: getstarted@intuz.com
- LinkedIn: https://www.linkedin.com/company/intuz
- Get Started: https://n8n.partnerlinks.io/intuz

For Custom Workflow Automation, click here: Get Started
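The "Intelligent Contact Extraction" step described in this template could be sketched as a Code-node function like the one below. The regex patterns are simplified assumptions for illustration; real pages need more robust patterns and the template's own custom code may differ.

```javascript
// Illustrative sketch: regex out email addresses and phone-like numbers
// from scraped page text, deduplicating the emails.
function extractContacts(text) {
  const emails = text.match(/[\w.+-]+@[\w-]+\.[A-Za-z]{2,}/g) || [];
  const phones = text.match(/\+?\d[\d\s().-]{7,}\d/g) || [];
  return { emails: [...new Set(emails)], phones };
}

const page = 'Call us at +1 212-555-0100 or write to office@example.com.';
const contacts = extractContacts(page);
// contacts.emails -> ['office@example.com']
// contacts.phones -> ['+1 212-555-0100']
```

Each business page's extracted contacts would then be matched against the existing sheet rows before appending, which is the deduplication step described above.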
by Onur
🔍 Extract Competitor SERP Rankings from Google Search to Sheets with Scrape.do

This template requires a self-hosted n8n instance to run.

A complete n8n automation that extracts competitor data from Google search results for specific keywords and target countries using the Scrape.do SERP API, and saves structured results into Google Sheets for SEO, competitive analysis, and market research.

📋 Overview

This workflow provides a lightweight competitor analysis solution that identifies ranking websites for chosen keywords across different countries. Ideal for SEO specialists, content strategists, and digital marketers who need structured SERP insights without manual effort.

Who is this for?

- SEO professionals tracking keyword competitors
- Digital marketers conducting market analysis
- Content strategists planning based on SERP insights
- Business analysts researching competitor positioning
- Agencies automating SEO reporting

What problem does this workflow solve?

- Eliminates manual SERP scraping
- Processes multiple keywords across countries
- Extracts structured data (position, title, URL, description)
- Automates saving results into Google Sheets
- Ensures a repeatable & consistent methodology

⚙️ What this workflow does

1. Manual Trigger: Starts the workflow manually
2. Get Keywords from Sheet: Reads keywords + target countries from a Google Sheet
3. URL Encode Keywords: Converts keywords into a URL-safe format
4. Process Keywords in Batches: Handles multiple keywords sequentially to avoid rate limits
5. Fetch Google Search Results: Calls the Scrape.do SERP API to retrieve the raw HTML of Google SERPs
6. Extract Competitor Data from HTML: Parses the HTML into structured competitor data (top 10 results)
7. Append Results to Sheet: Writes structured SERP results into a Google Sheet

📊 Output Data Points

| Field | Description | Example |
|-------|-------------|---------|
| Keyword | Original search term | digital marketing services |
| Target Country | 2-letter ISO code of target region | US |
| position | Ranking position in search results | 1 |
| websiteTitle | Page title from SERP result | Digital Marketing Software & Tools |
| websiteUrl | Extracted website URL | https://www.hubspot.com/marketing |
| websiteDescription | Snippet/description from search results | Grow your business with HubSpot's tools… |

⚙️ Setup

Prerequisites

- n8n instance (self-hosted)
- Google account with Sheets access
- Scrape.do account with a SERP API token

Google Sheet Structure

This workflow uses one Google Sheet with two tabs:

Input Tab: "Keywords"

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| Keyword | Text | Search query | digital marketing |
| Target Country | Text | 2-letter ISO code | US |

Output Tab: "Results"

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| Keyword | Text | Original search term | digital marketing |
| position | Number | SERP ranking | 1 |
| websiteTitle | Text | Title of the page | Digital Marketing Software & Tools |
| websiteUrl | URL | Website/page URL | https://www.hubspot.com/marketing |
| websiteDescription | Text | Snippet text | Grow your business with HubSpot's tools |

🚀 Step-by-Step Setup

1. Import Workflow: Copy the JSON → n8n → Workflows → + Add → Import from JSON
2. Configure Scrape.do API:
   - Endpoint: https://api.scrape.do/
   - Parameter: token=YOUR_SCRAPEDO_TOKEN
   - Add render=true for full HTML rendering
3. Configure Google Sheets:
   - Create a sheet with two tabs: Keywords (input) and Results (output)
   - Set up Google Sheets OAuth2 credentials in n8n
   - Replace the placeholders YOUR_GOOGLE_SHEET_ID and YOUR_GOOGLE_SHEETS_CREDENTIAL_ID
4. Run & Test:
   - Add test data in the Keywords tab
   - Execute the workflow and check results in the Results tab

🧰 How to Customize

- **Add more fields**: Extend the HTML parsing logic in the "Extract Competitor Data" node to capture extra data (e.g., domain, sitelinks).
- **Filtering**: Exclude domains or results with custom rules.
- **Batch Size**: Adjust "Process Keywords in Batches" for speed vs. rate limits.
- **Rate Limiting**: Insert a Wait node (e.g., 10-30 seconds) if API rate limits apply.
- **Multi-Sheet Output**: Save per-country or per-keyword results into separate tabs.

🎯 Use Cases

- **SEO Competitor Analysis**: Identify top-ranking sites for target keywords
- **Market Research**: See how SERPs differ by region
- **Content Strategy**: Analyze titles & descriptions of competitor pages
- **Agency Reporting**: Automate competitor SERP snapshots for clients

📈 Performance & Limits

- **Single Keyword**: ~10-20 seconds (depends on the Scrape.do response)
- **Batch of 10**: 3-5 minutes typical
- **Large Sets (50+)**: 20-40 minutes depending on API credits & batching
- **API Calls**: 1 Scrape.do request per keyword
- **Reliability**: 95%+ extraction success, 98%+ data accuracy

🔧 Troubleshooting

- **API error**: Check YOUR_SCRAPEDO_TOKEN and API credits
- **No keywords loaded**: Verify the Google Sheet ID & tab name = Keywords
- **Permission denied**: Re-authenticate Google Sheets OAuth2 in n8n
- **Empty results**: Check the parsing logic and verify search term validity
- **Workflow stops early**: Ensure the batching loop (SplitInBatches) is properly connected

🤝 Support & Community

- n8n Forum: https://community.n8n.io
- n8n Docs: https://docs.n8n.io
- Scrape.do Dashboard: https://dashboard.scrape.do

🎯 Final Notes

This workflow provides a repeatable foundation for extracting competitor SERP rankings with Scrape.do and saving them to Google Sheets. You can extend it with filtering, richer parsing, or integration with reporting dashboards to create a fully automated SEO intelligence pipeline.
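The "URL Encode Keywords" step and the Scrape.do request it feeds could be sketched together like this. The token and render parameters follow the setup section above; the exact Google URL shape (`q` and `gl` parameters) is an assumption for illustration.

```javascript
// Sketch: build the Scrape.do SERP request for one keyword + country.
// The inner Google URL must be URL-encoded again when passed as the
// `url` parameter, so spaces end up double-encoded (%20 -> %2520).
function buildSerpRequest(keyword, countryCode, token) {
  const googleUrl =
    `https://www.google.com/search?q=${encodeURIComponent(keyword)}` +
    `&gl=${countryCode.toLowerCase()}`;
  return `https://api.scrape.do/?token=${token}&render=true&url=${encodeURIComponent(googleUrl)}`;
}

const url = buildSerpRequest('digital marketing services', 'US', 'YOUR_SCRAPEDO_TOKEN');
// spaces in the keyword become %20 inside the Google URL,
// then %2520 once the Google URL itself is encoded
```

Double-encoding is the common gotcha here: if the inner Google URL is passed unencoded, its `&gl=...` parameter is swallowed by the Scrape.do query string instead of reaching Google.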
by Alejandro Scuncia
An extendable RAG template to build powerful, explainable AI assistants, with query understanding, semantic metadata, and support for free-tier tools like Gemini, Gemma, and Supabase.

Description
This workflow helps you build smart, production-ready RAG agents that go far beyond basic document Q&A. It includes:
✅ File ingestion and chunking
✅ Asynchronous LLM-powered enrichment
✅ Filterable metadata-based search
✅ Gemma-based query understanding and generation
✅ Cohere re-ranking
✅ Memory persistence via Postgres

Everything is modular, low-cost, and designed to run even with free-tier LLMs and vector databases. Whether you want to build a chatbot, internal knowledge assistant, documentation search engine, or a filtered content explorer, this is your foundation.

⚙️ How It Works
This workflow is divided into 3 pipelines:

📥 Ingestion
- Upload a PDF via form
- Extract text and chunk it for embedding
- Store it in the Supabase vector store using Google Gemini embeddings

🧠 Enrichment (Async)
- A scheduled task fetches new chunks
- Each chunk is enriched with LLM metadata (topics, use_case, risks, audience level, summary, etc.)
- Metadata is added to the vector DB for improved retrieval and filtering

🤖 Agent Chat
- A user question triggers the RAG agent
- The Query Builder transforms it into keywords and filters
- The vector DB is queried and results are reranked
- The final answer is generated using only retrieved evidence, with references
- Chat memory is managed via Postgres

🌟 Key Features
- **Asynchronous enrichment**: Save tokens and batch-process with free-tier LLMs like Gemma
- **Metadata-aware**: Improved filtering and reranking
- **Explainable answers**: The agent cites sources and sections
- **Chat memory**: Persistent context with Postgres
- **Modular design**: Swap LLMs, rerankers, vector DBs, and even the enrichment schema
- **Free to run**: Built with Gemini, Gemma, Cohere, and Supabase (free tier-compatible)

🔑 Required Credentials

| Tool | Use |
|-|-|
| Supabase w/ PostgreSQL | Vector DB + storage |
| Google Gemini/Gemma | Embeddings & LLM |
| Cohere API | Re-ranking |
| PostgreSQL | Chat memory |

🧰 Customization Tips
- Swap extractFromFile with Notion/Google Drive integrations
- Extend the Metadata Obtention prompt to fit your domain (e.g., financial, legal)
- Replace the LLMs with OpenAI, Mistral, or Ollama
- Replace Postgres Chat Memory with Simple Memory or any other
- Use a webhook instead of a form to automate ingestion
- Connect to a Telegram/Slack UI with a few extra nodes

💡 Use Cases
- Company knowledge base bot (internal docs, SOPs)
- Educational assistant with smart filtering (by topic or level)
- Legal or policy assistant that cites source sections
- Product documentation Q&A with multi-language support
- Training material assistant that highlights risks/examples
- Content generation

🧠 Who It's For
- Indie developers building smart chatbots
- AI consultants prototyping Q&A assistants
- Teams looking for an internal knowledge agent
- Anyone building affordable, explainable AI tools

🚀 Try It Out!
Deploy a modular RAG assistant using n8n, Supabase, and Gemini: fully customizable and almost free to run.

1.
📄 Prepare Your PDFs
- Use any internal documents, manuals, or reports in **PDF** format.
- Optional: Add a Google Drive integration to automate ingestion.

2. 🧩 Set Up Supabase
- Create a free Supabase project.
- Use the table creation queries included in the workflow to set up your schema.
- Add your **supabaseUrl** and **supabaseKey** to your n8n credentials.

> 💡 Pro Tip: Make sure you match the embedding dimensions to your model. This workflow uses Gemini text-embedding-004 (768-dim); if you switch to OpenAI, change your table's vector size to 1536.

3. 🧠 Connect Gemini & Gemma
- Use Gemini/Gemma for embeddings and optional metadata enrichment.
- Or deploy locally for lightweight async LLM processing (via Ollama/HuggingFace).

4. ⚙️ Import the Workflow in n8n
- Open n8n (self-hosted or cloud).
- Import the workflow file and paste your credentials.
- You're ready to ingest, enrich, and query your document base.

💬 Have Feedback or Ideas? I'd Love to Hear
This project is open, modular, and evolving, just like great workflows should be :). If you've tried it, built on top of it, or have suggestions for improvement, I'd genuinely love to hear from you. Let's share ideas, collaborate, or just connect as part of the n8n builder community.
📧 ascuncia.es@gmail.com
🔗 LinkedIn
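To make the async enrichment pipeline above more concrete, here is a hypothetical sketch of the record each scheduled run might produce: a chunk with its embedding plus a metadata object. The field names follow the description (topics, use_case, risks, audience level, summary); the template's actual schema and the `enriched` flag are assumptions.

```javascript
// Hypothetical sketch: attach LLM-generated metadata to a stored chunk.
function enrichChunk(chunk, llmOutput) {
  return {
    content: chunk.content,
    embedding: chunk.embedding, // 768-dim for Gemini text-embedding-004
    metadata: {
      topics: llmOutput.topics,
      use_case: llmOutput.use_case,
      risks: llmOutput.risks,
      audience_level: llmOutput.audience_level,
      summary: llmOutput.summary,
      enriched: true, // assumed flag so the scheduled task skips processed chunks
    },
  };
}

const enriched = enrichChunk(
  { content: 'Refunds are issued within 14 days.', embedding: new Array(768).fill(0) },
  {
    topics: ['refunds'],
    use_case: 'customer support',
    risks: [],
    audience_level: 'beginner',
    summary: 'Refund policy.',
  }
);
console.log(enriched.metadata.topics);
```

At query time, the Query Builder can then filter on fields like `metadata.topics` or `metadata.audience_level` before reranking.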
by Ian Kerins
Overview
This n8n template automates the process of scraping job listings from Indeed, parsing the data into a structured format, and saving it to Google Sheets for easy tracking. It also includes a Slack notification system to alert you when new jobs are found. Built with ScrapeOps, it handles the complexities of web scraping, such as proxy rotation, anti-bot bypassing, and HTML parsing, so you can focus on the data.

Who is this for?
- **Job Seekers**: Automate your daily job search and get instant alerts for new postings.
- **Recruiters & HR Agencies**: Track hiring trends and find new leads for candidate placement.
- **Sales & Marketing Teams**: Monitor companies that are hiring to identify growth signals and lead opportunities.
- **Data Analysts**: Gather labor market data for research and competitive analysis.

What problems it solves
- **Manual Searching**: Eliminates the need to manually refresh Indeed and copy-paste job details.
- **Data Structure**: Converts messy HTML into clean, organized rows in a spreadsheet.
- **Blocking & Captchas**: Uses ScrapeOps residential proxies to bypass Indeed's anti-bot protections reliably.
- **Missed Opportunities**: Automated scheduling ensures you are the first to know about new listings.

How it works
1. Trigger: The workflow runs on a schedule (default: every 6 hours).
2. Configuration: You define your search query (e.g., "Software Engineer") in the Set Search URL node.
3. Scraping: The ScrapeOps Proxy API fetches the Indeed search results page using a residential proxy to avoid detection.
4. Parsing: The ScrapeOps Parser API takes the raw HTML and extracts key details like Job Title, Company, Location, Salary, and URL.
5. Filtering: A code node filters out invalid results and structures the data.
6. Storage: Valid jobs are appended to a Google Sheet.
7. Notification: A message is sent to Slack confirming the update.

Setup steps (~10-15 minutes)
1. ScrapeOps Account: Register for a free ScrapeOps API key.
   In n8n, open the ScrapeOps nodes and create a new credential with your API key.
2. Google Sheets:
   - Duplicate this Google Sheet Template.
   - Open the Save to Google Sheets node.
   - Connect your Google account and select your duplicated sheet.
3. Slack Setup:
   - Open the Send a message node.
   - Connect your Slack account and select the channel where you want to receive alerts.
4. Customize Search:
   - Open the Set Search URL node.
   - Update the search_query value to the job title or keyword you want to track.

Pre-conditions
- An active ScrapeOps account (free tier available).
- A Google Cloud account with the Google Sheets API enabled (for the n8n connection).
- A Slack workspace for notifications.

Disclaimer
This template uses ScrapeOps as a community node. You are responsible for complying with Indeed's Terms of Use, robots directives, and applicable laws in your jurisdiction. Scraping targets may change at any time; adjust render/scroll/wait settings and parsers as needed. Use responsibly for legitimate business purposes.

Resources
- ScrapeOps n8n Overview
- ScrapeOps Proxy API Documentation
- ScrapeOps Parser API Documentation
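The filtering step in "How it works" (a code node that drops invalid results and structures the data) can be sketched roughly like this. The field names and fallback values are illustrative assumptions, not the template's actual parser output.

```javascript
// Hypothetical sketch of the filtering code node: drop parsed rows
// missing a title or URL, and map the rest to the sheet's columns.
function filterJobs(parsedResults) {
  return parsedResults
    .filter((job) => job.title && job.url) // discard incomplete listings
    .map((job) => ({
      jobTitle: job.title.trim(),
      company: job.company || 'Unknown',
      location: job.location || '',
      salary: job.salary || 'Not listed', // Indeed often omits salary
      url: job.url,
    }));
}

const rows = filterJobs([
  { title: 'Software Engineer', company: 'Acme', location: 'Remote', url: 'https://indeed.com/job/1' },
  { title: '', url: '' }, // invalid -> dropped
]);
console.log(rows.length);
```

Each object in the returned array then maps directly onto one appended row in the Google Sheet.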
by Harvex AI
AI Lead Enrichment & Notification System

This n8n template automates the lead enrichment process for your business. Once a lead fills out a form, the workflow scrapes their website, provides a summary of their business, and logs everything into a CRM before notifying your team on Slack.

Some use cases: "Speed-to-Lead" optimization, lead enrichment, automated prospect research.

How it works
1. Ingestion: A lead submits their details (Name, Email, Website) via a form.
2. Intelligent scraping: The workflow scrapes the provided URL.
3. AI Analysis: OpenAI's model (GPT-4o) analyzes the extracted data and determines whether there is enough info or if the workflow needs to scrape the "About Us" page.
4. CRM Sync: The CRM (Airtable) is updated with the enriched data.
5. Notification: An instant Slack notification is sent to the team channel.

How to use the workflow
1. Configure the form: Open the trigger form and input the required fields.
2. Set up OpenAI: Ensure that your credentials are connected.
3. Database mapping: Ensure your Airtable base has the following columns: Name, Website, AI Insight, Email, and Date.
4. Slack setup: Specify the desired Slack channel for notifications.
5. Test it out! Open the form, enter sample data (with a real website), and watch the system enrich the lead for you.

Requirements
- **OpenAI API Key** (for analyzing website content and generating summaries)
- **Airtable** (for CRM and logging)
- **Slack** (for team notifications)
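The "enough info, or scrape the About Us page?" branch in the AI Analysis step can be approximated with a simple length check like the one below. This is a hypothetical sketch: the character threshold and the `/about` URL pattern are assumptions, and in the template this decision is made by the AI model rather than a heuristic.

```javascript
// Hypothetical sketch of the fallback-scrape decision: if the homepage
// text is too thin to summarize, return an "About Us" URL to scrape next.
function nextScrapeTarget(website, homepageText, minChars = 500) {
  if (homepageText && homepageText.length >= minChars) {
    return null; // enough content: proceed to AI analysis
  }
  // assumed About-page path; real sites vary (/about, /about-us, ...)
  return new URL('/about', website).toString();
}

console.log(nextScrapeTarget('https://example.com', 'x'.repeat(600)));
console.log(nextScrapeTarget('https://example.com', 'thin page'));
```

A `null` result means the workflow can go straight to summarization; otherwise the returned URL feeds a second scraping pass.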
by Madame AI
Generate audio documentaries from web articles to Telegram with ElevenLabs & BrowserAct

This workflow transforms any web article or blog post into a high-production-value audio documentary. It automates the entire production chain, from scraping content and writing an engaging narrative script to generating realistic voiceovers, and delivers a listenable MP3 file directly to your Telegram chat.

Target Audience
Commuters, podcast enthusiasts, content creators, and researchers who prefer listening to content over reading.

How it works
1. Analyze Intent: The workflow receives a message via Telegram. An AI Agent (using Google Gemini) classifies the input to determine whether it is casual chat or a request to process a URL.
2. Scrape Content: If a valid link is detected, BrowserAct executes a background task to visit the webpage and extract the raw text.
3. Write Script: A Scriptwriter Agent (using Claude via OpenRouter) converts the dry article text into a dramatic, narrative-driven script optimized for audio, including cues for pacing and tone.
4. Generate Audio: ElevenLabs synthesizes the script into high-fidelity speech using a specific voice model (e.g., "Liam").
5. Deliver Output: The workflow sends the generated MP3 file and a formatted HTML summary caption back to the user on Telegram.

How to set up
1. Configure Credentials: Connect your Telegram, ElevenLabs, OpenRouter, Google Gemini, and BrowserAct accounts in n8n.
2. Prepare BrowserAct: Ensure the AI Summarization & Eleven Labs Podcast Generation template is saved in your BrowserAct account.
3. Select Voice: Open the Convert text to speech node and select your preferred ElevenLabs voice model.
4. Configure Model: Open the OpenRouter node to confirm the model selection (e.g., Claude Haiku) or switch to a different LLM for scriptwriting.
5. Activate: Turn on the workflow and send a link to your Telegram bot to test it.

Requirements
- **BrowserAct** account with the **AI Summarization & Eleven Labs Podcast Generation** template.
- **ElevenLabs** account.
- **OpenRouter** account (or access to an LLM like Claude).
- **Google Gemini** account.
- **Telegram** account (Bot Token).

How to customize the workflow
- Change the Persona: Modify the system prompt in the Scriptwriter node to change the narrative style (e.g., from "Documentary Host" to "Comedian" or "News Anchor").
- Switch Output Channel: Replace the Telegram output node with a Google Drive or Dropbox node to archive the generated audio files for a podcast feed.
- Multi-Voice Support: Add logic to split the script into multiple parts and use different ElevenLabs voices to simulate a conversation between two hosts.

Need Help?
- How to Find Your BrowserAct API Key & Workflow ID
- How to Connect n8n to BrowserAct
- How to Use & Customize BrowserAct Templates
- Workflow Guidance and Showcase Video
- How to Build an AI Podcast Generator: n8n, BrowserAct & Eleven Labs
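As a rough illustration of the Analyze Intent step, the chat-vs-URL decision the AI Agent makes can be approximated with a simple pattern check. This is a hypothetical sketch, not the template's actual logic: in the workflow a Gemini-backed agent does the classification, which also handles links mentioned in free-form text more robustly.

```javascript
// Hypothetical sketch: detect whether a Telegram message contains a URL
// to process, or is just casual chat.
function classifyMessage(text) {
  const match = text.match(/https?:\/\/\S+/); // first http(s) link, if any
  return match
    ? { intent: 'process_url', url: match[0] }
    : { intent: 'chat', url: null };
}

console.log(classifyMessage('Please narrate https://example.com/article'));
console.log(classifyMessage('hello there').intent);
```

On a `process_url` result the workflow hands the link to BrowserAct for scraping; on `chat` it simply replies conversationally.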