Extract Clean Web Content with Anti-Bot Fallback for AI Agents & Workflows
This workflow contains community nodes that are only compatible with the self-hosted version of n8n.
Clean Web Content Extraction with Anti-Bot Fallback Extract clean and structured text from any webpage with optional fallback to an anti-bot scraping service. Ideal for AI tools and content workflows.
š§ How it Works This sub-workflow enables reliable and clean scraping of any public webpage by simply passing a url parameter. It is designed to be embedded into other workflows or used as a tool for AI agents.
It supports two output modes: fulltext:* true ā returns { title, text } with full page content fulltext: false ā returns *{ title, url, content } with a short excerpt
š” If the site is protected by anti-bot systems (like Cloudflare), it will automatically fallback to Scrape.do, a scraping API with a generous free plan.
š§© This template requires the n8n-nodes-webpage-content-extractor community node, so it only works in self-hosted n8n environments.
š Use Cases As a reusable sub-workflow, via Execute Sub-workflow node. As a tool for an AI Agent, compatible with Call n8n Workflow Tool.
Perfect for chatbots, summarization workflows, or RSS/feed enrichment. Empowers your AI Agent with the ability to browse and extract readable content from websites automatically.
š Parameters url (string): the webpage URL to scrape fulltext (boolean): set true for full page content, false for summarized output
āļø Setup Install the community node n8n-nodes-webpage-content-extractor in your self-hosted n8n instance. Create a free account at Scrape.do and obtain your API Token. In the workflow, locate the Scrape.do HTTP Request node and configure the credentials using your API Token. Detailed step-by-step instructions are available in the workflow notes.
The Scrape.do API is only used as a fallback when conventional scraping fails, helping you preserve your API credits.
Related Templates
Extract Title tag and Meta description from url for SEO analysis with Airtable
Extract Title tag and meta description from url for SEO analysis. How it works The workflows takes records from Airtabl...
Restore your workflows from GitHub
This workflow restores all n8n instance workflows from GitHub backups using the n8n API node. It complements the Backup ...
Build a Restaurant Voice Assistant with VAPI and PostgreSQL for Bookings & Orders
This n8n template demonstrates how to create a comprehensive voice-powered restaurant assistant that handles table reser...
š Please log in to import templates to n8n and favorite templates
Workflow Visualization
Loading...
Preparing workflow renderer
Comments (0)
Login to post comments