Extract Clean Web Content with Anti-Bot Fallback for AI Agents & Workflows
This workflow contains community nodes that are only compatible with the self-hosted version of n8n.
Clean Web Content Extraction with Anti-Bot Fallback Extract clean and structured text from any webpage with optional fallback to an anti-bot scraping service. Ideal for AI tools and content workflows.
š§ How it Works This sub-workflow enables reliable and clean scraping of any public webpage by simply passing a url parameter. It is designed to be embedded into other workflows or used as a tool for AI agents.
It supports two output modes: fulltext:* true ā returns { title, text } with full page content fulltext: false ā returns *{ title, url, content } with a short excerpt
š” If the site is protected by anti-bot systems (like Cloudflare), it will automatically fallback to Scrape.do, a scraping API with a generous free plan.
š§© This template requires the n8n-nodes-webpage-content-extractor community node, so it only works in self-hosted n8n environments.
š Use Cases As a reusable sub-workflow, via Execute Sub-workflow node. As a tool for an AI Agent, compatible with Call n8n Workflow Tool.
Perfect for chatbots, summarization workflows, or RSS/feed enrichment. Empowers your AI Agent with the ability to browse and extract readable content from websites automatically.
š Parameters url (string): the webpage URL to scrape fulltext (boolean): set true for full page content, false for summarized output
āļø Setup Install the community node n8n-nodes-webpage-content-extractor in your self-hosted n8n instance. Create a free account at Scrape.do and obtain your API Token. In the workflow, locate the Scrape.do HTTP Request node and configure the credentials using your API Token. Detailed step-by-step instructions are available in the workflow notes.
The Scrape.do API is only used as a fallback when conventional scraping fails, helping you preserve your API credits.
Related Templates
USDT And TRC20 Wallet Tracker API Workflow for n8n
Overview This n8n workflow is specifically designed to monitor USDT TRC20 transactions within a specified wallet. It u...
Automate Daily Keyword Research with Google Sheets, Suggest API & Custom Search
Who's it for This workflow is perfect for SEO specialists, marketers, bloggers, and content creators who want to automa...
Bulk Automated Google Drive Files Sharing and Direct Download Link Generation
This N8N workflow automates the process of sharing files from Google Drive. It includes OAuth2 authentication, batch pro...
š Please log in to import templates to n8n and favorite templates
Workflow Visualization
Loading...
Preparing workflow renderer
Comments (0)
Login to post comments