Extract Clean Web Content with Anti-Bot Fallback for AI Agents & Workflows
This workflow contains community nodes that are only compatible with the self-hosted version of n8n.
Clean Web Content Extraction with Anti-Bot Fallback
Extract clean, structured text from any webpage, with an optional fallback to an anti-bot scraping service. Ideal for AI tools and content workflows.
How it Works
This sub-workflow reliably scrapes clean content from any public webpage: you only need to pass it a url parameter. It is designed to be embedded in other workflows or used as a tool for AI agents.
It supports two output modes:
fulltext: true → returns { title, text } with the full page content
fulltext: false → returns { title, url, content } with a short excerpt
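The two output modes can be pictured with a small sketch. This is only an illustration of the shapes described above, not the workflow's actual implementation: the function name shape_output and the excerpt_chars cutoff are hypothetical, and the real excerpt rule used by the community node may differ.

```python
from typing import TypedDict, Union


class FullTextResult(TypedDict):
    title: str
    text: str


class ExcerptResult(TypedDict):
    title: str
    url: str
    content: str


def shape_output(title: str, url: str, text: str, fulltext: bool,
                 excerpt_chars: int = 200) -> Union[FullTextResult, ExcerptResult]:
    """Return the result shape that matches the requested mode."""
    if fulltext:
        # fulltext: true -> { title, text }
        return {"title": title, "text": text}
    # fulltext: false -> { title, url, content }
    # Assumption: the excerpt is simply the first `excerpt_chars` characters.
    return {"title": title, "url": url, "content": text[:excerpt_chars].rstrip()}
```

A caller that only needs a preview (for example, RSS enrichment) would pass fulltext as false, while a summarization step would ask for the full text.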
If the site is protected by anti-bot systems (like Cloudflare), the workflow automatically falls back to Scrape.do, a scraping API with a generous free plan.
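The fallback behaviour can be sketched as "try the direct request first, and call the scraping API only if that fails". The sketch below assumes two hypothetical fetcher callables that each return a (status_code, body) pair, and it treats any non-2xx status, empty body, or network error as a failure. The workflow's real failure conditions are not documented here and may be different.

```python
from typing import Callable, Tuple

# A fetcher takes a URL and returns (status_code, body).
# Both fetchers are hypothetical stand-ins for the workflow's HTTP nodes.
Fetcher = Callable[[str], Tuple[int, str]]


def fetch_with_fallback(url: str, direct: Fetcher, fallback: Fetcher) -> Tuple[str, str]:
    """Try the direct fetcher first; use the fallback only if it fails.

    Returns (source, body), where source is "direct" or "fallback".
    """
    try:
        status, body = direct(url)
        if 200 <= status < 300 and body.strip():
            return "direct", body
    except OSError:
        # Network-level failure: treat it like a blocked request.
        pass
    # The fallback is only reached when the direct attempt did not succeed.
    status, body = fallback(url)
    if not 200 <= status < 300:
        raise RuntimeError(f"fallback fetch failed with status {status}")
    return "fallback", body
```

Because the fallback fetcher is never invoked on a successful direct request, a paid or rate-limited service behind it is only consumed when needed.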
This template requires the n8n-nodes-webpage-content-extractor community node, so it only works in self-hosted n8n environments.
Use Cases
As a reusable sub-workflow, via the Execute Sub-workflow node.
As a tool for an AI Agent, compatible with the Call n8n Workflow Tool.
Perfect for chatbots, summarization workflows, or RSS/feed enrichment. Empowers your AI Agent with the ability to browse and extract readable content from websites automatically.
Parameters
url (string): the webpage URL to scrape
fulltext (boolean): set true for full page content, false for a short excerpt
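A parent workflow or agent might validate these two parameters before calling the sub-workflow. The helper below is a hypothetical sketch (build_input is not part of the template); it only checks that url is an absolute http(s) URL and that fulltext is a real boolean.

```python
from urllib.parse import urlparse


def build_input(url: str, fulltext: bool) -> dict:
    """Validate the two parameters and return the item to pass to the sub-workflow."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    if not isinstance(fulltext, bool):
        raise TypeError("fulltext must be a boolean")
    return {"url": url, "fulltext": fulltext}
```

Rejecting bad input early avoids spending a request (or a fallback API credit) on a URL that could never be fetched.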
Setup
1. Install the community node n8n-nodes-webpage-content-extractor in your self-hosted n8n instance.
2. Create a free account at Scrape.do and obtain your API token.
3. In the workflow, locate the Scrape.do HTTP Request node and configure its credentials with your API token.
Detailed step-by-step instructions are available in the workflow notes.
The Scrape.do API is only used as a fallback when conventional scraping fails, helping you preserve your API credits.
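For orientation, the request that the Scrape.do HTTP Request node sends could look roughly like the URL built below. This is a sketch under assumptions: the endpoint https://api.scrape.do/ and the token and url query parameters reflect my understanding of the Scrape.do API and should be confirmed against its documentation, and scrape_do_url is a hypothetical helper, not something the template provides.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the Scrape.do documentation.
SCRAPE_DO_ENDPOINT = "https://api.scrape.do/"


def scrape_do_url(target_url: str, token: str) -> str:
    """Build the fallback request URL, percent-encoding the target URL."""
    # Assumed parameter names: `token` for the API token, `url` for the target page.
    query = urlencode({"token": token, "url": target_url})
    return f"{SCRAPE_DO_ENDPOINT}?{query}"
```

Encoding the target URL matters because any query string it carries would otherwise be read as parameters of the scraping API request itself.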