ScrapingBee and Google Sheets Integration Template

Author: Sahil Sunny

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

This workflow extracts sitemap links using the ScrapingBee API. It needs only a domain name, such as www.example.com, and automatically checks robots.txt and sitemap.xml to find the links. When it finds new .xml links while scraping a sitemap, it runs recursively on them.
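The discovery step can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual node code: the function name sitemap_candidates is hypothetical, and it assumes the workflow reads "Sitemap:" lines from robots.txt and that the site is served over https.

```python
def sitemap_candidates(domain, robots_txt):
    """Return sitemap URLs listed in robots.txt, else the default /sitemap.xml.

    `robots_txt` is the file's text, or None if robots.txt was not found.
    """
    found = []
    if robots_txt:
        for line in robots_txt.splitlines():
            # Split on the first colon only, so the URL's own "https:" survives.
            key, sep, value = line.partition(":")
            if sep and key.strip().lower() == "sitemap" and value.strip():
                found.append(value.strip())
    # Assumption: fall back to https://<domain>/sitemap.xml when nothing is listed.
    return found or [f"https://{domain}/sitemap.xml"]
```

Calling sitemap_candidates("www.example.com", None) falls back to the default sitemap.xml location, which mirrors the "if not found" branch described in this template.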

How It Works

- Trigger: The workflow waits for a webhook request that contains domain=www.example.com.
- It then looks for the robots.txt file. If that is not found, it checks sitemap.xml.
- Once it finds xml links, it recursively scrapes them to extract the website links.
- For each xml file, it first checks whether the response is a binary file and whether it is a compressed xml.
- If it is a text response, it directly runs one code step that extracts normal website links and another that extracts xml links.
- If it is a binary that is not compressed, it extracts the text from the binary and then extracts the website links and xml links.
- If it is a compressed binary, it first decompresses it, then extracts the text, and then extracts the website links and xml links.
- After extracting website links, it appends those links directly to a sheet.
- After extracting xml links, it scrapes them recursively until it finds all website links.
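The branching and recursion described above could look roughly like this in Python. It is a sketch under stated assumptions rather than the code inside the workflow's nodes: to_text, split_links, and collect_links are hypothetical names, compression is assumed to be gzip (detected by its two magic bytes), links are assumed to sit in <loc> elements, and fetch is a caller-supplied function that returns the response body for a URL.

```python
import gzip
import re
from collections import deque

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream
LOC_PATTERN = re.compile(r"<loc>\s*(.*?)\s*</loc>", re.IGNORECASE | re.DOTALL)


def to_text(body):
    """Normalise a response body (str, plain bytes, or gzip bytes) to text."""
    if isinstance(body, str):
        return body
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return body.decode("utf-8", errors="replace")


def split_links(xml_text):
    """Separate <loc> URLs into page links and nested sitemap (xml) links."""
    page_links, xml_links = [], []
    for url in LOC_PATTERN.findall(xml_text):
        path = url.split("?", 1)[0].lower()
        if path.endswith((".xml", ".xml.gz")):
            xml_links.append(url)
        else:
            page_links.append(url)
    return page_links, xml_links


def collect_links(start_urls, fetch):
    """Follow nested sitemaps until only page links remain.

    Uses a queue and a `seen` set instead of true recursion, so a sitemap
    that links back to itself cannot loop forever.
    """
    seen, pages = set(), []
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        page_links, xml_links = split_links(to_text(fetch(url)))
        pages.extend(page_links)
        queue.extend(xml_links)
    return pages
```

In the workflow itself, the page links are appended to the sheet as they are found; the sketch returns them as a list only to keep the example self-contained.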

When the workflow is finished, you will see the output in the links column of the Google Sheet that we added to the workflow.

Set Up Steps

- Get your ScrapingBee API key.
- Create a new Google Sheet with an empty column named links.
- Connect to the sheet by signing in with your Google credential, and add the link to your sheet.
- Copy the webhook URL and send a GET request with domain as a query parameter. Example: curl "https://webhook_link?domain=scrapingbee.com"

Customisation Options

- If the website you are scraping blocks your requests, try using the premium or stealth proxy in the Scrape robots.txt file, Scrape sitemap.xml file, and Scrape xml file nodes.
- If you wish to store the data in a different app or tool, or store it as a file, replace the Append links to sheet node with a relevant node.

Next Steps

If you wish to scrape the pages behind the extracted links, you can implement a new workflow that reads the sheet or file generated by this workflow and, for each link, sends a request to ScrapingBee's HTML API and saves the returned data.
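For the follow-up workflow, a request to ScrapingBee's HTML API could be assembled along these lines. This is a hedged sketch: scrapingbee_url is a hypothetical helper, and the endpoint and the premium_proxy and stealth_proxy parameter names reflect our understanding of ScrapingBee's documentation, so check them against the current API reference before relying on them.

```python
from urllib.parse import urlencode

# Assumption: ScrapingBee's HTML API endpoint; confirm in the official docs.
API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def scrapingbee_url(api_key, target_url, proxy=None):
    """Build a GET URL for one extracted link.

    `proxy` may be None, "premium", or "stealth", matching the proxy
    options mentioned under Customisation Options.
    """
    params = {"api_key": api_key, "url": target_url}
    if proxy == "premium":
        params["premium_proxy"] = "true"
    elif proxy == "stealth":
        params["stealth_proxy"] = "true"
    # urlencode percent-encodes the target URL so it is safe as a query value.
    return f"{API_ENDPOINT}?{urlencode(params)}"
```

Each row read from the links column would be passed as target_url, and the resulting URL fetched with whichever HTTP client the new workflow uses.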

NOTE: Large sitemaps can crash the workflow if it consumes more memory than your n8n plan or self-hosted system provides. If this happens, we recommend either upgrading your plan or moving to a self-hosted setup with more memory.

Complexity: beginner

Category: Data Processing

Created: 9/10/2025

Updated: 2/14/2026
