Scrape Idealista 🏠 Real Estate Property Listings with ScrapeGraph AI 🕷️

This workflow automates the process of scraping real estate listings from Idealista (or similar property portals), extracting structured property data using AI, and storing the results directly in Google Sheets.

It is designed to handle paginated listing pages, collect individual property URLs, extract detailed listing information, and continuously build a structured real estate database with minimal manual effort.

Key Advantages

  1. ✅ Fully Automated Real Estate Data Collection: Automatically navigates through multiple listing pages, extracts property URLs, and retrieves detailed property information without manual browsing.

  2. ✅ AI-Powered Data Extraction: Uses ScrapeGraphAI to intelligently extract structured information such as:

     - Property title
     - Description
     - Price
     - Area (sqm)
     - Bedrooms & bathrooms
     - Floor and room count
     - Balcony, terrace, cellar
     - Heating and air conditioning
     - Property image URLs

  3. ✅ Scalable Pagination Handling: Dynamically generates paginated URLs, allowing the workflow to scrape hundreds or thousands of listings efficiently.

  4. ✅ Google Sheets Integration: Automatically writes and updates extracted property data into Google Sheets, creating a centralized and continuously updated real estate database.

  5. ✅ Duplicate Prevention: Uses the property URL as a unique identifier to append or update listings without creating duplicates.

  6. ✅ Highly Customizable: The workflow can be adapted to:

     - Different cities or search filters
     - Other real estate websites
     - Different extraction schemas
     - Alternative storage systems (CRM, database, Airtable, etc.)

  7. ✅ Structured JSON Schema Extraction: Ensures consistent and reliable data formatting, making the output ready for:

     - Market analysis
     - Lead generation
     - CRM enrichment
     - Investment scouting
     - Real estate dashboards

  8. ✅ Low-Code & Modular Architecture: Built entirely inside n8n with reusable modules, making maintenance and future upgrades simple.

Ideal Use Cases

- Real estate lead generation
- Property market monitoring
- Investment opportunity analysis
- Building property databases
- Real estate CRM automation
- Competitor and pricing analysis
- Automated property aggregation platforms

How it works

This workflow automates the extraction of real estate listings from Idealista by performing two main phases: listing URL discovery and detailed data extraction.

Trigger and Pagination Setup
A Manual Trigger starts the workflow. A Set node defines the base search URL and the maximum number of pages to scrape. A Code node then generates the paginated URLs (e.g., .../lista-1.htm, .../lista-2.htm).
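The URL-generation logic of that Code node can be sketched as a plain function (the function name and the `3`-page example are illustrative; the actual node reads `url` and `max_pages` from the Set node and returns n8n items):

```javascript
// Sketch of the pagination logic inside the Code node (illustrative, not the
// template's exact code). Idealista paginates search results as .../lista-<n>.htm
function buildPageUrls(baseUrl, maxPages) {
  const urls = [];
  for (let page = 1; page <= maxPages; page++) {
    urls.push(`${baseUrl}lista-${page}.htm`);
  }
  return urls;
}

// Example: buildPageUrls('https://www.idealista.it/vendita-case/milano/', 3)
// produces the three paginated search URLs for the next node to process.
```

Inside n8n, each URL would be wrapped as `{ json: { url } }` so the downstream Split In Batches node can iterate over the items.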

Extract Listing URLs from Search Pages
The generated URLs are split into batches using a Split In Batches node. For each search page, a ScrapegraphAI node extracts all individual property URLs that match the pattern https://www.idealista.it/immobile/xxxx. The results are then aggregated and unified using an Aggregate and a Code node to remove duplicates and flatten the list.
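The deduplicate-and-flatten step can be sketched as follows (the function name is illustrative; in the workflow this logic lives in the Code node that runs after the Aggregate node):

```javascript
// Sketch of the unify step: flatten the per-page URL arrays and drop duplicates,
// preserving first-seen order (illustrative, not the template's exact code).
function unifyPropertyUrls(batches) {
  const seen = new Set();
  const unified = [];
  for (const urls of batches) {
    for (const url of urls) {
      if (!seen.has(url)) {
        seen.add(url);
        unified.push(url);
      }
    }
  }
  return unified;
}
```

Deduplicating here matters because the same property can appear on several paginated search pages (e.g. featured listings), and scraping it twice wastes API credits.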

Process Each Property URL
The unified list of property URLs is split again into batches. For each property URL, a second ScrapegraphAI node extracts detailed information following a strict JSON schema (including title, description, price, area, bedrooms, bathrooms, floor, rooms, balcony, terrace, cellar, heating, air conditioning, and image URLs).
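A schema along these lines could back the extraction node; the exact property names and types in the template may differ, so treat this as an assumed illustration:

```javascript
// Illustrative JSON schema for the "Extract data" node (field names assumed,
// not copied from the template).
const outputSchema = {
  type: 'object',
  properties: {
    title:            { type: 'string' },
    description:      { type: 'string' },
    price:            { type: 'number' },
    area_sqm:         { type: 'number' },
    bedrooms:         { type: 'integer' },
    bathrooms:        { type: 'integer' },
    floor:            { type: 'string' },
    rooms:            { type: 'integer' },
    balcony:          { type: 'boolean' },
    terrace:          { type: 'boolean' },
    cellar:           { type: 'boolean' },
    heating:          { type: 'string' },
    air_conditioning: { type: 'boolean' },
    image_urls:       { type: 'array', items: { type: 'string' } },
    url:              { type: 'string' }
  },
  required: ['title', 'price', 'url']
};
```

Keeping `url` in the schema is what later lets the Google Sheets node use it as the matching column for duplicate prevention.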

Store Data in Google Sheets
The extracted data is finally written to a Google Sheet using the Google Sheets node configured with appendOrUpdate mode, which avoids duplicates by matching the URL column.
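Conceptually, appendOrUpdate behaves like an upsert keyed on the match column; a minimal sketch of that behavior (not the node's actual implementation) looks like this:

```javascript
// Conceptual sketch of appendOrUpdate: update the row whose key matches,
// otherwise append a new row (illustrative, not the Google Sheets node's code).
function appendOrUpdate(rows, incoming, key = 'url') {
  const index = new Map(rows.map((row, i) => [row[key], i]));
  for (const item of incoming) {
    if (index.has(item[key])) {
      rows[index.get(item[key])] = item; // existing URL: update in place
    } else {
      index.set(item[key], rows.length);
      rows.push(item);                   // new URL: append
    }
  }
  return rows;
}
```

Re-running the workflow therefore refreshes prices on listings already in the sheet instead of duplicating them.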

Set up steps

Import and Configure Credentials
Import the workflow into n8n, then add the following credentials:

- ScrapegraphAI API (used by both ScrapegraphAI nodes)
- Google Sheets OAuth2 (used for writing data)

Prepare the Google Sheet
Clone this template sheet or create your own. Update the Google Sheets node with your Document ID and Sheet Name.

Configure the Search Parameters
In the Set params node, modify the url variable to target your desired search (location, filters, etc.) and set max_pages to control how many search result pages to scrape.

Adjust Extraction Logic (if needed)
Verify that the Scrape listings node’s prompt correctly matches the listing URL structure of your target site. Update the Extract data node’s outputSchema (JSON schema) to match the fields you want to extract.

Enable and Execute
Activate the workflow. Click the Execute Workflow button to start scraping. The results will automatically populate the configured Google Sheet, appending new listing data without creating duplicates.

👉 Subscribe to my new YouTube channel. Here I’ll share videos and Shorts with practical tutorials and FREE templates for n8n.

Need help customizing?
Contact me for consulting and support, or add me on LinkedIn.

Author: Davide · Created: 5/8/2026 · Updated: 5/8/2026


