Job Post to Sales Lead Pipeline with Scrape.do, Apollo.io & OpenAI
Lead Sourcing by Job Posts for Outreach with Scrape.do API, OpenAI & Google Sheets
Overview
This n8n workflow automates the complete lead generation process by scraping job postings from Indeed, enriching company data via Apollo.io, identifying decision-makers, and generating personalized LinkedIn outreach messages using OpenAI. It integrates with Scrape.do for reliable web scraping, Apollo.io for B2B data enrichment, OpenAI for AI-powered personalization, and Google Sheets for centralized data storage.
Perfect for: Sales teams, recruiters, business development professionals, and marketing agencies looking to automate their outbound prospecting pipeline.
Workflow Components
- Schedule Trigger

| Property | Value |
|----------|-------|
| Type | Schedule Trigger |
| Purpose | Automatically initiates workflow on a recurring schedule |
| Frequency | Weekly (Every Monday) |
| Time | 00:00 UTC |

Function: Ensures consistent, hands-off lead generation by running the pipeline automatically without manual intervention.
- Scrape.do Indeed API

| Property | Value |
|----------|-------|
| Type | HTTP Request (GET) |
| Purpose | Scrapes job listings from Indeed via Scrape.do proxy API |
| Endpoint | https://api.scrape.do |
| Output Format | Markdown |

Request Parameters:

| Parameter | Value | Description |
|-----------|-------|-------------|
| token | API Token | Scrape.do authentication |
| url | Indeed Search URL | Target job search page |
| super | true | Uses residential proxies |
| geoCode | us | US-based content |
| render | true | JavaScript rendering enabled |
| device | mobile | Mobile viewport for cleaner HTML |
| output | markdown | Lightweight text output |

Function: Fetches Indeed job listings with anti-bot bypass, returning clean markdown for easy parsing.
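
As a rough illustration, the request described above could be assembled like this in JavaScript. The helper name `buildScrapeDoUrl`, the `YOUR_TOKEN` placeholder, and the example Indeed URL are assumptions made for this sketch; only the parameter names and values come from the table.

```javascript
// Builds the Scrape.do request URL from the parameters listed above.
// buildScrapeDoUrl and YOUR_TOKEN are illustrative, not part of the template.
function buildScrapeDoUrl(token, targetUrl) {
  const params = new URLSearchParams({
    token,            // Scrape.do API token
    url: targetUrl,   // Indeed search page; URLSearchParams encodes it for us
    super: "true",    // residential proxies
    geoCode: "us",    // US-based content
    render: "true",   // JavaScript rendering
    device: "mobile", // mobile viewport
    output: "markdown",
  });
  return `https://api.scrape.do/?${params.toString()}`;
}

const exampleUrl = buildScrapeDoUrl(
  "YOUR_TOKEN",
  "https://www.indeed.com/jobs?q=data+engineer&l=Remote&fromage=7"
);
console.log(exampleUrl);
```

Encoding the target URL matters here: without it, the `&l=` and `&fromage=` parts of the Indeed URL would be read as Scrape.do parameters instead of being forwarded to Indeed.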
- Parse Indeed Jobs

| Property | Value |
|----------|-------|
| Type | Code Node (JavaScript) |
| Purpose | Extracts structured job data from markdown |
| Mode | Run once for all items |

Extracted Fields:

| Field | Description | Example |
|-------|-------------|---------|
| jobTitle | Position title | "Senior Data Engineer" |
| jobUrl | Indeed job link | "https://indeed.com/viewjob?jk=abc123" |
| jobId | Indeed job identifier | "abc123" |
| companyName | Hiring company | "Acme Corporation" |
| location | City, State | "San Francisco, CA" |
| salary | Pay range | "$120,000 - $150,000" |
| jobType | Employment type | "Full-time" |
| source | Data source | "Indeed" |
| dateFound | Scrape date | "2025-01-15" |

Function: Parses markdown using regex patterns, filters invalid entries, and deduplicates by company name.
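
The template's actual parsing code is not shown here, so the following is only a sketch of the two ideas named above: regex extraction and deduplication by company. It assumes, hypothetically, that job titles appear in the markdown as links to a `viewjob?jk=<id>` URL; the real Scrape.do output for Indeed may look different, and both function names are made up for the example.

```javascript
// Hypothetical markdown shape: [Job Title](https://.../viewjob?jk=<id>).
// Treat this regex as a starting point, not as the template's pattern.
const JOB_LINK = /\[([^\]]+)\]\((https?:\/\/[^\s)]*[?&]jk=([A-Za-z0-9]+)[^\s)]*)\)/g;

// Returns one { jobTitle, jobUrl, jobId } object per job link found.
function extractJobLinks(markdown) {
  const jobs = [];
  for (const match of markdown.matchAll(JOB_LINK)) {
    jobs.push({ jobTitle: match[1].trim(), jobUrl: match[2], jobId: match[3] });
  }
  return jobs;
}

// Keeps the first posting per company (case-insensitive) and drops
// entries that have no usable company name.
function dedupeByCompany(jobs) {
  const seen = new Set();
  return jobs.filter((job) => {
    const key = (job.companyName || "").trim().toLowerCase();
    if (!key || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

Dropping entries with an empty company name at this stage also avoids sending blank names to the Apollo search later in the pipeline.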
- Add New Company (Google Sheets)

| Property | Value |
|----------|-------|
| Type | Google Sheets Node |
| Purpose | Stores parsed job postings for tracking |
| Operation | Append rows |
| Target Sheet | "Add New Company" |

Function: Creates a historical record of all discovered job postings and companies for pipeline tracking.
- Apollo Organization Search

| Property | Value |
|----------|-------|
| Type | HTTP Request (POST) |
| Purpose | Enriches company data via Apollo.io API |
| Endpoint | https://api.apollo.io/v1/organizations/search |
| Authentication | HTTP Header Auth (x-api-key) |

Request Body: { "q_organization_name": "Company Name", "page": 1, "per_page": 1 }

Response Fields:

| Field | Description |
|-------|-------------|
| id | Apollo organization ID |
| name | Official company name |
| website_url | Company website |
| linkedin_url | LinkedIn company page |
| industry | Business sector |
| estimated_num_employees | Company size |
| founded_year | Year established |
| city, state, country | Location details |
| short_description | Company overview |

Function: Retrieves comprehensive company intelligence including LinkedIn profiles, industry classification, and employee count.
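
For readers calling the endpoint outside n8n, a minimal sketch of the request could look like the following. The helper `buildOrgSearchRequest` and the shape of its return value are assumptions for illustration; the endpoint, header name, and body fields are the ones listed above.

```javascript
// Builds the request options for one company lookup.
// Returns null for empty names so the caller can skip the API call.
function buildOrgSearchRequest(companyName, apiKey) {
  const name = (companyName || "").trim();
  if (!name) return null;
  return {
    method: "POST",
    url: "https://api.apollo.io/v1/organizations/search",
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    // per_page: 1 keeps only the best match, as in the request body above
    body: JSON.stringify({ q_organization_name: name, page: 1, per_page: 1 }),
  };
}
```

Serializing the body with `JSON.stringify` rather than concatenating strings keeps company names that contain quotes or other special characters from producing an invalid JSON payload.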
- Extract Apollo Org Data

| Property | Value |
|----------|-------|
| Type | Code Node (JavaScript) |
| Purpose | Parses Apollo response and merges with original data |
| Mode | Run once for each item |

Function: Extracts relevant fields from Apollo API response and combines with job posting data for downstream processing.
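
The merge step might look roughly like this. It is a sketch under two assumptions: that the search response lists matches in an `organizations` array, and that output names such as `apolloOrgId` and `apolloMatched` (invented for this example) suit the downstream nodes.

```javascript
// Assumption: matches are returned under `organizations`.
// Output field names are illustrative, not taken from the template.
function extractOrgData(apolloResponse, job) {
  const org = (apolloResponse?.organizations ?? [])[0];
  if (!org) {
    // No match: keep the job data and flag the item so it can be filtered out.
    return { ...job, apolloMatched: false, apolloOrgId: null };
  }
  return {
    ...job,
    apolloMatched: true,
    apolloOrgId: org.id,
    companyWebsite: org.website_url ?? "",
    companyLinkedin: org.linkedin_url ?? "",
    industry: org.industry ?? "",
    employeeCount: org.estimated_num_employees ?? null,
    country: org.country ?? "",
  };
}
```

Returning a flagged item instead of throwing lets a later filter step skip companies Apollo could not match without failing the whole run.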
- Apollo People Search

| Property | Value |
|----------|-------|
| Type | HTTP Request (POST) |
| Purpose | Finds decision-makers at target companies |
| Endpoint | https://api.apollo.io/v1/mixed_people/search |
| Authentication | HTTP Header Auth (x-api-key) |

Request Body: { "organization_ids": ["apollo_org_id"], "person_titles": [ "CTO", "Chief Technology Officer", "VP Engineering", "Head of Engineering", "Engineering Manager", "Technical Director", "CEO", "Founder" ], "page": 1, "per_page": 3 }

Response Fields:

| Field | Description |
|-------|-------------|
| first_name | Contact first name |
| last_name | Contact last name |
| title | Job title |
| email | Email address |
| linkedin_url | LinkedIn profile URL |
| phone_number | Direct phone |

Function: Identifies key stakeholders and decision-makers based on configurable title filters.
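
Because the title filter is the main knob in this step, it may help to see it as a parameter. The sketch below rebuilds the request body shown above; `buildPeopleSearchBody`, `DEFAULT_TITLES`, and the sales titles in the usage line are hypothetical names and values chosen for the example.

```javascript
// Default titles copied from the request body above.
const DEFAULT_TITLES = [
  "CTO", "Chief Technology Officer", "VP Engineering", "Head of Engineering",
  "Engineering Manager", "Technical Director", "CEO", "Founder",
];

// Builds the people-search body for one Apollo organization ID.
function buildPeopleSearchBody(orgId, titles = DEFAULT_TITLES, perPage = 3) {
  return {
    organization_ids: [orgId],
    person_titles: [...titles],
    page: 1,
    per_page: perPage, // at most this many contacts per company
  };
}

// Example: target sales leadership instead of engineering leadership.
const salesBody = buildPeopleSearchBody("apollo_org_id", ["VP Sales", "Head of Sales"]);
console.log(JSON.stringify(salesBody));
```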
- Format Leads

| Property | Value |
|----------|-------|
| Type | Code Node (JavaScript) |
| Purpose | Structures lead data for outreach |
| Mode | Run once for all items |

Function: Combines person data with company context, creating comprehensive lead profiles ready for personalization.
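
A lead profile of this kind could be assembled as follows. This is a sketch, not the template's code: the function name `formatLead` and the exact output keys are assumptions, chosen so that `fullName`, `title`, `companyName`, `industry`, and `jobTitle` are available to the personalization step.

```javascript
// Combines one Apollo person record with its company/job context.
// Input uses Apollo's snake_case person fields; output keys are illustrative.
function formatLead(person, company) {
  const firstName = (person.first_name || "").trim();
  const lastName = (person.last_name || "").trim();
  return {
    firstName,
    lastName,
    // Joins only the parts that exist, so a missing last name leaves no stray space.
    fullName: [firstName, lastName].filter(Boolean).join(" "),
    title: person.title || "",
    linkedinUrl: person.linkedin_url || "",
    companyName: company.companyName,
    industry: company.industry || "",
    country: company.country || "",
    jobTitle: company.jobTitle,
    source: "Indeed + Apollo",
  };
}
```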
- Generate Personalized Message (OpenAI)

| Property | Value |
|----------|-------|
| Type | OpenAI Node |
| Purpose | Creates custom LinkedIn connection messages |
| Model | gpt-4o-mini |
| Max Tokens | 150 |
| Temperature | 0.7 |

System Prompt: You are a professional outreach specialist. Write personalized LinkedIn connection request messages. Keep messages under 300 characters. Be friendly, professional, and mention a specific reason for connecting based on their role and company.

User Prompt Variables:

| Variable | Source |
|----------|--------|
| Name | $json.fullName |
| Title | $json.title |
| Company | $json.companyName |
| Industry | $json.industry |
| Job Context | $json.jobTitle |

Function: Generates unique, contextual outreach messages that reference specific hiring activity and company details.
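
To make the prompt assembly concrete, here is a sketch of an equivalent chat completions payload. The model, token limit, temperature, and system prompt are the values listed above; the wording of the user prompt and the helper name `buildChatRequest` are assumptions, since the original user prompt text is not shown.

```javascript
const SYSTEM_PROMPT =
  "You are a professional outreach specialist. Write personalized LinkedIn " +
  "connection request messages. Keep messages under 300 characters. Be friendly, " +
  "professional, and mention a specific reason for connecting based on their role and company.";

// Builds a chat completions request body for one lead.
// The user prompt layout is a guess based on the variable table.
function buildChatRequest(lead) {
  const userPrompt = [
    `Name: ${lead.fullName}`,
    `Title: ${lead.title}`,
    `Company: ${lead.companyName}`,
    `Industry: ${lead.industry}`,
    `Job Context: ${lead.jobTitle}`,
  ].join("\n");
  return {
    model: "gpt-4o-mini",
    max_tokens: 150,
    temperature: 0.7,
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: userPrompt },
    ],
  };
}
```

The 300-character limit lives only in the prompt, so the model may occasionally exceed it; checking the length after generation is a sensible safeguard.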
- Merge Lead + Message

| Property | Value |
|----------|-------|
| Type | Code Node (JavaScript) |
| Purpose | Combines lead data with generated message |
| Mode | Run once for each item |

Function: Merges OpenAI response with lead profile, creating the final enriched record.
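
A merge step along these lines might read the generated text, tidy it, and enforce the 300-character limit. The sketch assumes the raw chat completions response shape (`choices[0].message.content`); the n8n OpenAI node may expose the text under a different path, and `mergeLeadAndMessage` and `personalizedMessage` are names invented for the example.

```javascript
const MAX_MESSAGE_LENGTH = 300; // LinkedIn connection request limit cited above

// Attaches the generated message to the lead, trimming quotes and overflow.
function mergeLeadAndMessage(lead, openAiResponse) {
  const raw = openAiResponse?.choices?.[0]?.message?.content ?? "";
  // Models sometimes wrap the message in quotes; strip them and stray whitespace.
  let message = raw.trim().replace(/^"+|"+$/g, "");
  if (message.length > MAX_MESSAGE_LENGTH) {
    // Cut to 297 characters and add "..." so the result never exceeds 300.
    message = message.slice(0, MAX_MESSAGE_LENGTH - 3).trimEnd() + "...";
  }
  return { ...lead, personalizedMessage: message };
}
```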
- Save Leads to Sheet

| Property | Value |
|----------|-------|
| Type | Google Sheets Node |
| Purpose | Stores final lead data with personalized messages |
| Operation | Append rows |
| Target Sheet | "Leads" |

Data Mapping:

| Column | Data |
|--------|------|
| First Name | Lead's first name |
| Last Name | Lead's last name |
| Title | Job title |
| Company | Company name |
| LinkedIn URL | Profile link |
| Country | Location |
| Industry | Business sector |
| Date Added | Timestamp |
| Source | "Indeed + Apollo" |
| Personalized Message | AI-generated outreach text |

Function: Creates actionable lead database ready for outreach campaigns.
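
The column mapping above can be expressed as a small function whose keys match the sheet headers exactly. The input field names (`firstName`, `linkedinUrl`, `personalizedMessage`, and so on) and the `YYYY-MM-DD` date format are assumptions for this sketch.

```javascript
// Maps a lead object to a row keyed by the "Leads" sheet column headers.
// `now` is injectable so the date can be fixed in tests.
function toSheetRow(lead, now = new Date()) {
  return {
    "First Name": lead.firstName,
    "Last Name": lead.lastName,
    "Title": lead.title,
    "Company": lead.companyName,
    "LinkedIn URL": lead.linkedinUrl,
    "Country": lead.country,
    "Industry": lead.industry,
    "Date Added": now.toISOString().slice(0, 10), // UTC date, YYYY-MM-DD
    "Source": "Indeed + Apollo",
    "Personalized Message": lead.personalizedMessage,
  };
}
```

Keeping the keys identical to the header row means the Google Sheets node can map columns automatically instead of requiring a manual mapping for each field.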
Workflow Flow
1. Schedule Trigger
2. Scrape.do Indeed API: fetches job listings with JS rendering
3. Parse Indeed Jobs: extracts company names, job details
4. Add New Company: saves to Google Sheets (Companies)
5. Apollo Org Search: enriches company data
6. Extract Apollo Org Data: parses API response
7. Apollo People Search: finds decision-makers
8. Format Leads: structures lead profiles
9. Generate Personalized Message: AI creates custom outreach
10. Merge Lead + Message: combines all data
11. Save Leads to Sheet: final storage (Leads)

Configuration Requirements
API Keys & Credentials

| Credential | Purpose | Where to Get |
|------------|---------|--------------|
| Scrape.do API Token | Web scraping with anti-bot bypass | scrape.do/dashboard |
| Apollo.io API Key | B2B data enrichment | apollo.io/settings/integrations |
| OpenAI API Key | AI message generation | platform.openai.com |
| Google Sheets OAuth2 | Data storage | n8n Credentials Setup |

n8n Credential Setup

| Credential Type | Configuration |
|-----------------|---------------|
| HTTP Header Auth (Apollo) | Header: x-api-key, Value: Your Apollo API key |
| OpenAI API | API Key: Your OpenAI API key |
| Google Sheets OAuth2 | Complete OAuth flow with Google |

Key Features
Intelligent Job Scraping

- **Anti-Bot Bypass:** Residential proxy rotation via Scrape.do
- **JavaScript Rendering:** Full headless browser for dynamic content
- **Mobile Optimization:** Cleaner HTML with mobile viewport
- **Markdown Output:** Lightweight, easy-to-parse format

B2B Data Enrichment

- **Company Intelligence:** Industry, size, location, LinkedIn
- **Decision-Maker Discovery:** Title-based filtering
- **Contact Information:** Email, phone, LinkedIn profiles
- **Real-Time Data:** Fresh information from Apollo.io

AI-Powered Personalization

- **Contextual Messages:** References specific hiring activity
- **Character Limit:** Optimized for LinkedIn (300 chars)
- **Temperature Setting:** 0.7, balancing creativity and consistency
- **Role-Specific:** Tailored to recipient's title and company

Automated Data Management

- **Dual Sheet Storage:** Companies + Leads separation
- **Timestamp Tracking:** Historical records
- **Deduplication:** Prevents duplicate entries
- **Ready for Export:** CSV-compatible format

Use Cases
Sales Prospecting

- Identify companies actively hiring in your target market
- Find decision-makers at companies investing in growth
- Generate personalized cold outreach at scale
- Track pipeline from discovery to contact

Recruiting & Talent Acquisition

- Monitor competitor hiring patterns
- Identify companies building specific teams
- Connect with hiring managers directly
- Build talent pipeline relationships

Market Intelligence

- Track industry hiring trends
- Monitor competitor expansion signals
- Identify emerging market opportunities
- Benchmark salary ranges by role

Partnership Development

- Find companies investing in complementary areas
- Identify potential integration partners
- Connect with technical leadership
- Build strategic relationship pipeline

Technical Notes

| Specification | Value |
|---------------|-------|
| Processing Time | 2-5 minutes per run (depending on job count) |
| Jobs per Run | ~25 unique companies |
| API Calls per Run | 1 Scrape.do + 25 Apollo Org + 25 Apollo People + ~75 OpenAI |
| Data Accuracy | 90%+ for company matching |
| Success Rate | 99%+ with proper error handling |

Rate Limits to Consider

| Service | Free Tier Limit | Recommendation |
|---------|-----------------|----------------|
| Scrape.do | 1,000 credits/month | ~40 runs/month |
| Apollo.io | 100 requests/day | Add Wait nodes if needed |
| OpenAI | Based on usage | Monitor costs (~$0.01-0.05/run) |
| Google Sheets | 300 requests/minute | No issues expected |

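
The figures in these two tables can be cross-checked with a little arithmetic. The sketch below only restates numbers already given; the 25 credits per run is what "1,000 credits" and "~40 runs" imply together, not a confirmed Scrape.do price.

```javascript
// Scrape.do: credits per run implied by the free-tier row.
const scrapeDoCreditsPerMonth = 1000;
const runsPerMonth = 40;
const creditsPerRun = scrapeDoCreditsPerMonth / runsPerMonth; // 25

// Apollo: one org search plus one people search per company.
const companiesPerRun = 25;
const apolloCallsPerRun = companiesPerRun * 2; // 50
const apolloDailyLimit = 100;
const maxRunsPerDay = Math.floor(apolloDailyLimit / apolloCallsPerRun); // 2

// OpenAI: at most 3 leads per company, one message each.
const maxOpenAiCalls = companiesPerRun * 3; // 75

console.log({ creditsPerRun, apolloCallsPerRun, maxRunsPerDay, maxOpenAiCalls });
```

On the weekly schedule described earlier, the workflow runs about four or five times a month, well inside both the Scrape.do and Apollo free-tier figures.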
Setup Instructions
Step 1: Import Workflow
1. Copy the JSON workflow configuration
2. In n8n: Workflows → Import from JSON
3. Paste configuration and save

Step 2: Configure Scrape.do
1. Sign up at scrape.do
2. Navigate to Dashboard → API Token
3. Copy your token
4. Set it as the value of the token query parameter in the "Scrape.do Indeed API" node (the parameter itself is already configured)

To customize the search, change the url parameter in the "Scrape.do Indeed API" node:

- q=data+engineer (search term)
- l=Remote (location)
- fromage=7 (posted within the last 7 days)

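
If you prefer to generate the search URL rather than edit it by hand, a helper like the one below would do it. The function name and the `https://www.indeed.com/jobs` base URL are assumptions for the sketch; the three query parameters are the ones described above.

```javascript
// Builds an Indeed search URL from the three parameters described above.
function buildIndeedSearchUrl({ query, location, maxAgeDays }) {
  const params = new URLSearchParams({
    q: query,                    // search term
    l: location,                 // location
    fromage: String(maxAgeDays), // only postings from the last N days
  });
  return `https://www.indeed.com/jobs?${params.toString()}`;
}

console.log(buildIndeedSearchUrl({ query: "data engineer", location: "Remote", maxAgeDays: 7 }));
// prints "https://www.indeed.com/jobs?q=data+engineer&l=Remote&fromage=7"
```

`URLSearchParams` encodes the space in "data engineer" as `+`, which matches the `q=data+engineer` form shown above.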
Step 3: Configure Apollo.io
1. Sign up at apollo.io
2. Go to Settings → Integrations → API Keys
3. Create new API key
4. In n8n: Credentials → Add Credential → Header Auth
   - Name: x-api-key
   - Value: Your Apollo API key
5. Select this credential in both Apollo HTTP nodes

Step 4: Configure OpenAI
1. Go to platform.openai.com
2. Create new API key
3. In n8n: Credentials → Add Credential → OpenAI
4. Paste API key
5. Select credential in "Generate Personalized Message" node

Step 5: Configure Google Sheets
1. Create new Google Spreadsheet
2. Create two sheets:
   - Sheet 1: "Add New Company"
     Columns: companyName | jobTitle | jobUrl | location | salary | source | postedDate
   - Sheet 2: "Leads"
     Columns: First Name | Last Name | Title | Company | LinkedIn URL | Country | Industry | Date Added | Source | Personalized Message
3. Copy Sheet ID from URL
4. In n8n: Credentials → Add Credential → Google Sheets OAuth2
5. Update both Google Sheets nodes with your Sheet ID

Step 6: Test and Activate
1. Manual Test: Click "Execute Workflow" button
2. Verify Each Node: Check outputs step by step
3. Review Data: Confirm data appears in Google Sheets
4. Activate: Toggle workflow to "Active"

Error Handling
Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| "Invalid character: " | Empty/malformed company name | Check Parse Indeed Jobs output |
| "Node does not have credentials" | Credential not linked | Open node → Select credential |
| Empty Parse Results | Indeed HTML structure changed | Check Scrape.do raw output |
| Apollo Rate Limit (429) | Too many requests | Add 5-10s Wait node between calls |
| OpenAI Timeout | Too many tokens | Reduce batch size or max_tokens |
| "Your request is invalid" | Malformed JSON body | Verify expression syntax in HTTP nodes |

Troubleshooting Steps

1. Verify Credentials: Test each credential individually
2. Check Node Outputs: Use "Execute Node" for debugging
3. Monitor API Usage: Check Apollo and OpenAI dashboards
4. Review Logs: Check n8n execution history for details
5. Test with Sample: Use known company name to verify Apollo

Recommended Error Handling Additions
For production use, consider adding:
- IF node after Apollo Org Search to handle empty results
- Error Workflow trigger for notifications
- Wait nodes between API calls for rate limiting
- Retry logic for transient failures

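
For the retry item, one possible shape is a small wrapper with exponential backoff. This is a generic sketch rather than an n8n feature: `withRetry` is a made-up helper, and it assumes that failed calls throw an error carrying an HTTP `status` property. The 5-second base delay follows the Wait node suggestion in the Common Issues table.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries fn on rate limits (429) and server errors (5xx), doubling the wait each time.
// Assumption: errors expose the HTTP status code as err.status.
async function withRetry(fn, { retries = 3, baseDelayMs = 5000, sleep = defaultSleep } = {}) {
  let attempt = 0;
  while (true) {
    try {
      return await fn();
    } catch (err) {
      const status = err && err.status;
      const retryable = status === 429 || (status >= 500 && status < 600);
      // Give up on non-retryable errors or once the retry budget is spent.
      if (!retryable || attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 5s, 10s, 20s, ...
      attempt += 1;
    }
  }
}
```

The `sleep` option exists so the waiting can be replaced with a no-op in tests; in normal use the default timer-based sleep applies.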
Performance Specifications
| Metric | Value |
|--------|-------|
| Execution Time | 2-5 minutes per scheduled run |
| Jobs Discovered | ~25 per Indeed page |
| Leads Generated | 1-3 per company (based on title matches) |
| Message Quality | Professional, contextual, <300 chars |
| Data Freshness | Real-time from Indeed + Apollo |
| Storage Format | Google Sheets (up to Google's limit of 10 million cells per spreadsheet) |

API Reference
Scrape.do API

| Endpoint | Method | Purpose |
|----------|--------|---------|
| https://api.scrape.do | GET | Direct URL scraping |

Documentation: scrape.do/documentation
Apollo.io API

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /v1/organizations/search | POST | Company lookup |
| /v1/mixed_people/search | POST | People search |

Documentation: apolloio.github.io/apollo-api-docs
OpenAI API

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /v1/chat/completions | POST | Message generation |

Documentation: platform.openai.com