by Ranjan Dailata
Disclaimer

This template is only available on n8n self-hosted because it uses the MCP Client community node.

Who this is for?

The Extract, Transform LinkedIn Data with Bright Data MCP Server & Google Gemini workflow is an automated solution that scrapes LinkedIn content via the Bright Data MCP Server and then transforms the response using a Google Gemini LLM. The final output is sent as a webhook notification and also persisted to disk.

This workflow is tailored for:

- Data Analysts: Who require structured LinkedIn datasets for analytics and reporting.
- Marketing and Sales Teams: Looking to enrich lead databases, track company updates, and identify market trends.
- Recruiters and Talent Acquisition Specialists: Who want to automate candidate sourcing and company research.
- AI Developers: Integrating real-time professional data into intelligent applications.
- Business Intelligence Teams: Needing current and comprehensive LinkedIn data to drive strategic decisions.

What problem is this workflow solving?

Gathering structured and meaningful information from the web is traditionally slow, manual, and error-prone. This workflow handles:

- Reliable web scraping using the Bright Data MCP Server LinkedIn tools.
- LinkedIn person and company scraping with AI agents set up on top of the Bright Data MCP Server tools.
- Data extraction and transformation with the Google Gemini LLM.
- Persisting the LinkedIn person and company info to disk.
- Sending a webhook notification with the LinkedIn person and company info.

What this workflow does?

This n8n workflow performs the following steps:

1. Trigger: Start manually.
2. Input URL(s): Specify the LinkedIn person and company URLs.
3. Web Scraping (Bright Data): Use the Bright Data MCP Server LinkedIn tools to extract the person and company data.
4. Data Transformation & Aggregation: Use the Google Gemini LLM to transform the extracted data.
5. Store / Output: Save the results to disk and send a webhook notification.

Pre-conditions

- Knowledge of the Model Context Protocol (MCP) is essential. Please read this blog post: model-context-protocol
- You need a Bright Data account and the setup described in the Setup section below.
- You need a Google Gemini API key. Visit Google AI Studio.
- You need to install the Bright Data MCP Server @brightdata/mcp
- You need to install the n8n-nodes-mcp community node.

Setup

1. Set up n8n locally with MCP Servers by following the instructions at n8n-nodes-mcp.
2. Install the Bright Data MCP Server @brightdata/mcp on your local machine.
3. Sign up at Bright Data.
4. Navigate to Proxies & Scraping and create a new Web Unlocker zone named mcp_unlocker by selecting Web Unlocker API under Scraping Solutions.
5. In n8n, configure the Google Gemini (PaLM) API account with your Google Gemini API key (or access through Vertex AI or a proxy).
6. In n8n, configure the MCP Client (STDIO) credentials to connect with the Bright Data MCP Server, as shown in the credential sketch at the end of this template description. Make sure to copy the Bright Data API_TOKEN into the Environments textbox as API_TOKEN=<your-token>.
7. Update the LinkedIn person and company URLs in the workflow.
8. Update the Webhook HTTP Request node with the webhook endpoint of your choice.
9. Update the file name and path used to persist the output to disk.

How to customize this workflow to your needs

- Different Inputs: Instead of static URLs, accept URLs dynamically via webhook or form submissions.
- Data Extraction: Modify the LinkedIn Data Extractor node with a suitable prompt to format the data as you wish.
- Outputs: Update the webhook endpoints to send the response to Slack channels, Airtable, Notion, CRM systems, etc.
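For reference, here is a minimal sketch of what the MCP Client (STDIO) credential typically contains for the Bright Data MCP Server. It assumes the default npx launch command and the environment variable names from the @brightdata/mcp README; verify both against the current README before use, and note that WEB_UNLOCKER_ZONE is optional.

```
# Sketch of the MCP Client (STDIO) credential values (verify against the @brightdata/mcp README).
# Command and arguments used to launch the server:
npx -y @brightdata/mcp

# Environments textbox (one variable per line):
# API_TOKEN=<your-bright-data-api-token>
# WEB_UNLOCKER_ZONE=mcp_unlocker   # optional; points at the zone created during setup
```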
by Ranjan Dailata
Disclaimer

This template is only available on n8n self-hosted because it uses the MCP Client community node.

Who this is for?

The Scrape Web Data with Bright Data and MCP Automated AI Agent workflow is built for professionals who need to automate large-scale, intelligent data extraction using the Bright Data MCP Server and Google Gemini.

This solution is ideal for:

- Data Analysts - Who require structured, enriched datasets for analysis and reporting.
- Marketing Researchers - Seeking fresh market intelligence from dynamic web sources.
- Product Managers - Who want competitive product and feature insights from various websites.
- AI Developers - Aiming to feed web data into downstream machine learning models.
- Growth Hackers - Looking for high-quality data to fuel campaigns, research, or strategic targeting.

What problem is this workflow solving?

Manually scraping websites, cleaning raw HTML data, and generating useful insights from it can be slow, error-prone, and non-scalable. This workflow solves these problems by:

- Automating complex web data extraction through Bright Data's MCP Server.
- Reducing the human effort needed for cleaning, parsing, and analyzing unstructured web content.
- Allowing seamless integration into further automation processes.

What this workflow does?

This n8n workflow performs the following steps:

1. Trigger: Start manually.
2. Input URL(s): Specify the URL to scrape.
3. Web Scraping (Bright Data): Use Bright Data's MCP Server tools to scrape the web data in markdown and HTML formats.
4. Store / Output: Save the results to disk and send a webhook notification.

Setup

1. Set up n8n locally with MCP Servers by following the instructions at n8n-nodes-mcp.
2. Install the Bright Data MCP Server @brightdata/mcp on your local machine.
3. Sign up at Bright Data.
4. Navigate to Proxies & Scraping and create a new Web Unlocker zone named mcp_unlocker by selecting Web Unlocker API under Scraping Solutions.
5. In n8n, configure the Google Gemini (PaLM) API account with your Google Gemini API key (or access through Vertex AI or a proxy).
6. In n8n, configure the MCP Client (STDIO) credentials to connect with the Bright Data MCP Server. Make sure to copy the Bright Data API_TOKEN into the Environments textbox as API_TOKEN=<your-token>.
7. Update the input URL(s) in the workflow.
8. Update the Webhook HTTP Request node with the webhook endpoint of your choice.
9. Update the file name and path used to persist the output to disk.

How to customize this workflow to your needs

- Different Inputs: Instead of static URLs, accept URLs dynamically via webhook or form submissions (see the install sketch below for getting the MCP Client node in place first).
- Outputs: Update the webhook endpoints to send the response to Slack channels, Airtable, Notion, CRM systems, etc.
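If you prefer installing the MCP Client community node from the command line rather than through Settings > Community Nodes, the steps are roughly as follows. This is a sketch: the install path assumes a default local n8n setup, and the N8N_COMMUNITY_PACKAGES_ALLOW_TOOL_USAGE variable is only needed when the MCP Client is used as an AI Agent tool, so check the n8n-nodes-mcp README for your environment.

```
# Install the MCP Client community node into the default custom-nodes folder
# (adjust the path if your n8n instance stores user data elsewhere).
cd ~/.n8n/nodes
npm install n8n-nodes-mcp

# Allow community nodes to be used as AI Agent tools, then restart n8n.
export N8N_COMMUNITY_PACKAGES_ALLOW_TOOL_USAGE=true
n8n start
```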
by Ranjan Dailata
Disclaimer

This template is only available on n8n self-hosted because it uses the MCP Client community node.

Who this is for?

The Chat Conversations with Bright Data MCP Search Engines & Google Gemini workflow is designed for users who need real-time, AI-enhanced conversations powered by live search engine results.

This workflow is tailored for:

- Data Analysts - Who want live, search-based data fused with AI reasoning.
- Marketing Researchers - Seeking up-to-the-minute market or competitor insights via conversational AI.
- Product Managers - Exploring user needs, market trends, and competitor analysis in real time.
- AI Developers - Building dynamic applications that combine live search data with intelligent conversation agents.
- Growth Hackers - Who need fast, conversational research tools for campaign ideation, outreach, or content creation.

What problem is this workflow solving?

Traditional chatbots and AI systems often rely on static, outdated data. This workflow enables AI agents to fetch live search engine data and converse intelligently about it, making interactions dynamic, accurate, and highly contextual. It closes three major gaps:

- Outdated Knowledge: Regular chatbots lack up-to-date information from live web searches.
- Manual Search Fatigue: Manually searching for information and interpreting it is time-consuming.
- Context Bridging: Connecting search results into meaningful, conversational replies requires human-level reasoning.

What this workflow does?

1. Accepts a user's conversational query input.
2. Triggers a search request to Bright Data's MCP Search Engines API (Google, Bing, etc.) based on the query.
3. Waits for the search task to complete.
4. Retrieves the real-time search results.
5. Feeds the search results and the original question into Google Gemini.
6. Generates a human-like, contextually accurate AI response combining live information and conversational flow.
7. Outputs the response back into a chat app.

Pre-conditions

- Knowledge of the Model Context Protocol (MCP) is essential. Please read this blog post: model-context-protocol
- You need a Bright Data account and the setup described in the Setup section below.
- You need a Google Gemini API key. Visit Google AI Studio.
- You need to install the Bright Data MCP Server @brightdata/mcp
- You need to install the n8n-nodes-mcp community node.

Setup

1. Set up n8n locally with MCP Servers by following the instructions at n8n-nodes-mcp.
2. Install the Bright Data MCP Server @brightdata/mcp on your local machine. Also complete the "Account Setup" mentioned at the @brightdata/mcp URL.
3. Sign up at Bright Data.
4. Navigate to Proxies & Scraping and create a new Web Unlocker zone by selecting Web Unlocker API under Scraping Solutions.
5. In n8n, configure the Google Gemini (PaLM) API account with your Google Gemini API key (or access through Vertex AI or a proxy).
6. In n8n, configure the MCP Client (STDIO) credentials to connect with the Bright Data MCP Server. Make sure to copy the Bright Data Web Unlocker API token into the Environments textbox as API_TOKEN=<your-token>.
7. Update the HTTP Request for Webhook Notification node with the endpoint that should receive the chat responses (see the example call after this section).

How to customize this workflow to your needs

- Change Search Engine: Add or remove the search engine MCP tools based on the Bright Data MCP Server updates.
- Expand Outputs: Send AI chat responses to Slack, Discord, custom chat UIs, WhatsApp, or CRM systems.
- Store conversation logs in a database (PostgreSQL, MongoDB, etc.) for future audits or training.
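To make the notification step concrete, here is a hypothetical test call against the endpoint configured in the HTTP Request for Webhook Notification node. The URL and payload fields are placeholders rather than the workflow's actual schema; adapt them to whatever your chat app or downstream system expects.

```
# Hypothetical webhook test: simulate the chat-response notification the workflow sends.
# Replace the URL and field names with the ones used by your receiving system.
curl -X POST "https://example.com/webhooks/chat-response" \
  -H "Content-Type: application/json" \
  -d '{
        "query": "latest developments in AI regulation",
        "search_engine": "google",
        "ai_response": "Here is a summary of the most recent coverage..."
      }'
```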
by Ranjan Dailata
Who this is for?

The Structured Data Extract & Data Mining workflow is crafted for researchers, content analysts, SEO strategists, and AI developers who need to transform semi-structured web data (like markdown content or scraped HTML) into actionable structured datasets. It is ideal for:

- **Content Analysts** - Organizing and mining large volumes of markdown or HTML content.
- **SEO & Trend Researchers** - Exploring topics by location and category.
- **AI Engineers & NLP Developers** - Looking to automate insight extraction from unstructured inputs.
- **Growth Marketers** - Tracking topic-level trends for strategic campaigns.
- **Automation Specialists** - Streamlining workflows from scrape to storage.

What problem is this workflow solving?

Extracting insights from markdown or HTML documents typically requires manual review, formatting, and parsing. This becomes unscalable when dealing with large datasets or when a real-time response is needed. Additionally, trend and topic extraction usually involves external tools, custom scripts, and inconsistent formatting.

This workflow solves:

- Automatic text extraction from markdown or structured content.
- Location- and category-based trend mining with semantic grouping.
- AI-driven topic extraction and summarization.
- Real-time notification via webhook with rich structured payloads.
- Persistent storage of mined data to disk for audits or further processing.

What this workflow does

1. Receives input: Sets the URL for the data extraction and analysis.
2. Uses Bright Data's Web Unlocker to extract content from the relevant sites (see the example request after this section).
3. A Markdown/Text Extractor node parses the content into clean plaintext.
4. The cleaned data is passed to Google Gemini to:
   - Identify trends by location and category
   - Extract key topics and themes
   - Format the response into structured JSON
5. The structured insights are sent via webhook notification to external systems (e.g., Slack, web apps, Zapier).
6. The final output is saved to disk.

Setup

1. Sign up at Bright Data.
2. Navigate to Proxies & Scraping and create a new Web Unlocker zone by selecting Web Unlocker API under Scraping Solutions.
3. In n8n, configure a Header Auth account under Credentials (Generic Auth Type: Header Authentication). Set the Value field to Bearer XXXXXXXXXXXXXX, replacing XXXXXXXXXXXXXX with your Web Unlocker API token.
4. Get a Google Gemini API key (or access through Vertex AI or a proxy).
5. Update the Set URL and Bright Data Zone node with the brand content URL and the Bright Data zone name.
6. Update the Webhook HTTP Request node with the webhook endpoint of your choice.

How to customize this workflow to your needs

- **Update Source**: Update the workflow input to read from Google Sheets or Airtable to dynamically track multiple brands or topics.
- **Gemini Prompt Customization**:
  - Extract trends within a custom category (e.g., e-commerce design patterns in the US)
  - Output topics with popularity metrics
  - Structure the output as per your database schema (e.g., [{ topic, trend_score, location }])
- **Webhook Output**: Send notifications to:
  - Slack - with AI summaries in rich blocks
  - Internal APIs - for use in dashboards
  - Zapier/Make - for multi-step automation
- **Persistence**: Save output to:
  - Remote FTP or SFTP storage
  - Amazon S3, Google Cloud Storage, etc.
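The Web Unlocker call made by the HTTP Request node generally looks like the sketch below. The endpoint and field names follow Bright Data's Web Unlocker API documentation, but the zone name and target URL are placeholders, so double-check both against your Bright Data control panel.

```
# Example Web Unlocker request (verify the endpoint and fields against Bright Data's API docs).
# "zone" is the Web Unlocker zone you created; "format": "raw" returns the rendered page HTML.
curl -X POST "https://api.brightdata.com/request" \
  -H "Authorization: Bearer <your-web-unlocker-api-token>" \
  -H "Content-Type: application/json" \
  -d '{
        "zone": "<your-web-unlocker-zone>",
        "url": "https://example.com/page-to-analyze",
        "format": "raw"
      }'
```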
by Akhil Varma Gadiraju
n8n Workflow: Sync Workflows with GitLab

How It Works

This workflow ensures that your self-hosted n8n workflows are version-controlled in a GitLab repository. It compares each current workflow from n8n with its stored counterpart in GitLab. If any differences are detected, the GitLab file is updated with the latest version.

Core Logic:

1. Retrieve Workflows - Fetch all workflows from the n8n REST API.
2. Compare with GitLab - For each workflow, fetch the corresponding file from GitLab and compare the JSON.
3. Update if Changed - If differences exist, commit the updated workflow to GitLab using its API (see the example request after this section).

Setup

Before using the workflow, ensure the following:

Prerequisites:

- **n8n**: Self-hosted instance with access to the /rest/workflows API.
- **GitLab**: A repository where workflows will be stored, and a Personal Access Token (PAT) with api and write_repository permissions.
- **n8n Nodes Required**:
  - HTTP Request (to call the n8n and GitLab APIs)
  - Code or Function nodes (for diffing and formatting)
  - Looping (SplitInBatches or similar)

Configuration:

Set environment variables or workflow credentials for:

- GITLAB_TOKEN
- GITLAB_REPO
- GITLAB_BRANCH (e.g., main)
- GITLAB_FILE_PATH_PREFIX (e.g., n8n-workflows/)

How to Use

1. Import the Workflow into your n8n instance.
2. Configure GitLab API Credentials: Set the GitLab PAT as a header in the HTTP Request node: Private-Token: {{ $env.GITLAB_TOKEN }}
3. Map Workflows to GitLab Paths: Use the workflow name or ID to create the file path. Example: n8n-workflows/workflow-name.json
4. Trigger the Workflow: It can be triggered manually or scheduled to run at intervals (e.g., daily).
5. Review Commits in GitLab: Each updated workflow will be committed with a message like "Update workflow: Sample Workflow".

Disclaimer

- This workflow does not handle merge conflicts or manual edits made directly in GitLab. Always ensure proper coordination if multiple sources are modifying workflows.
- Only structural changes are tracked. Non-functional metadata (like timestamps or IDs) may trigger false positives unless filtered.
- Use at your own risk. Test in a safe environment before applying it to production workflows.
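Under the hood, the "Update if Changed" step corresponds to a call to GitLab's Repository Files API, roughly like the sketch below. The project ID and file path are placeholders, the file path in the URL must be URL-encoded, and the content shown is a trimmed stand-in for the full workflow JSON.

```
# Commit an updated workflow file to GitLab (Repository Files API).
# The file path segment is URL-encoded (n8n-workflows/sample-workflow.json).
curl -X PUT \
  "https://gitlab.com/api/v4/projects/<project-id>/repository/files/n8n-workflows%2Fsample-workflow.json" \
  -H "Private-Token: $GITLAB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "branch": "main",
        "commit_message": "Update workflow: Sample Workflow",
        "content": "{ \"name\": \"Sample Workflow\", \"nodes\": [], \"connections\": {} }"
      }'
```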
by Joseph
(Image Generation → Hosting → Video Generation)

This workflow is designed for creators, automation enthusiasts, and indie hackers who want to generate image-based videos automatically using AI tools at a low cost.

⚙️ Workflow Overview

This automation performs the following steps:

1. Trigger (schedule or manual)
2. Generate an image using Flux (choose between two APIs)
3. Upload the image to Kraken.io to get a public URL
4. Send the image to Runway ML (choose between two APIs) to generate a video
5. Receive the video as a URL, ready for posting, download, or further automation

🛠️ Step-by-Step Setup

🖼️ Flux (Image Generation)

You can use either of the following providers:

Option 1: Flux by BlackForest Labs (Direct API)
- 🔑 Get your API key here: https://docs.bfl.ml/
- Paste your API key in the HTTP Request node named Flux (Blackforest)
- You can customize prompts or styles inside the JSON body

Option 2: Flux via RapidAPI
- 🔑 Subscribe and get your key here: https://rapidapi.com/poorav925/api/ai-text-to-image-generator-flux-free-api/playground/apiendpoint_e38039ee-1912-4ef9-b4d4-270d72fca851
- Enter your RapidAPI key in the X-RapidAPI-Key header
- Optional: tweak prompts, style, or resolution inside the JSON body

🐙 Kraken.io (Hosting the Image Publicly)

Runway ML requires the image to be publicly accessible. We use Kraken.io to host the generated image and return a public URL.

🔑 Get your API credentials: https://kraken.io/account/api-credentials

Setup:
- Copy your API Key and API Secret
- Open the Kraken Upload node in n8n
- Replace the placeholders with your credentials
- The node uploads your image and returns a public image URL for Runway to use (see the example request after this section)

🎬 RunwayML (Video Generation)

You also have two options here:

Option 1: Runway Official API
- 🔑 Get your credentials at: https://dev.runwayml.com/
- Use the public image URL from Kraken in the JSON body
- Paste your Bearer token in the Authorization header
- Customize other settings like video length, style, FPS, etc.

Option 2: Runway via RapidAPI
- 🔑 Subscribe and get your key here: https://rapidapi.com/fortunehoppers/api/runwayml/playground/apiendpoint_93c8554d-8097-40cd-8252-3d4dec9c0e68
- Paste your RapidAPI key in the request header
- Customize the prompt and generation options in the body
- Use the Kraken-generated image URL as the input source

📤 What to Do with the Video

Once the video is generated, you'll get a direct video URL. You can:

- Save it to Google Sheets or Notion
- Send it via email
- Trigger a YouTube upload automation
- Or download it manually for editing and reposting

💡 Optional Tips & Notes

- You can schedule this workflow to generate AI videos daily or weekly
- Combine it with a Google Sheet of prompts for bulk automation
- Try using a consistent visual style or theme for better branding
- This workflow is lightweight and affordable, perfect for indie projects or experimental content generation
- Great for shorts, quote visuals, music loops, AI art promos, etc.

🔗 Resources

- Flux (Blackforest) Docs
- Flux on RapidAPI
- RunwayML Official Docs
- Runway on RapidAPI
- Kraken.io API Dashboard

🙋 Need Help?

Feel free to reach out:
- 🐦 Twitter: @juppfy
- 📧 Email: joseph@uppfy.com

If you'd like to hire me for custom n8n workflows or product automations, don't hesitate to get in touch.
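As a reference for the Kraken Upload step, here is a minimal sketch of the Kraken.io upload call that returns a publicly hosted image URL. Field names follow Kraken.io's API documentation ("wait": true makes the response synchronous and include the hosted kraked_url), but verify them against the docs and your account before relying on this.

```
# Upload the generated image to Kraken.io; the JSON response should include the public
# "kraked_url" that the Runway nodes can use as their image input.
curl -X POST "https://api.kraken.io/v1/upload" \
  -F 'data={"auth":{"api_key":"<your-api-key>","api_secret":"<your-api-secret>"},"wait":true}' \
  -F "upload=@generated-image.png"
```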
by Max aka Mosheh
How it works:

The n8n flow grabs the needed IDs, fetches the record's current links, adds your new one, and sends a single HTTP request to NocoDB to update the record's linked entries (see the example request after this section).

Set up steps:

Plan for about 10 minutes of setup if you're already running n8n and NocoDB. You'll need to copy/paste table IDs, set up your HTTP node, and test once. No coding, just copy IDs.
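For orientation, the final HTTP call usually maps onto NocoDB's linked-records endpoint along the lines of the sketch below. This assumes NocoDB's v2 REST API; the host, table ID, link field ID, and record IDs are placeholders, and the exact path can differ between NocoDB versions, so confirm it in your instance's API docs (Swagger).

```
# Link record 42 in the parent table to record 7 of the related table via the given link field.
# All IDs and the host are placeholders taken from your own NocoDB instance.
curl -X POST \
  "https://<your-nocodb-host>/api/v2/tables/<tableId>/links/<linkFieldId>/records/42" \
  -H "xc-token: <your-nocodb-api-token>" \
  -H "Content-Type: application/json" \
  -d '[{"Id": 7}]'
```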
by scrapeless official
AI-Powered Web Data Pipeline with n8n

How It Works

This n8n workflow builds an AI-powered web data pipeline that automates the entire process of:

- **Extraction**
- **Structuring**
- **Vectorization**
- **Storage**

It integrates multiple advanced tools to transform messy web pages into clean, searchable vector databases.

Integrated Tools

- **Scrapeless**: Bypasses JavaScript-heavy websites and anti-bot protections to reliably extract HTML content.
- **Claude AI**: Uses LLMs to analyze unstructured HTML and generate clean, structured JSON data.
- **Ollama Embeddings**: Generates local vector embeddings from structured text using the all-minilm model.
- **Qdrant Vector DB**: Stores semantic vector data for fast and meaningful search capabilities.
- **Webhook Notifications**: Sends real-time updates when workflows complete or errors occur.

From messy web pages to structured vector data, this pipeline is perfect for building intelligent agents, knowledge bases, or research automation tools.

Setup Steps

1. Install n8n

> Requires Node.js v18 / v20 / v22

```
npm install -g n8n
n8n
```

After installation, access the n8n interface at http://localhost:5678

2. Set Up Scrapeless

- Register at: Scrapeless
- Copy your API token
- Paste the token into the HTTP Request node labeled "Scrapeless Web Request"

3. Set Up Claude API (Anthropic)

- Sign up at the Anthropic Console
- Generate your Claude API key
- Add the API key to the following nodes: Claude Extractor, AI Data Checker, Claude AI Agent

4. Install and Run Ollama

```
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows: download the installer from https://ollama.com

# Start the Ollama server
ollama serve

# Pull the embedding model
ollama pull all-minilm
```

5. Install Qdrant (via Docker)

```
docker pull qdrant/qdrant
docker run -d \
  --name qdrant-server \
  -p 6333:6333 -p 6334:6334 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant

# Test if Qdrant is running:
curl http://localhost:6333/healthz
```

6. Configure the n8n Workflow

- Modify the trigger (manual or scheduled)
- Input your target URLs and collection name in the designated nodes
- Paste all required API tokens / keys into their corresponding nodes
- Ensure your Qdrant and Ollama services are running (see the sanity checks after this section)

Ideal Use Cases

- Custom AI Chatbots
- Private Search Engines
- Research Tools
- Internal Knowledge Bases
- Content Monitoring Pipelines
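Before the first run, it can help to verify that the two local services respond as expected. The checks below are a sketch: the Ollama endpoint and the Qdrant collection call follow their public APIs, but the collection name is a placeholder, and the 384-dimension setting assumes the all-minilm embedding size.

```
# 1. Confirm the local embedding model answers (Ollama embeddings API):
curl http://localhost:11434/api/embeddings \
  -d '{"model": "all-minilm", "prompt": "test sentence"}'

# 2. Create/verify a Qdrant collection sized for all-minilm vectors (384 dimensions):
curl -X PUT "http://localhost:6333/collections/web_data" \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 384, "distance": "Cosine"}}'
```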
by InfraNodus
This template can be used to find the content gaps in your competitors' discourse: identifying the topics they are not yet connecting and giving you an opportunity to fill in this gap with your content and product ideas. It will also generate research questions that help bridge the gaps and generate new ideas.

The template showcases the use of multiple n8n nodes and processes:

- enriching a Google Sheets file with new data
- data extraction
- content enhancement using the GraphRAG approach
- content gap / research question generation

This approach can be very useful for research, marketing, and SEO applications, as you can quickly get an overview of the main topics that are available online for a certain niche and understand what is missing.

What are Content Gaps in Marketing and SEO?

In the context of SEO, content gaps are usually understood as the topics that your competitors rank for but you do not. However, it's hard to rank for these topics because the competition is very high. A much more effective approach is to identify the gaps between the topics your competitors are talking about that are not yet bridged in their discourse. If you address these gaps in your content, you will increase the informational gain your content offers and also provide a novel perspective while touching upon the topics that are relevant in your field.

For example, if we analyze the top websites for "body and physical practices, fitness, etc." we will see that most of them talk about the health and fitness aspects, and another big topic is the community aspect. However, there is a gap between the two topics, which means that most of the websites (companies) in this niche don't mention the two in the same context. This might be an opportunity: bridging the gap between health and fitness while also emphasizing the community aspect that comes with a collective practice.

How it works

This template consists of two stages:

1. Data enrichment of a Google Sheets file containing a list of your competitors, using InfraNodus GraphRAG to generate topical summaries and graph summaries for every URL you're analyzing.
2. Insight generation, using InfraNodus to identify the main topical clusters and gaps in those summaries; these insights are then added to the Google Sheets file.

Additionally, it contains a sub-workflow that you can activate and launch to ask a Perplexity model to conduct market research, find the companies that operate in your field, and populate the original Google Sheets file.

Here's a description step by step:

- Step 0: Populate the Google Sheets file with the company data (either manually, using the sub-workflow provided, or via Manus AI / Deep Research)
- Steps 1-2: Trigger and launch the workflow, extracting the company URL from the Google Sheets row
- Step 3: Scrape the URL content from the companies' websites and clean the data
- Steps 5-7: Use the InfraNodus GraphRAG Content Enhancer to get a topical summary and a graph summary
- Steps 8-10: Use InfraNodus AI to generate insight advice and research questions based on the content gaps

How to use

You need an InfraNodus GraphRAG API account and key to use this workflow.

1. Create an InfraNodus account.
2. Get the API key at https://infranodus.com/api-access and create a Bearer authorization key for the InfraNodus HTTP nodes.
3. Create a separate knowledge graph for each expert (using the PDF / content import options) in InfraNodus.
4. For each graph, go to the workflow and paste the name of the graph into the body name field. Keep the other settings intact, or learn more about them on the InfraNodus access points page.
5. Once you add one or more graphs as experts to your flow, add the LLM key to the OpenAI node and launch the workflow.

Requirements

- An InfraNodus account and API key
- A Google Sheets account and an authorization key

Note: an OpenAI key is not required. But you might want to get a Perplexity AI key if you'd like to use the sub-workflow that populates the Google Sheets file with your competitors' website addresses (if you don't have this list yet).

Customizing this workflow

You can use this same workflow with a Telegram bot or Slack (to be notified of the summaries and ideas). You can also hook up automated social media content creation workflows at the end of this template, so you can generate posts that are relevant (covering the important topics in your niche) but also novel (because they connect them in a new way).

Check out our n8n templates for ideas at https://n8n.io/creators/infranodus/

Check out the complete guide at https://support.noduslabs.com/hc/en-us/articles/20234254556828-Find-Content-Gaps-in-Websites-Market-Research-and-SEO-n8n-Workflow

Also check the full tutorial with a conceptual explanation at https://support.noduslabs.com/hc/en-us/articles/20454382597916-Beat-Your-Competition-Target-Their-Content-Gaps-with-this-n8n-Automation-Workflow

Also check out the video tutorial with a demo.

For support and help with this workflow, please contact us at https://support.noduslabs.com
by InfraNodus
This template can be used to upload the files in your Google Drive to an InfraNodus knowledge graph. The InfraNodus graph will then reveal the main topics and ideas in your collection of documents and show the content gaps in them. You can also use the built-in AI to converse with the documents.

You can also access the InfraNodus graphs via the GraphRAG API to re-use them in your other n8n workflows for high-quality content retrieval and knowledge base optimization.

The template showcases the use of multiple n8n nodes and processes:

- Extracting documents from a Google Drive folder
- Text extraction
- Optional: high-quality PDF conversion using ConvertAPI
- InfraNodus knowledge graph generation

Note: If you want to sync your Google Drive to an InfraNodus graph, check out our other workflow.

How it works

Here's a description of this workflow step by step:

1. Find all the files in a specific Google Drive folder.
2. For each file found, the workflow iterates and identifies the type of the file (TXT, PDF, Markdown).
3. For TXT and Markdown files, extract the text data.
4. For PDF files, use a special PDF-to-text converter to extract the text data (optional: use ConvertAPI for better-quality PDF conversion).
5. Forward everything to the InfraNodus graphAndStatements API endpoint with the name of the new graph, the text field with the text data, the text settings, and doNotSave=false to create a new graph (see the sketch after this section).
6. Repeat for the next file.

How to use

You need an InfraNodus GraphRAG API account and key to use this workflow.

1. Create an InfraNodus account.
2. Get the API key at https://infranodus.com/api-access and create a Bearer authorization key for the InfraNodus HTTP nodes.
3. Use that API key to set up authorization for the InfraNodus tool in the workflow.
4. If you want to upload the files to an existing graph, copy its name from InfraNodus. Otherwise, you can specify any name you want.

Requirements

- An InfraNodus account and API key
- A Google Drive account and authorization (you will need to set it up via Google Cloud using the n8n instructions provided in the Google Drive node)

Customizing this workflow

You can use Dropbox instead of Google Drive. You can also modify this workflow slightly to make it sync with a Google Drive folder when new files appear in it.

Check out the complete guide at https://support.noduslabs.com/hc/en-us/articles/20267019838108-Upload-Sync-Your-Google-Drive-Folder-with-InfraNodus-using-n8n
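For orientation, the graphAndStatements call described above has roughly the shape sketched below. The exact endpoint path and parameter names should be taken from your InfraNodus API access page (https://infranodus.com/api-access); the graph name and text here are placeholders.

```
# Illustrative call shape only -- confirm the endpoint and parameters on your InfraNodus
# API access page. "name" is the target graph, "text" carries the extracted file text,
# and doNotSave=false tells InfraNodus to persist the statements into the graph.
curl -X POST "https://infranodus.com/api/v1/graphAndStatements?doNotSave=false" \
  -H "Authorization: Bearer <your-infranodus-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "my_drive_documents",
        "text": "Extracted text from the TXT, Markdown, or PDF file goes here..."
      }'
```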
by Ranjan Dailata
Notice

Community nodes can only be installed on self-hosted instances of n8n.

Who this is for

The Brave Search Structured Data Extractor workflow is designed for professionals and teams that need high-quality, structured insights from Brave search results in real time. Whether you're performing market research, tracking competitors, training AI models, or powering content engines, this workflow offers a robust and automated solution.

This workflow is tailored for:

- Market Researchers - Who analyze trends across multimedia channels
- AI Developers - Who require clean, structured datasets for model fine-tuning
- SEO & Content Analysts - Looking to monitor visibility across news, images, and videos
- Media Researchers - Curating timely and relevant information across formats
- Automation Engineers - Integrating search insights into downstream workflows

What problem is this workflow solving?

Traditional web scraping and search result parsing is fragmented, inconsistent, and prone to errors, especially when dealing with multimedia (images, videos, news) data from search engines. This workflow provides:

- Centralized Brave search data extraction across all content types
- Search execution that switches based on the configured search type (e.g., news, images, videos, all)
- Automated structured data transformation using Google Gemini
- Unified output persistence and notification across disk, webhook, and Google Sheets

What this workflow does

Input Configuration
- Define your Brave search query
- Set the search type: videos, images, news, or all
- Configure your Bright Data MCP zone

Bright Data MCP Search Execution
- Initiates a Brave search via Bright Data MCP using the correct URL pattern for each search type (see the URL pattern sketch after this template description)
- Returns the raw HTML of the search results

Google Gemini LLM Structured Data Extraction
- Transforms the raw results into structured data (e.g., title, URL, source, snippet)

Output Handling
- Save to disk (e.g., JSON or CSV file)
- Send a webhook notification with the structured data (e.g., Slack, internal dashboards)
- Store in Google Sheets for team-wide access or dashboarding

Pre-conditions

- Knowledge of the Model Context Protocol (MCP) is essential. Please read this blog post: model-context-protocol
- You need a Bright Data account and the setup described in the Setup section below.
- You need a Google Gemini API key. Visit Google AI Studio.
- You need to install the Bright Data MCP Server @brightdata/mcp
- You need to install the n8n-nodes-mcp community node.

Setup

1. Set up n8n locally with MCP Servers by following the instructions at n8n-nodes-mcp.
2. Install the Bright Data MCP Server @brightdata/mcp on your local machine.
3. Sign up at Bright Data.
4. Navigate to Proxies & Scraping and create a new Web Unlocker zone named mcp_unlocker by selecting Web Unlocker API under Scraping Solutions.
5. In n8n, configure the Google Gemini (PaLM) API account with your Google Gemini API key (or access through Vertex AI or a proxy).
6. In n8n, configure the MCP Client (STDIO) credentials to connect with the Bright Data MCP Server. Make sure to copy the Bright Data API_TOKEN into the Environments textbox as API_TOKEN=<your-token>.

How to customize this workflow to your needs

- Enhance Output Analysis: Add additional LLM prompts for topic classification, sentiment scoring, or trend forecasting.
- Output Format Options: Choose to output CSV, Markdown, or HTML reports based on your integration target.
- Schedule Automation: Trigger the workflow on a schedule (daily/weekly) to keep monitoring topical content.
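The per-type URL patterns mentioned above generally map onto Brave's public search paths. The patterns below are a sketch for reference (query parameters can change, so verify them against search.brave.com); in the workflow these URLs are handed to the Bright Data MCP scraping tool rather than fetched directly, which is what avoids bot-protection blocks.

```
# Brave search URL patterns, one per configured search type (illustrative):
#   all:    https://search.brave.com/search?q=<query>
#   news:   https://search.brave.com/news?q=<query>
#   images: https://search.brave.com/images?q=<query>
#   videos: https://search.brave.com/videos?q=<query>
# A direct fetch (shown for illustration only) may be blocked; the workflow routes the URL
# through the Bright Data MCP scraping tool instead.
curl -s "https://search.brave.com/news?q=n8n+automation" -o brave_news_results.html
```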
by InfraNodus
This template can be used to sync the files in your Google Drive to a new or existing InfraNodus knowledge graph. The InfraNodus graph will then reveal the main topics and ideas in your collection of documents and show the content gaps in them. You can also use the built-in AI to converse with the documents.

You can also access the InfraNodus graphs via the GraphRAG API to re-use them in your other n8n workflows for high-quality content retrieval and knowledge base optimization.

The template showcases the use of multiple n8n nodes and processes:

- Syncing documents from a Google Drive folder / extracting them
- Text extraction from files
- Optional: high-quality PDF conversion using ConvertAPI
- InfraNodus knowledge graph generation

Note: If you want to upload files from your Google Drive to an InfraNodus graph, check out our other workflow.

How it works

Here's a description of this workflow step by step:

1. Wait for new file(s) to appear in the Google Drive folder.
2. Iterate through each file.
3. Retrieve the new file from Google Drive.
4. For each file found, the workflow identifies the type of the file (TXT, PDF, Markdown).
5. For TXT and Markdown files, extract the text data.
6. For PDF files, use a special PDF-to-text converter to extract the text data (optional: use ConvertAPI for better-quality PDF conversion).
7. Forward everything to the InfraNodus graphAndStatements API endpoint with the name of the new graph, the text field with the text data, the text settings, and doNotSave=false to create a new graph.
8. Repeat for the next file.

How to use

You need an InfraNodus GraphRAG API account and key to use this workflow.

1. Create an InfraNodus account.
2. Get the API key at https://infranodus.com/api-access and create a Bearer authorization key for the InfraNodus HTTP nodes.
3. Use that API key to set up authorization for the InfraNodus tool in the workflow.
4. If you want to upload the files to an existing graph, copy its name from InfraNodus. Otherwise, you can specify any name you want.

Requirements

- An InfraNodus account and API key
- A Google Drive account and authorization (you will need to set it up via Google Cloud using the n8n instructions provided in the Google Drive node)

Customizing this workflow

You can use Dropbox instead of Google Drive. You can also modify this workflow slightly to make it upload the files from a Google Drive folder when new files appear in it.

Check out the complete guide at https://support.noduslabs.com/hc/en-us/articles/20267019838108-Upload-Sync-Your-Google-Drive-Folder-with-InfraNodus-using-n8n