by MANISH KUMAR
Shopify AI Automation Image-to-Product CSV Bulk Upload Automation This Shopify AI automation is an advanced n8n-powered workflow that converts raw product images into a Shopify-ready product CSV. It uses AI image analysis, Google Drive, Google Sheets, and Shopify APIs to fully automate product onboarding — from images to structured ecommerce data. Built for scalable ecommerce automation, this workflow is especially effective for image-first catalogs such as jewelry, fashion, and accessories. 🚀 Features 🖼️ AI Image Analysis — Analyzes product images one by one for higher accuracy and lower risk 🧠 Automatic Category Detection — Identifies the main product category (e.g. Jewelry), easily customizable for any niche ✍️ AI Product Content Generation — Creates product names, descriptions (HTML), tags, and attributes 📄 Google Sheets Orchestration — Structures data and outputs a clean Shopify-compatible CSV 🛍️ Shopify Asset Upload — Uploads images to Shopify and retrieves CDN URLs 🧩 Workflow Preparation Before running the workflow: Upload all product images to Google Drive Name images using the format: <SKU><ColorCode> Example: 12345GR Place all images inside a folder named: <Brand Name> Root folder name: pending. Example: Google_Drive/pending/Manish Collection/All Images Each image represents one product variant. ⚙️ How It Works The workflow follows a 6-step automation pipeline designed for reliability and scalability. Note: You may connect all these steps to make the pipeline fully automatic, or schedule it to run at a time that suits you. 🔄 Step-by-Step Process Step 1: Fetch Images from Google Drive Scans the pending/<brand_name> folder Fetches all images Extracts SKU and color code Stores references in Google Sheets Step 2: AI Image Analysis (One-by-One) Images are analyzed individually Slower than batch processing, but far more reliable Reduces hallucinations and incorrect attributes Ideal for production-grade Shopify automation. 
Step 3: Main Category Identification AI determines the primary product category (example: Jewelry) Prompts can be modified for any ecommerce niche Step 4: Conditional Product Content Generation Based on category: Product titles are generated Descriptions are written in Shopify-ready HTML Tags and attributes are created This replaces repetitive work typically handled via Shopify Flow or manual data entry. Step 5: Shopify Image Upload Images are uploaded to Shopify assets Shopify returns CDN URLs URLs are mapped back to product data Step 6: Shopify CSV Generation All enriched data is compiled into a new Google Sheet Output matches Shopify’s product import CSV format File is ready for bulk upload 🛠️ n8n Nodes Used Trigger Node (Manual / Schedule) Google Drive Node Google Sheets Node AI Agent Node (Image Analysis + Content) Switch Node (Category-based logic) Code Node (Formatting & CSV structure) Shopify Node / HTTP Node 🔐 Credentials Required Before running the workflow, configure the following credentials in n8n: Shopify Access Token — For asset uploads and API calls AI Provider API Key — For image analysis and content generation Google Drive OAuth — To access product images Google Sheets OAuth — To store and export data 👤 Ideal For This workflow is ideal for: Shopify store owners handling bulk product uploads Ecommerce teams managing image-heavy catalogs Agencies building scalable Shopify automation systems Anyone exploring how to automate Shopify product onboarding 💬 Extensibility This workflow is modular and easy to extend. 
You can add: Multi-language product descriptions Pricing and margin automation Shopify marketing automation triggers Shopify Flow integrations after product import Marketplace exports (Google Shopping, Meta, Amazon) 🔑 Keywords shopify ai shopify flow shopify marketing automation shopify automation ecommerce automation how to automate shopify 📌 Notes No AI fine-tuning required No fragile prompt chaining Designed for accuracy over speed Safe for production ecommerce workflows 📞 Support If you’re looking to customize or extend this workflow, feel free to reach out or fork the project. Happy automating 🚀
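The SKU/color-code naming convention from the Workflow Preparation section (`<SKU><ColorCode>`, e.g. `12345GR`) implies a small parsing step in the Code node. A minimal sketch, assuming numeric SKUs and alphabetic color codes (which matches the `12345GR` example but may need adjusting for other naming schemes):

```javascript
// Split a filename such as "12345GR.jpg" into SKU and color code.
// Assumes the SKU is all digits and the color code is all letters,
// per the "12345GR" example in the preparation notes.
function parseImageName(filename) {
  const base = filename.replace(/\.[^.]+$/, ''); // drop the extension
  const match = base.match(/^(\d+)([A-Za-z]+)$/);
  if (!match) return null; // filename does not follow the convention
  return { sku: match[1], colorCode: match[2] };
}

// Example shape of the row Step 1 would store in Google Sheets:
const parsed = parseImageName('12345GR.jpg');
// parsed → { sku: '12345', colorCode: 'GR' }
```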
by Jay Emp0
AI-Powered Chart Generation from Web Data This n8n workflow automates the process of: Scraping real-time data from the web using GPT-4o with browsing capability Converting markdown tables into Chart.js-compatible JSON Rendering the chart using QuickChart.io Uploading the resulting image directly to your WordPress media library 🚀 Use Case Ideal for content creators, analysts, or automation engineers who need to: Automate generation of visual reports Create marketing-ready charts from live data Streamline research-to-publish workflows 🧠 How It Works 1. Prompt Input Trigger the workflow manually or via another workflow with a prompt string, e.g.: Generate a graph of Apple's market share in the mobile phone market in Q1 2025 2. Web Search + Table Extraction The Message a model node uses GPT-4o with search to: Perform a real-time query Extract data into a markdown table Return the raw table + citation URLs 3. Chart Generation via AI Agent The Generate Chart AI Agent: Interprets the table Picks an appropriate chart type (bar, line, doughnut, etc.) Outputs valid Chart.js JSON using a strict schema 4. QuickChart API Integration The Create QuickChart node: Sends the Chart.js config to QuickChart.io Renders the chart into a PNG image 5. WordPress Image Upload The Upload image node: Uploads the PNG to your WordPress media library using REST API Uses proper headers for filename and content-type Returns the media GUID and full image URL 🧩 Nodes Used Manual Trigger or Execute Workflow Trigger OpenAI Chat Model (GPT-4o) LangChain Agent (Chart Generator) LangChain OutputParserStructured HTTP Request (QuickChart API + WordPress Upload) Code (Final result formatting) 🗂 Output Format The final Code node returns:

```
{
  "research": { ...raw markdown table + citations... },
  "graph_data": { ...Chart.js JSON... },
  "graph_image": { ...WordPress upload metadata... },
  "result_image_url": "https://your-wordpress.com/wp-content/uploads/...png"
}
```

⚙️ Requirements OpenAI credentials (GPT-4o or GPT-4o-mini) WordPress REST API credentials with media write access QuickChart.io (free tier works) n8n v1.25+ recommended 📌 Notes Chart style and format are determined dynamically based on your table structure and AI interpretation. Make sure your OpenAI and WordPress credentials are connected properly. Outputs are schema-validated to ensure reliable rendering. 🖼 Sample Output
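The QuickChart step boils down to serializing a Chart.js config into a request URL. A minimal sketch (the chart data here is invented for illustration; in the workflow the real config comes from the AI agent):

```javascript
// Build a QuickChart image URL from a Chart.js configuration.
// QuickChart renders GET /chart?c=<url-encoded Chart.js config> as a PNG.
function quickChartUrl(chartConfig) {
  const encoded = encodeURIComponent(JSON.stringify(chartConfig));
  return `https://quickchart.io/chart?c=${encoded}`;
}

// Hypothetical config, standing in for the agent's structured output:
const config = {
  type: 'bar',
  data: {
    labels: ['Apple', 'Samsung', 'Other'],
    datasets: [{ label: 'Market share %', data: [28, 23, 49] }],
  },
};
const url = quickChartUrl(config);
```

For very large configs, QuickChart also accepts the config in a POST body instead of the query string, which avoids URL-length limits.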
by Nijan
This workflow turns Slack into your content control hub and automates the full blog creation pipeline — from sourcing trending headlines, validating topics, drafting posts, and preparing content for your CMS. With one command in Slack, you can source news from RSS feeds, refine them with Gemini AI, generate high-quality blog posts, and get publish-ready output — all inside a single n8n workflow. ⸻ ⚙️ How It Works 1. Trigger in Slack Type start in a Slack channel to fetch trending headlines. Headlines are pulled from your configured RSS feeds. 2. Topic Generation (Gemini AI) Gemini rewrites RSS headlines into unique, non-duplicate topics. Slack displays these topics in a numbered list (e.g., reply with 2 to pick topic 2). 3. Content Validation When you reply with a number, Gemini validates and slightly rewrites the topic to ensure originality. Slack confirms the selected topic back to you. 4. Content Creation Gemini generates a LinkedIn/blog-style draft: Strong hook introduction 3–5 bullet insights A closing takeaway and CTA Optionally suggests asset ideas (e.g., image, infographic). 5. CMS-Ready Output Final draft is structured for publishing (markdown or plain text). You can expand this workflow to automatically send the output to your CMS (WordPress, Ghost, Notion, etc.). ⸻ 🛠 Setup Instructions Connect your Slack Bot to n8n. Configure your RSS Read nodes with feeds relevant to your niche. Add your Gemini API credentials in the AI node. Run the workflow: Type start in Slack → see trending topics. Reply with a number (e.g., gen 3) → get a generated blog draft in the same Slack thread. ⸻ 🎛 Customization Options • Change RSS sources to match your industry. • Adjust Gemini prompts for tone (educational, casual, professional). • Add moderation filters (skip sensitive or irrelevant topics). • Connect the final output step to your CMS, Notion, or Google Docs for publishing. ⸻ ✅ Why Use This Workflow? • One-stop flow: Sourcing → Validation → Writing → Publishing. 
• Hands-free control: Everything happens from Slack. • Flexible: Easily switch feeds, tone, or target CMS. • Scalable: Extend to newsletters, social posts, or knowledge bases.
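The "reply with a number" step implies a small parsing routine between Slack and Gemini. A sketch of that logic, assuming only the two reply formats shown in the description (`2` and `gen 3`) are valid:

```javascript
// Map a Slack reply like "2" or "gen 3" to a 1-based topic index.
function parseTopicChoice(text) {
  const match = text.trim().match(/^(?:gen\s+)?(\d+)$/i);
  return match ? parseInt(match[1], 10) : null;
}

// Hypothetical numbered list, standing in for Gemini's generated topics:
const topics = ['AI chips in 2025', 'Edge inference trends', 'Open-weight models'];
const choice = parseTopicChoice('gen 3');
const selected = choice && choice <= topics.length ? topics[choice - 1] : null;
// selected → 'Open-weight models'
```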
by Dr. Christoph Schorsch
Rename Workflow Nodes with AI for Clarity This workflow automates the tedious process of renaming nodes in your n8n workflows. Instead of manually editing each node, it uses an AI language model to analyze its function and assign a concise, descriptive new name. This ensures your workflows are clean, readable, and easy to maintain. Who's it for? This template is perfect for n8n developers and power users who build complex workflows. If you often find yourself struggling to understand the purpose of different nodes at a glance or spend too much time manually renaming them for documentation, this tool will save you significant time and effort. How it works / What it does The workflow operates in a simple, automated sequence: Configure Suffix: A "Set" node at the beginning allows you to easily define the suffix that will be appended to the new workflow's name (e.g., "- new node names"). Fetch Workflow: It then fetches the JSON data of a specified n8n workflow using its ID. AI-Powered Renaming: The workflow's JSON is sent to an AI model (like Google Gemini or Anthropic Claude), which has been prompted to act as an n8n expert. The AI analyzes the type and parameters of each node to understand its function. Generate New Names: Based on this analysis, the AI proposes new, meaningful names and returns them in a structured JSON format. Update and Recreate: A Code Node processes these suggestions, updates all node names, and correctly rebuilds the connections and expressions. Create & Activate New Workflow: Finally, it creates a new workflow with the updated name, deactivates the original to avoid confusion, and activates the new version.
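The Code node's rename step has to touch both the workflow's nodes array and its connections map, since n8n keys connections by node name. A simplified sketch of that logic (the AI's output format, an old-name to new-name map, is an assumption; rewriting `$('Node Name')` expressions is omitted for brevity):

```javascript
// Apply an oldName → newName map to an n8n workflow JSON object.
function renameNodes(workflow, nameMap) {
  const nodes = workflow.nodes.map(n => ({
    ...n,
    name: nameMap[n.name] || n.name,
  }));
  const connections = {};
  for (const [source, conn] of Object.entries(workflow.connections)) {
    const renamed = JSON.parse(JSON.stringify(conn)); // deep copy
    // Connection targets reference node names too.
    for (const outputs of Object.values(renamed)) {
      for (const branch of outputs) {
        for (const target of branch) {
          target.node = nameMap[target.node] || target.node;
        }
      }
    }
    connections[nameMap[source] || source] = renamed;
  }
  return { ...workflow, nodes, connections };
}
```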
by Wessel Bulte
Description This workflow is a practical, “dirty” solution for real-world scenarios where frontline workers keep using Excel in their daily processes. Instead of forcing change, we take their spreadsheets as-is, clean and normalize the data, generate embeddings, and store everything in Supabase. The benefit: frontline staff continue with their familiar tools, while data analysts gain clean, structured, and vectorized data ready for analysis or RAG-style AI applications. How it works Frontline workers continue with Excel – no disruption to their daily routines. Upload & trigger – The workflow runs when a new Excel sheet is ready. Read Excel rows – Data is pulled from the specified workbook and worksheet. Clean & normalize – HTML is stripped, Excel dates are fixed, and text fields are standardized. Batch & switch – Rows are split and routed into Question/Answer processing paths. Generate embeddings – Cleaned Questions and Answers are converted into vectors via OpenAI. Merge enriched records – Original business data is combined with embeddings. Write into Supabase – Data lands in a structured table (excel_records) with vector and FTS indexes. Why it’s “dirty but useful” No disruption – frontline workers don’t need to change how they work. Analyst-ready data – Supabase holds clean, queryable data for dashboards, reporting, or AI pipelines. Bridge between old and new – Excel remains the input, but the backend becomes modern and scalable. Incremental modernization – paves the way for future workflow upgrades without blocking current work. Outcome Frontline workers keep their Excel-based workflows, while data can immediately be structured, searchable, and vectorized in Supabase — enabling AI-powered search, reporting, and retrieval-augmented generation. Required setup Supabase account Create a project and enable the pgvector extension. OpenAI API Key Required for generating embeddings (text-embedding-3-small). 
Microsoft Excel credentials Needed to connect to your workbook and worksheet. Need Help 🔗 LinkedIn – Wessel Bulte
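The clean-and-normalize step (strip HTML, fix Excel dates, standardize text) can be sketched as a pair of Code-node helpers. This is a minimal version; entity decoding and locale handling are left out:

```javascript
// Strip HTML tags and collapse whitespace in a text field.
function cleanText(value) {
  return String(value ?? '')
    .replace(/<[^>]*>/g, ' ') // drop HTML tags
    .replace(/\s+/g, ' ')
    .trim();
}

// Excel stores dates as serial day counts since 1899-12-30;
// convert a serial number to an ISO date string.
function excelDateToIso(serial) {
  const ms = (serial - 25569) * 86400 * 1000; // 25569 days = 1970-01-01
  return new Date(ms).toISOString().slice(0, 10);
}
```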
by Roshan Ramani
Product Video Creator with Nano Banana & Veo 3.1 via Telegram Who's it for This workflow is perfect for: E-commerce sellers needing quick product videos Social media marketers creating content at scale Small business owners without video editing skills Product photographers enhancing their offerings Anyone selling on Instagram, TikTok, or mobile-first platforms What it does Transform basic product photos into professional marketing videos in under 2 minutes: Send a product photo to your Telegram bot Nano Banana analyzes and enhances your image with studio-quality lighting Veo 3.1 generates an 8-second vertical video with motion and audio Receive your scroll-stopping marketing video automatically Perfect for creating engaging vertical content without expensive tools or editing expertise. How it works Input → User sends product photo via Telegram with optional caption AI Analysis → Nano Banana analyzes product and generates detailed enhancement prompt Image Enhancement → Nano Banana creates commercial-grade photo (9:16, studio lighting) Video Generation → Veo 3.1 creates 8-second 1080p video with motion and audio Delivery → Auto-polls status every 30s, delivers final video to Telegram Requirements Google Cloud Platform Vertex AI API enabled for Veo 3.1 Generative Language API enabled for Nano Banana OAuth2 credentials Get credentials from Google Cloud Console Telegram Bot token from @BotFather n8n Self-hosted or cloud instance Setup Import workflow JSON into n8n Add credentials: Telegram API (bot token) Google OAuth2 API (client id and secret) Google PaLM API (API key) Update your Project ID in both Veo 3.1 nodes Activate workflow and test with a product photo How to customize Aspect Ratio: Choose 9:16 (vertical), 16:9 (horizontal) in "Generate Enhanced Image" and "Initiate veo 3.1" nodes Duration: Set 2 to 8 seconds by adjusting durationSeconds in "Initiate veo 3.1 Video Generation" Quality: Select 720p or 1080p by changing resolution in "Initiate veo 3.1 Video 
Generation" Audio: Enable or disable background music by toggling generateAudio in "Initiate veo 3.1 Video Generation" Enhancement Style: Match your brand aesthetic by editing the prompt in "AI Design Analysis" node Polling Time: Adjust retry interval by changing wait time in "Processing Delay (30s)" node Key Features 🔐 Direct Google APIs – No third-party services. Uses Nano Banana and Veo 3.1 directly via Google Cloud for maximum reliability and privacy ⚡ Fully Automated – Send photo, receive video. Zero manual work required 🎨 Studio Quality – Nano Banana delivers professional lighting, composition, and AI-powered color grading 📱 Mobile-First – Default 9:16 vertical format optimized for Instagram Reels, TikTok, and Stories 🔄 Smart Retry Logic – Automatically polls Veo 3.1 status every 30 seconds until video generation completes 🎵 Audio Included – Veo 3.1 generates background music automatically (can be disabled)
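The 30-second polling loop is the part of this workflow most worth understanding. In n8n it is built from an IF node plus a Wait node, but the decision logic amounts to the following sketch. The `done`/`error`/`response` fields follow the general long-running-operation pattern of Google APIs; treat the exact field names as assumptions to verify against the real Veo 3.1 status response:

```javascript
// Decide what to do with a long-running video job's status payload.
// `operation` stands in for the JSON returned by the status endpoint.
function nextAction(operation) {
  if (operation.done !== true) {
    return { action: 'wait', seconds: 30 }; // loop back through the Wait node
  }
  if (operation.error) {
    return { action: 'fail', reason: operation.error.message };
  }
  return { action: 'deliver', video: operation.response };
}
```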
by Deniz
Structured Setup Guide: Narrative Chaining with N8N + AI 1. Input Setup Use a Google Sheet as the control panel. Fields required: Video URL (starting clip, ends with .mp4) Number of clips to extend (e.g., 2 extra scenes) Aspect ratio (horizontal, vertical, etc.) Model (V3 or V3 Fast) Narrative theme (guidance for story flow) Special requests (scene-by-scene instructions) Status column (e.g., "For Production", "Done") 👉 Example scene inputs: Scene 1: Naruto walks out with ramen in his hands Scene 2: Joker joins with chips 2. Workflow in N8N Step 1: Fetch Input Get rows in sheet → fetch the next row where status = For Production. Clear sheet 2 → reset the sheet that stores generated scenes. Edit fields (Initial Values): Video URL = starting clip Step = 1 Complete = total number of scenes requested Step 2: Looping Logic Looper Node: Runs until step = complete. Carries over current video URL → feeds into next generation. Step 3: Analyze Current Clip Send video URL to File.AI Video Understanding API. Request: Describe last frame + audio + scene details. Output: Detailed video analysis text. Step 4: Generate Prompt AI Agent creates the next scene prompt using: Context from video analysis Narrative theme (from sheet) Scene instructions (from sheet) Aspect ratio, model preference, etc. 👉 Output = video prompt for next scene Step 5: Extract Last Frame Call File.AI Extract Frame API. Parameters: Input video URL Frame = last Output = JPG image (last frame of current clip). Step 6: Generate New Scene Use Key.AI (V3 Fast) for economical video generation. POST request includes: Prompt (from AI Agent) Aspect ratio + model Image URL (last frame) → ensures seamless chaining Wait for generation to complete. 👉 Output = New clip URL (MP4) Step 7: Store & Increment Log new clip URL into Sheet 2. Increment Step by +1. Replace Video URL with the new clip. Loop back if Step < Complete. 3. Output Section Once all clips are generated: Gather all scene URLs from Sheet 2. 
Use File.AI Merge Videos API to stitch clips together: Original clip + all generated scenes. Save final MP4 output. Update Sheet 1 row with: Final video URL Status = Done 4. Costs Video analysis: ~$0.015 per 8s clip Frame extraction: ~0.002¢ (almost free) Clip merging: negligible (via ffmpeg backend) V3 Fast video generation (Key.AI): ~$0.30 per 8s clip
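Using the per-clip figures above, a rough cost estimate per finished video (treating frame extraction and merging as negligible, as the description does, and assuming each generated scene needs one analysis call plus one generation call):

```javascript
// Rough cost model from the listed per-clip prices.
const COST_PER_CLIP = {
  analysis: 0.015,  // video understanding, per 8s clip
  generation: 0.30, // V3 Fast generation, per 8s clip
};

function estimateCost(extraClips) {
  const perClip = COST_PER_CLIP.analysis + COST_PER_CLIP.generation;
  return +(extraClips * perClip).toFixed(3); // round to tenths of a cent
}
// estimateCost(2) → 0.63 (two extra scenes ≈ $0.63)
```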
by Jimleuk
Cohere's new multimodal model releases make building your own Vision RAG agents a breeze. If you're new to Multimodal RAG: for the purposes of this template, it means embedding and retrieving only the document scans relevant to a query, then having a vision model read those scans to answer. The benefits are (1) the vision model doesn't need to keep all document scans in context (expensive), and (2) the ability to query graphical content such as charts, graphs, and tables. How it works Page extracts from a technology report containing graphs and charts are downloaded, converted to base64 and embedded using Cohere's Embed v4 model. This produces embedding vectors which we will associate with the original page url and store them in our Qdrant vector store collection using the Qdrant community node. Our Vision RAG agent is split into 2 parts; one regular AI agent for chat and a second Q&A agent powered by Cohere's Command-A-vision model which is required to read contents of images. When a query requires access to the technology report, the Q&A agent branch is activated. This branch performs a vector search on our image embeddings and returns a list of matching image urls. These urls are then used as input for our vision model along with the user's original query. The Q&A vision agent can then reply to the user using the "respond to chat" node. Because both agents share the same memory space, it appears as one continuous conversation to the user. How to use Ensure you have a Cohere account and sufficient credit to avoid rate limit or token usage restrictions. For embeddings, swap out the page extracts for your own. You may need to split and convert document pages to images if you want to use image embeddings. For chat, you may want to structure the agent(s) in another way which makes sense for your environment e.g. using MCP servers. Requirements Cohere account for Embeddings and LLM Qdrant for vector store
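The retrieval step ranks stored page embeddings against the query embedding. Qdrant does this internally, but the underlying similarity measure is worth seeing. A sketch with toy 3-dimensional vectors (real Embed v4 vectors are much longer, and the index entries mirror the vector-plus-page-url payloads described above):

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy index: embedding vector → original page url.
const index = [
  { url: 'page-01.png', vector: [0.9, 0.1, 0.0] },
  { url: 'page-02.png', vector: [0.1, 0.9, 0.2] },
];
const query = [0.8, 0.2, 0.1];
const best = index
  .map(e => ({ url: e.url, score: cosineSimilarity(query, e.vector) }))
  .sort((a, b) => b.score - a.score)[0];
// best.url → 'page-01.png'
```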
by Aryan Shinde
Effortlessly generate, review, and publish SEO-optimized blog posts to WordPress using AI and automation. How It Works AI Topic Generation: Gemini suggests trending blog topics matching your agency's services. Content Research: Tavily fetches recent relevant articles for each generated topic. Human Review: Choose the preferred article for publishing through a Telegram notification. AI Rewriting: Gemini rewrites the selected article into a polished, SEO-friendly post. Image Generation & Publishing: The workflow creates a featured image with Gemini or OpenAI, then publishes the post (with dynamic categories and images) to WordPress. Audit Trail: Every published post is logged to Google Sheets, and final details are sent to Telegram. Set Up Steps Estimated setup time: 15–30 minutes (excluding API approval/wait times). Connect your WordPress, Gemini (Google), Tavily, Google Sheets, and Telegram accounts. Configure your preferred posting schedule in the “Schedule Trigger.” Adjust prompts or messages to fit your agency’s niche or editorial voice if needed. Note: Detailed customizations and advanced configuration tips are included in the sticky notes within the workflow.
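The publishing step maps onto a single WordPress REST request (`POST /wp-json/wp/v2/posts`). A sketch of the payload the HTTP node would send; the category ID and featured-media ID are placeholders for values the workflow resolves dynamically:

```javascript
// Shape of a WordPress REST API post-creation payload.
// Authentication (application password or OAuth) is handled by the
// n8n credential, not shown here.
function buildPostPayload({ title, html, categoryId, mediaId }) {
  return {
    title,
    content: html,
    status: 'publish',
    categories: [categoryId],  // numeric term IDs
    featured_media: mediaId,   // media-library attachment ID
  };
}

const payload = buildPostPayload({
  title: 'Hypothetical SEO title',
  html: '<p>Rewritten article body...</p>',
  categoryId: 7,
  mediaId: 42,
});
```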
by vinci-king-01
Multi-Source RAG System with GPT-4 Turbo, News & Academic Papers Integration This workflow provides an enterprise-grade RAG (Retrieval-Augmented Generation) system that intelligently searches multiple sources and generates AI-powered responses using GPT-4 Turbo. How it works Key Steps Form Input - Collects user queries with customizable search scope, response style, and language preferences Intelligent Search - Routes queries to appropriate sources (web, academic papers, news, internal documents) Data Aggregation - Unifies and processes information from multiple sources with quality scoring AI Processing - Uses GPT-4 Turbo to generate context-aware, source-grounded responses Response Enhancement - Formats outputs in various styles (comprehensive, concise, technical, etc.) Multi-Channel Delivery - Delivers results via webhook, email, Slack, and optional PDF generation Data Sources & AI Models Search Sources Web Search: Google, Bing, DuckDuckGo integration Academic Papers: arXiv, PubMed, Google Scholar News Articles: News API, RSS feeds, real-time news Technical Documentation: GitHub, Stack Overflow, documentation sites Internal Knowledge: Google Drive, Confluence, Notion integration AI Models GPT-4 Turbo: Primary language model for response generation Embedding Models: For semantic search and similarity matching Custom Prompts: Specialized prompts for different response styles Set up steps Setup time: 15-20 minutes Configure API credentials - Set up OpenAI API, ScrapeGraphAI, Google Drive, and other service credentials Set up search sources - Configure academic databases, news APIs, and internal knowledge sources Connect analytics - Link Google Sheets for usage tracking and performance monitoring Configure notifications - Set up Slack channels and email templates for 
automated alerts Test the workflow - Run sample queries to verify all components are working correctly Keep detailed configuration notes in sticky notes inside your workflow
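The Data Aggregation step's "quality scoring" can be sketched as a weighted merge across sources. The weights and field names here are hypothetical; the actual scoring lives inside the workflow's Code node:

```javascript
// Merge results from several sources into one ranked list.
// Hypothetical per-source trust weights.
const SOURCE_WEIGHT = { academic: 1.0, news: 0.8, web: 0.6 };

function aggregate(results) {
  return results
    .map(r => ({
      ...r,
      score: (SOURCE_WEIGHT[r.source] ?? 0.5) * (r.relevance ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

const ranked = aggregate([
  { source: 'web', title: 'Blog post', relevance: 0.9 },
  { source: 'academic', title: 'arXiv paper', relevance: 0.7 },
]);
// ranked[0].title → 'arXiv paper'  (0.7 × 1.0 beats 0.9 × 0.6)
```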
by Davide
This workflow automates the creation of short videos from multiple image references (up to 7 images) and their upload to TikTok and YouTube. It uses the "Vidu Reference to Video" model, a video generation API, to transform a user-provided prompt and image set into a consistent, AI-generated video. The process is initiated via a user-friendly web form. Advantages ✅ Consistent Video Creation: Uses multiple reference images to maintain subject consistency across frames. ✅ Easy Input: Just a simple form with prompt + image URLs. ✅ Automation: No manual waiting—workflow checks status until video is ready. ✅ SEO Optimization: Automatically generates a catchy, optimized YouTube title using AI. ✅ Multi-Platform Publishing: Uploads directly to Google Drive, YouTube, and TikTok in one flow. ✅ Time Saving: Removes repetitive tasks of video generation, download, and manual uploading. ✅ Scalable: Can run periodically or on-demand, perfect for content creators and marketing teams. ✅ UGC & Social Media Ready: Designed for creating viral short videos optimized for platforms like TikTok and YouTube Shorts. How It Works Form Trigger: A user submits a web form with two key pieces of information: a text Prompt describing the desired video and a list of Reference images (URLs separated by commas or new lines). Data Processing: The workflow processes the submitted image URLs, converting them from a text string into a proper array format for the AI API. AI Video Generation: The processed data (prompt and image array) is sent to the Fal.ai VIDU API endpoint (reference-to-video) to start the video generation job. This node returns a request_id. Status Polling: The workflow enters a loop where it periodically checks the status of the generation job using the request_id. It waits for 60 seconds and then checks if the status is "COMPLETED". If not, it waits and checks again. 
Result Retrieval: Once the video is ready, the workflow fetches the URL of the generated video file. Title Generation: Simultaneously, the original user prompt is sent to an AI model (GPT-4o-mini via OpenRouter) to generate an optimized, engaging title for the social media post. Upload & Distribution: The video file is downloaded from the generated URL. A copy is saved to a specified Google Drive folder for storage. The video, along with the AI-generated title, is automatically uploaded to YouTube and TikTok via the Upload-Post.com API service. Set Up Steps This workflow requires configuration and API keys from three external services to function correctly. Step 1: Configure Fal.ai for Video Generation Create an account and obtain your API key. In the "Create Video" HTTP node, edit the "Header Auth" credentials. Set the following values: Name: Authorization Value: Key YOUR_FAL_API_KEY (replace YOUR_FAL_API_KEY with your actual key) Step 2: Configure Upload-Post.com for Social Media Uploads Get an API key from your Upload-Post Manage Api Keys dashboard (10 free uploads per month). In both the "HTTP Request" (YouTube) and "Upload on TikTok" nodes, edit their "Header Auth" credentials. Set the following values: Name: Authorization Value: Apikey YOUR_UPLOAD_POST_API_KEY (replace YOUR_UPLOAD_POST_API_KEY with your actual key) Crucial: In the body parameters of both upload nodes, find the user field and replace YOUR_USERNAME with the exact name of the social media profile you configured on Upload-Post.com (e.g., my_youtube_channel). Step 3: Configure Google Drive (Optional Storage) The "Upload Video" node is pre-configured to save the video to a Google Drive folder named "Fal.run". Ensure your Google Drive credentials in n8n are valid and that you have access to this folder, or change the folderId parameter to your desired destination. Step 4: Configure AI for Title Generation The "Generate title" node uses OpenAI to access the gpt-5-mini model. Need help customizing? 
Contact me for consulting and support or add me on Linkedin.
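The Data Processing step (turning the form's comma- or newline-separated URL string into the array the VIDU API expects) can be sketched as a one-liner in a Code node:

```javascript
// Split the submitted reference-image string into a clean URL array.
function parseImageUrls(raw) {
  return raw
    .split(/[\n,]+/)          // commas or new lines, per the form hint
    .map(u => u.trim())
    .filter(u => u.length > 0)
    .slice(0, 7);             // the model accepts at most 7 reference images
}

const urls = parseImageUrls('https://a.com/1.png, https://a.com/2.png\nhttps://a.com/3.png');
// urls.length → 3
```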
by Mauricio Perera
📁 Analyze uploaded images, videos, audio, and documents with specialized tools — powered by a lightweight language-only agent. 🧭 What It Does This workflow enables multimodal file analysis using Google Gemini tools connected to a text-only LLM agent. Users can upload images, videos, audio files, or documents via a chat interface. The workflow will: Upload each file to Google Gemini and obtain an accessible URL. Dynamically generate contextual prompts based on the file(s) and user message. Allow the agent to invoke Gemini tools for specific media types as needed. Return a concise, helpful response based on the analysis. 🚀 Use Cases Customer support: Let users upload screenshots, documents, or recordings and get helpful insights or summaries. Multimedia QA: Review visual, audio, or video content for correctness or compliance. Educational agents: Interpret content from PDFs, diagrams, or audio recordings on the fly. Low-cost multimodal assistants: Achieve multimodal functionality without relying on large vision-language models. 🎯 Why This Architecture Matters Unlike end-to-end multimodal LLMs (like Gemini 1.5 or GPT-4o), this template: Uses a text-only LLM (Qwen 32B via Groq) for reasoning. Delegates media analysis to specialized Gemini tools. ✅ Advantages

| Feature | Benefit |
| --- | --- |
| 🧩 Modular | LLM + Tools are decoupled; can update them independently |
| 💸 Cost-Efficient | No need to pay for full multimodal models; only use tools when needed |
| 🔧 Tool-based Reasoning | Agent invokes tools on demand, just like OpenAI’s Toolformer setup |
| ⚡ Fast | Groq LLMs offer ultra-fast responses with low latency |
| 📚 Memory | Includes context buffer for multi-turn chats (15 messages) |

🧪 How It Works 🔹 Input via Chat Users submit a message and (optionally) files via the chatTrigger. 🔹 File Handling If no files: prompt is passed directly to the agent. 
If files are included: Files are split, uploaded to Gemini (to get public URLs). Metadata (name, type, URL) is collected and embedded into the prompt. 🔹 Prompt Construction A new chatInput is dynamically generated: User message Media: [array of file data] 🔹 Agent Reasoning The Langchain Agent receives: The enriched prompt File URLs Memory context (15 turns) Access to 4 Gemini tools: IMG: analyze image VIDEO: analyze video AUDIO: analyze audio DOCUMENT: analyze document The agent autonomously decides whether and how to use tools, then responds with concise output. 🧱 Nodes & Services

| Category | Node / Tool | Purpose |
| --- | --- | --- |
| Chat Input | chatTrigger | User interface with file support |
| File Processing | splitOut, splitInBatches | Process each uploaded file |
| Upload | googleGemini | Uploads each file to Gemini, gets URL |
| Metadata | set, aggregate | Builds structured file info |
| AI Agent | Langchain Agent | Receives context + file data |
| Tools | googleGeminiTool | Analyze media with Gemini |
| LLM | lmChatGroq (Qwen 32B) | Text reasoning, high-speed |
| Memory | memoryBufferWindow | Maintains session context |

⚙️ Setup Instructions 1. 🔑 Required Credentials Groq API key (for Qwen 32B model) Google Gemini API key (Palm / Gemini 1.5 tools) 2. 🧩 Nodes That Need Setup Replace existing credentials on: Upload a file Each GeminiTool (IMG, VIDEO, AUDIO, DOCUMENT) lmChatGroq 3. ⚠️ File Size & Format Considerations Some Gemini tools have file size or format restrictions. You may add validation nodes before uploading if needed. 🛠️ Optional Improvements Add logging and error handling (e.g., for upload failures). Add MIME-type filtering to choose the right tool explicitly. Extend to include OCR or transcription services pre-analysis. Integrate with Slack, Telegram, or WhatsApp for chat delivery. 🧪 Example Use Case > "Hola, ¿qué dice este PDF?" ("Hello, what does this PDF say?") 
Uploads a document → Agent routes it to Gemini DOCUMENT tool → Receives extracted content → LLM summarizes it in Spanish. 🧰 Tags multimodal, agent, langchain, groq, gemini, image analysis, audio analysis, document parsing, video analysis, file uploader, chat assistant, LLM tools, memory, AI tools 📂 Files This template is ready to use as-is in n8n. No external webhooks or integrations required.
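The Prompt Construction step above can be sketched as a Code-node helper. The metadata fields mirror the name/type/URL trio the workflow collects; the exact field names are assumptions:

```javascript
// Build the enriched chatInput the agent receives: the user's message
// plus a Media block describing each uploaded file.
function buildChatInput(userMessage, files) {
  if (!files || files.length === 0) return userMessage;
  const media = files.map(f => ({
    name: f.name,
    mimeType: f.mimeType,
    url: f.geminiUrl, // URL returned by the Gemini file upload
  }));
  return `${userMessage}\nMedia: ${JSON.stringify(media)}`;
}

const input = buildChatInput('What does this PDF say?', [
  { name: 'report.pdf', mimeType: 'application/pdf', geminiUrl: 'https://example.com/files/abc' },
]);
```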