Nano Banana/Gemini 2.5 Telegram Bot with Multi-modal Functionality

How it works
- Multi-modal AI image generator powered by Google's Nano Banana (Gemini 2.5 Flash Image) image generation model.
- Accepts text, images, voice messages, and PDFs via Telegram.
- Uses OpenAI GPT models for conversation and image analysis, then Nano Banana for image generation.
- Keeps conversation memory for iterative image modifications ("make it darker", "change to blue").
- Handles each input type differently: analyzes uploaded images, transcribes voice messages, and extracts PDF text.
- Converts every input into an optimized prompt tuned for Nano Banana's capabilities.
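The input handling described above can be sketched as a small routing function. This is only an illustration, not the workflow's actual node logic: the function name `classify_message` and the branch labels are hypothetical, while the `text`, `photo`, `voice`, and `document.mime_type` fields are the ones the Telegram Bot API uses on a message object.

```python
def classify_message(message: dict) -> str:
    """Return the processing branch for a Telegram message (hypothetical labels)."""
    if message.get("voice"):
        return "voice"  # would be transcribed before prompt building
    if message.get("photo"):
        return "image"  # would be analyzed by a vision-capable model
    document = message.get("document")
    if document:
        # Only PDFs are handled; other document types are rejected here.
        if document.get("mime_type") == "application/pdf":
            return "pdf"
        return "unsupported"
    if message.get("text"):
        return "text"
    return "unsupported"
```

Voice and photo are checked before text so that a media message is never mistaken for a plain prompt; a real workflow may order or split these branches differently.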

Set up steps
1. Create a Telegram bot via @BotFather and get its API token.
2. Get a Google Gemini API key from Google AI Studio for Nano Banana image generation (~$0.04/image).
3. Configure an OpenAI API key for the GPT models (conversation, image analysis, voice transcription).
4. Import the workflow and configure all three API credentials in n8n.
5. Update the bot token in the HTTP Request nodes used for file downloads.
6. Test with text prompts, image uploads, voice messages, and PDF documents.
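The file-download step needs the bot token because Telegram file URLs embed it. As a rough sketch of what the HTTP Request nodes are likely calling (the helper names and the token and file values in the usage note are made up for illustration), a download takes two requests: `getFile` to resolve a `file_id` into a `file_path`, then a GET on the file endpoint.

```python
from urllib.parse import quote, urlencode

TELEGRAM_API = "https://api.telegram.org"


def get_file_url(bot_token: str, file_id: str) -> str:
    """URL of the getFile call, which returns a JSON result containing file_path."""
    query = urlencode({"file_id": file_id})
    return f"{TELEGRAM_API}/bot{bot_token}/getFile?{query}"


def download_url(bot_token: str, file_path: str) -> str:
    """URL that serves the file bytes for a file_path returned by getFile."""
    return f"{TELEGRAM_API}/file/bot{bot_token}/{quote(file_path)}"
```

For example, `download_url("123:ABC", "photos/file_1.jpg")` builds `https://api.telegram.org/file/bot123:ABC/photos/file_1.jpg`. Since the token appears in the URL itself, any node with a hard-coded URL has to be updated when the bot changes.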

Complexity: intermediate
Created: 9/22/2025
Updated: 11/20/2025
