Nano Banana/Gemini 2.5 Telegram Bot with Multi-modal Functionality
How it works Multi-modal AI Image Generator powered by Google's Nano Banana (Gemini 2.5 Flash Image) - the latest state-of-the-art image generation model Accepts text, images, voice messages, and PDFs via Telegram for maximum flexibility Uses OpenAI GPT models for conversation and image analysis, then Nano Banana for stunning image generation Features conversation memory for iterative image modifications ("make it darker", "change to blue") Processes different input types: analyzes uploaded images, transcribes voice messages, extracts PDF text All inputs are converted to optimized prompts specifically tuned for Nano Banana's capabilities
Set up steps Create Telegram bot via @BotFather and get API token Set up Google Gemini API key from Google AI Studio for Nano Banana image generation (~$0.04/image) Configure OpenAI API key for GPT models (conversation, image analysis, voice transcription) Import workflow and configure all three API credentials in n8n Update bot tokens in HTTP request nodes for file downloads Test with text prompts, image uploads, voice messages, and PDF documents
Related Templates
Extract Named Entities from Web Pages with Google Natural Language API
Who is this for? Content strategists analyzing web page semantic content SEO professionals conducting entity-based anal...
Add product ideas to Notion via a Slack command
Use Case In most companies, employees have a lot of great ideas. That was the same for us at n8n. We wanted to make it a...
Automate Daily Keyword Research with Google Sheets, Suggest API & Custom Search
Who's it for This workflow is perfect for SEO specialists, marketers, bloggers, and content creators who want to automa...
🔒 Please log in to import templates to n8n and favorite templates
Workflow Visualization
Loading...
Preparing workflow renderer
Comments (0)
Login to post comments