Narrating over a Video using Multimodal AI

Author: Jimleuk

This n8n template takes a video, extracts frames from it, and passes those frames to a multimodal LLM to generate a narration script. The script is then sent to OpenAI to generate a voiceover clip.

This template was inspired by "Processing and narrating a video with GPT's visual capabilities and the TTS API".

How it works

Video is downloaded using the HTTP node.
A Python code node extracts the frames using OpenCV.
A Loop node batches the frames so the LLM can generate partial scripts.
All partial scripts are combined to form the full script, which is then sent to OpenAI to generate audio.
The finished voiceover clip is uploaded to Google Drive.
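
The frame extraction, batching, and script-combining steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the template's actual code: the function names (`sample_indices`, `extract_frames`, `batch`, `combine_scripts`), the cap of 90 frames, and the idea of sampling evenly spaced frames are all illustrative choices. `extract_frames` needs the `opencv-python` package; the other helpers are plain Python.

```python
import base64
from typing import Iterator, List, Sequence


def sample_indices(total_frames: int, max_frames: int) -> List[int]:
    """Pick up to max_frames evenly spaced frame indices (assumed strategy)."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    if total_frames <= max_frames:
        return list(range(total_frames))
    step = total_frames / max_frames
    return [int(i * step) for i in range(max_frames)]


def extract_frames(path: str, max_frames: int = 90) -> List[str]:
    """Return sampled frames as base64-encoded JPEG strings."""
    import cv2  # imported lazily so the pure helpers work without OpenCV

    video = cv2.VideoCapture(path)
    if not video.isOpened():
        raise IOError(f"could not open video: {path}")
    frames: List[str] = []
    try:
        total = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
        for index in sample_indices(total, max_frames):
            video.set(cv2.CAP_PROP_POS_FRAMES, index)  # seek to the frame
            ok, frame = video.read()
            if not ok:
                break
            ok, buffer = cv2.imencode(".jpg", frame)
            if ok:
                frames.append(base64.b64encode(buffer).decode("utf-8"))
    finally:
        video.release()
    return frames


def batch(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def combine_scripts(partials: Sequence[str]) -> str:
    """Join the partial scripts into one script, skipping empty ones."""
    return " ".join(p.strip() for p in partials if p.strip())
```

In this sketch, each batch produced by `batch(frames, 15)` would be sent to the LLM in one request, and the returned partial scripts would be passed to `combine_scripts` before the text-to-speech step. The batch size of 15 is a placeholder, not a value taken from the template.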

Sample the finished product here: https://drive.google.com/file/d/1-XCoii0leGB2MffBMPpCZoxboVyeyeIX/view?usp=sharing

Requirements

OpenAI for LLM
Ideally, a mid-range (16GB RAM) machine for acceptable performance!

Customising this workflow

For larger videos, consider splitting them into smaller clips for better performance.
Use a multimodal LLM which fully supports video, such as Google's Gemini.
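
As a rough illustration of the splitting suggestion, the helper below computes start and end times for fixed-length clips. It is a hypothetical sketch: `clip_ranges` and the 60-second clip length are assumptions, and the actual cutting of the video file is left to whatever tool you prefer.

```python
from typing import List, Tuple


def clip_ranges(duration_s: float, clip_s: float) -> List[Tuple[float, float]]:
    """Split [0, duration_s] into consecutive (start, end) ranges in seconds."""
    if clip_s <= 0:
        raise ValueError("clip_s must be positive")
    ranges: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = min(start + clip_s, duration_s)  # last clip may be shorter
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    # A 150-second video cut into clips of at most 60 seconds.
    for start, end in clip_ranges(150, 60):
        print(f"{start:.0f}s to {end:.0f}s")
```

Each range could then be processed as its own clip, so the number of frames held in memory at once stays bounded.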


Complexity: intermediate

Category: Content Management

Created: 8/14/2025

Updated: 11/17/2025
