Beginner AI Dataset Generator using OpenAI + LangChain in n8n
This n8n workflow dynamically generates a realistic sample dataset based on a single topic you provide. It uses OpenAI (via LangChain) and n8n’s built-in nodes to:
Generate structured JSON data for 5 columns with 3–5 values each
Flatten that data into a single text blob
Infer meaningful column names via a second AI call
Pivot, split, merge, and rename columns automatically
Output a clean, labeled dataset ready for export or further processing
⚙️ Prerequisites
OpenAI API Key
Visit: https://platform.openai.com/account/api-keys
Create a new key
In n8n: Credentials → New → OpenAI API, paste key, name it “OpenAi account”
LangChain nodes enabled in your n8n instance
🥇 Step 1: Set Up OpenAI Credential
Go to OpenAI API Keys
Create and copy your key
In n8n: Credentials → New → OpenAI API → paste key as “OpenAi account”
🥈 Step 2: Manual Trigger
Add a Manual Trigger node to start the workflow
🥉 Step 3: Set Topic
Add a Set node named Set Topic to Search
Field: Topic = n8n use cases (or any topic you choose)
✨ Step 4: Generate Structured Data
LangChain Agent node Generate Random Data
Connect to OpenAI Chat Model1 and Tool: Inject Creativity1
System prompt: instruct AI to output 5 columns of realistic values in JSON
🔧 Step 5: Parse AI Output
Structured Output Parser node to validate the JSON
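As a rough illustration of what "valid" could mean here, the sketch below checks a parsed object in plain JavaScript. It is not the Structured Output Parser's own logic. The key names column1 to column5, the function name validateDataset, and the sample values are all assumptions made for the example; the 5-column, 3–5-value limits come from the description at the top.

```javascript
// Hypothetical shape check: keys column1..column5, each an array of 3-5 values.
// This stands in for the parser node; it is not n8n's implementation.
function validateDataset(data, columnCount = 5, minValues = 3, maxValues = 5) {
  if (data === null || typeof data !== "object") {
    return ["output is not an object"];
  }
  const errors = [];
  for (let i = 1; i <= columnCount; i++) {
    const key = `column${i}`;
    const values = data[key];
    if (!Array.isArray(values)) {
      errors.push(`${key} is missing or not an array`);
    } else if (values.length < minValues || values.length > maxValues) {
      errors.push(`${key} has ${values.length} values, expected ${minValues}-${maxValues}`);
    }
  }
  return errors; // an empty array means the shape looks right
}

// Made-up sample for the topic "n8n use cases".
const sample = {
  column1: ["Lead routing", "Invoice sync", "Slack alerts"],
  column2: ["Sales", "Finance", "Operations"],
  column3: ["Webhook", "Schedule", "Manual"],
  column4: ["Low", "Medium", "High"],
  column5: ["Daily", "Hourly", "On demand"],
};

console.log(validateDataset(sample));
```

An empty result means every column is present and within the size limits; anything else lists what to fix in the prompt.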
🔄 Step 6: Flatten Data
Code node Outpt all Data to One Field
Joins all values into a comma-separated string for column naming
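A minimal sketch of the flattening logic, written as a standalone function so it can be tried outside n8n. The function name flattenToText and the input shape (an object whose values are arrays) are assumptions; inside the Code node the input would come from the incoming item rather than a literal.

```javascript
// Collect every value from every column and join them into one string.
// The input shape (object of arrays) is assumed for illustration.
function flattenToText(data) {
  return Object.values(data) // one array per column
    .flat()                  // merge the arrays into a single list
    .map(String)             // make sure every value is text
    .join(", ");
}

const columns = {
  column1: ["Lead routing", "Invoice sync"],
  column2: ["Sales", "Finance"],
};

console.log(flattenToText(columns));
// prints "Lead routing, Invoice sync, Sales, Finance"
```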
🧠 Step 7: Generate Column Names
LangChain Agent node Generate Column Names
Connect to OpenAI Chat Model2
Prompt: infer 5 column names from the string
🔢 Step 8: Pivot Names Row
Code node Pivot Column Names transforms the array into { column1: name1, … }
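The pivot can be sketched as a small function that turns a list of names into a single keyed row. The function name pivotColumnNames and the sample names are hypothetical; only the { column1: name1, … } target shape comes from the step above.

```javascript
// Turn ["Name A", "Name B", ...] into { column1: "Name A", column2: "Name B", ... }.
function pivotColumnNames(names) {
  const row = {};
  names.forEach((name, index) => {
    row[`column${index + 1}`] = name; // keys are 1-based
  });
  return row;
}

console.log(pivotColumnNames(["Use Case", "Department", "Trigger"]));
// prints { column1: 'Use Case', column2: 'Department', column3: 'Trigger' }
```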
🪓 Step 9: Split Columns
5 SplitOut nodes to break each array back into rows per column
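To show what splitting one column means, here is a plain JavaScript approximation. It is a sketch of the idea, not the SplitOut node's code, and the function name splitOut and the sample item are assumptions.

```javascript
// Turn one item holding an array into one item per array value.
function splitOut(item, field) {
  return item[field].map((value) => ({ [field]: value }));
}

const item = { column1: ["Lead routing", "Invoice sync", "Slack alerts"] };

console.log(splitOut(item, "column1"));
// prints three rows: { column1: 'Lead routing' }, { column1: 'Invoice sync' }, { column1: 'Slack alerts' }
```

Running this once per column gives five lists of single-field rows, which the next step lines up side by side.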
🔗 Step 10: Merge Rows
Merge node Merge Columns together using combineByPosition
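Combining by position means row 1 of every column list becomes row 1 of the result, and so on. The sketch below illustrates the idea with a hypothetical combineByPosition helper. It keeps rows that have no partner in a shorter list, which is an assumption of this example; how the Merge node treats unpaired items depends on its settings.

```javascript
// Merge several lists of single-field rows into one list of multi-field rows,
// pairing items that share the same index.
function combineByPosition(...lists) {
  const length = Math.max(0, ...lists.map((list) => list.length));
  const rows = [];
  for (let i = 0; i < length; i++) {
    // Missing positions contribute no fields (assumption of this sketch).
    rows.push(Object.assign({}, ...lists.map((list) => list[i] ?? {})));
  }
  return rows;
}

const useCases = [{ column1: "Lead routing" }, { column1: "Invoice sync" }];
const departments = [{ column2: "Sales" }, { column2: "Finance" }];

console.log(combineByPosition(useCases, departments));
// prints two rows: { column1: 'Lead routing', column2: 'Sales' } and { column1: 'Invoice sync', column2: 'Finance' }
```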
🏷️ Step 11: Rename Columns
Set node Rename Columns assigns the AI-generated names to each column
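The renaming itself amounts to swapping each placeholder key for its generated name. A minimal sketch, assuming a name map keyed column1, column2, and so on; the function name renameColumns and the sample values are hypothetical, and the Set node does this through its field settings rather than code.

```javascript
// Replace placeholder keys with the names from nameMap.
// Keys without an entry in nameMap are left unchanged.
function renameColumns(row, nameMap) {
  const renamed = {};
  for (const [key, value] of Object.entries(row)) {
    renamed[nameMap[key] ?? key] = value;
  }
  return renamed;
}

const nameMap = { column1: "Use Case", column2: "Department" };
const row = { column1: "Lead routing", column2: "Sales" };

console.log(renameColumns(row, nameMap));
// prints { 'Use Case': 'Lead routing', Department: 'Sales' }
```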
🔗 Step 12: Final Output
Merge node Append Column Names combines the data and the header row
🏁 Done! You now have a fully AI-driven, labeled dataset generated from a single topic, with no external services needed beyond OpenAI. To extend the workflow, add a Google Sheets or HTTP node to export the result.
📬 Need Help or Want to Customize This?
📧 robert@ynteractive.com
🔗 LinkedIn