Beginner AI Dataset Generator using OpenAI + LangChain in n8n

This n8n workflow dynamically generates a realistic sample dataset based on a single topic you provide. It uses OpenAI (via LangChain) and n8n’s built-in nodes to:

Generate structured JSON data for 5 columns with 3–5 values each
Flatten that data into a single text blob
Infer meaningful column names via a second AI call
Pivot, split, merge, and rename columns automatically
Output a clean, labeled dataset ready for export or further processing

⚙️ Prerequisites

OpenAI API Key
Visit: https://platform.openai.com/account/api-keys
Create a new key
In n8n: Credentials → New → OpenAI API, paste key, name it “OpenAi account”

LangChain nodes enabled in your n8n instance

🥇 Step 1: Set Up OpenAI Credential
Go to OpenAI API Keys (https://platform.openai.com/account/api-keys)
Create and copy your key
In n8n: Credentials → New → OpenAI API → paste the key and name the credential “OpenAi account”

🥈 Step 2: Manual Trigger
Add a Manual Trigger node to start the workflow

🥉 Step 3: Set Topic
Add a Set node named Set Topic to Search
Field: Topic = n8n use cases (or any topic you choose)

✨ Step 4: Generate Structured Data
Add a LangChain Agent node named Generate Random Data
Connect it to OpenAI Chat Model1 and Tool: Inject Creativity1
System prompt: instruct the AI to output 5 columns of realistic values in JSON

🔧 Step 5: Parse AI Output
Add a Structured Output Parser to validate the JSON
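
To make this validation step concrete, here is a minimal JavaScript sketch of the kind of shape check involved. The key names column1 to column5, the function validateDataset, and the sample values are assumptions for illustration only; the Structured Output Parser node validates against whatever schema or example you configure in it.

```javascript
// Hypothetical shape check, not the parser node's actual implementation.
// Assumes the AI returns { column1: [...], ..., column5: [...] }.
function validateDataset(data, columnCount = 5, minValues = 3, maxValues = 5) {
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return ["output is not a JSON object"];
  }
  const errors = [];
  for (let i = 1; i <= columnCount; i++) {
    const key = `column${i}`;
    const values = data[key];
    if (!Array.isArray(values)) {
      errors.push(`${key} is missing or not an array`);
    } else if (values.length < minValues || values.length > maxValues) {
      errors.push(`${key} has ${values.length} values, expected ${minValues} to ${maxValues}`);
    }
  }
  return errors;
}

// Made-up sample output for the topic "n8n use cases".
const sampleOutput = {
  column1: ["Lead routing", "Invoice sync", "Slack alerts"],
  column2: ["Sales", "Finance", "Operations"],
  column3: [12, 8, 20],
  column4: ["Daily", "Weekly", "Hourly"],
  column5: ["Active", "Paused", "Active"],
};

// An empty error list means the sample has the expected shape.
console.log(validateDataset(sampleOutput));
```

An empty array means the shape is as expected; any messages in the array describe which column failed and why.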

🔄 Step 6: Flatten Data
Add a Code node named Outpt all Data to One Field
It joins all values into a comma-separated string for column naming
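
The flatten step can be sketched as a plain JavaScript function. This is not the workflow's actual Code node source; the function name flattenToString and the column1, column2 input keys are assumptions. Inside an n8n Code node you would typically read the input from $input and return an array of { json: ... } items instead of logging.

```javascript
// Sketch of the flatten step, assuming input shaped like
// { column1: [...], column2: [...], ... } (hypothetical key names).
function flattenToString(data) {
  return Object.values(data) // one array per column
    .flat()                  // merge the arrays into a single list of values
    .map(String)             // numbers and booleans become text
    .join(", ");             // one comma-separated string
}

// Made-up sample with two columns for brevity.
const generated = {
  column1: ["Lead routing", "Invoice sync"],
  column2: ["Sales", "Finance"],
};

console.log(flattenToString(generated));
// prints "Lead routing, Invoice sync, Sales, Finance"
```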

🧠 Step 7: Generate Column Names
Add a LangChain Agent node named Generate Column Names
Connect it to OpenAI Chat Model2
Prompt: infer 5 column names from the flattened string

🔢 Step 8: Pivot Names Row
Add a Code node named Pivot Column Names
It transforms the array of names into a single object: { column1: name1, … }
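
As a rough illustration of this pivot, the sketch below turns an array of names into the { column1: name1, … } object described above. The function name pivotColumnNames and the sample names are assumptions, not taken from the workflow.

```javascript
// Sketch of the pivot: position i in the array becomes key "column<i+1>".
function pivotColumnNames(names) {
  return Object.fromEntries(
    names.map((name, index) => [`column${index + 1}`, name])
  );
}

// Made-up column names, as the second AI call might return them.
const inferredNames = ["Use Case", "Department", "Runs per Week"];

console.log(pivotColumnNames(inferredNames));
// prints { column1: 'Use Case', column2: 'Department', column3: 'Runs per Week' }
```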

🪓 Step 9: Split Columns
Add 5 SplitOut nodes, one per column, to break each array back into rows

🔗 Step 10: Merge Rows
Add a Merge node named Merge Columns together, using combineByPosition
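
The split and merge steps together turn column arrays into rows. The sketch below imitates that idea in plain JavaScript; splitOut and combineByPosition here are hypothetical helper functions, not the n8n nodes themselves. One assumption to note: this sketch keeps leftover rows when columns have different lengths, while the real Merge node's handling of unequal inputs depends on its settings.

```javascript
// Hypothetical stand-in for one SplitOut node: one row per array value.
function splitOut(data, key) {
  return data[key].map((value) => ({ [key]: value }));
}

// Hypothetical stand-in for merging by position: row i of every list
// is combined into a single object.
function combineByPosition(...lists) {
  const length = Math.max(...lists.map((list) => list.length));
  return Array.from({ length }, (_, i) =>
    Object.assign({}, ...lists.map((list) => list[i] ?? {}))
  );
}

// Made-up data with two columns for brevity.
const columns = {
  column1: ["Lead routing", "Invoice sync", "Slack alerts"],
  column2: ["Sales", "Finance", "Operations"],
};

const rows = combineByPosition(
  splitOut(columns, "column1"),
  splitOut(columns, "column2")
);

console.log(rows);
// first row: { column1: 'Lead routing', column2: 'Sales' }
```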

🏷️ Step 11: Rename Columns
Add a Set node named Rename Columns
It assigns the AI-generated names to each column
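
The renaming can be pictured as a key lookup applied to every row. The following is a minimal sketch, assuming rows keyed by column1, column2, and so on, and a name map in the { column1: name1, … } form; the function renameColumns and the sample data are illustrative and not part of the workflow.

```javascript
// Sketch of the rename: swap each generic key for its AI-generated name.
// Keys without an entry in nameMap are left unchanged.
function renameColumns(rows, nameMap) {
  return rows.map((row) =>
    Object.fromEntries(
      Object.entries(row).map(([key, value]) => [nameMap[key] ?? key, value])
    )
  );
}

// Made-up rows and names for illustration.
const dataRows = [
  { column1: "Lead routing", column2: "Sales" },
  { column1: "Invoice sync", column2: "Finance" },
];
const columnNames = { column1: "Use Case", column2: "Department" };

console.log(renameColumns(dataRows, columnNames));
// first row: { 'Use Case': 'Lead routing', Department: 'Sales' }
```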

🔗 Step 12: Final Output
Add a Merge node named Append Column Names
It combines the data and the header row

🏁 Done! You now have a labeled, AI-generated dataset built from a single topic, with no external services required beyond OpenAI. To export it, add a Google Sheets or HTTP Request node.

📬 Need Help or Want to Customize This?
📧 robert@ynteractive.com
🔗 LinkedIn

Author: Robert Breen
Created: 8/13/2025
Updated: 8/25/2025
