Generate production database schemas from Excel and CSV with OpenAI and LangChain

Overview

This workflow automatically converts CSV or Excel files into a production-ready database schema using AI and rule-based validation.

It analyzes the uploaded data, detects column types, relationships, and data quality issues, then generates a normalized schema. The output includes SQL DDL scripts, an ERD, a data dictionary, and a load plan.

This reduces manual schema design work and speeds up database setup from raw data.

How It Works

File Upload (Webhook)
Accepts CSV or XLSX files via a webhook endpoint
Initializes workflow configuration (thresholds, retry limits)

File Extraction
Detects the file format (CSV or Excel)
Extracts rows into structured JSON
Merges the extracted datasets
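
The extraction step could be sketched as follows. This is a minimal illustration, not the workflow's actual node code: the function names are hypothetical, format detection relies on the fact that XLSX files are ZIP archives, and only the CSV path is parsed because reading XLSX needs a third-party library.

```python
import csv
import io

# XLSX files are ZIP archives, so they begin with the ZIP local-file signature.
ZIP_MAGIC = b"PK\x03\x04"


def detect_format(payload: bytes) -> str:
    """Return "xlsx" for ZIP-based payloads, otherwise assume "csv"."""
    return "xlsx" if payload.startswith(ZIP_MAGIC) else "csv"


def extract_csv_rows(payload: bytes) -> list[dict[str, str]]:
    """Parse CSV bytes into a list of row dicts keyed by header name."""
    text = payload.decode("utf-8-sig")  # tolerate a leading byte-order mark
    return list(csv.DictReader(io.StringIO(text)))


sample = b"id,name\n1,Ada\n2,Grace\n"
rows = extract_csv_rows(sample) if detect_format(sample) == "csv" else []
print(rows)
```

Every value comes out as a string here; type detection is left to the profiling step.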

Data Cleaning & Profiling
Removes duplicates and normalizes values
Detects data types (integer, float, date, boolean, string)
Computes column statistics (nulls, uniqueness, distributions)
Generates a file hash and a sample dataset
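
A rough sketch of what this profiling might look like is shown below. The helper names, the set of accepted boolean words, the ISO-only date check, and the choice of SHA-256 for the file hash are all assumptions made for illustration; the workflow's own rules may differ.

```python
import hashlib
import re
from datetime import date

INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")
BOOL_WORDS = {"true", "false", "yes", "no"}  # assumed vocabulary


def _is_date(value: str) -> bool:
    """Accept ISO dates such as 2024-01-31 (an assumed format)."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def infer_type(values: list[str]) -> str:
    """Return the first type that every non-empty value satisfies."""
    present = [v.strip() for v in values if v.strip()]
    if not present:
        return "string"
    checks = [
        ("boolean", lambda v: v.lower() in BOOL_WORDS),
        ("integer", lambda v: INT_RE.fullmatch(v) is not None),
        ("float", lambda v: FLOAT_RE.fullmatch(v) is not None),
        ("date", _is_date),
    ]
    for name, check in checks:
        if all(check(v) for v in present):
            return name
    return "string"


def profile_column(values: list[str]) -> dict:
    """Compute the detected type, null count, and distinct count."""
    present = [v for v in values if v.strip()]
    return {
        "type": infer_type(values),
        "nulls": len(values) - len(present),
        "unique": len(set(present)),
    }


def dedupe(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Drop exact duplicate rows while keeping the original order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def file_hash(payload: bytes) -> str:
    """Fingerprint the uploaded bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(payload).hexdigest()
```

Checking the narrower types first means a column of whole numbers is reported as integer rather than float.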

Column Profiling Engine
Identifies potential primary keys
Detects cardinality and uniqueness levels
Suggests foreign key relationships based on value overlap
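
To make the key detection concrete, here is a small sketch under stated assumptions: a primary key candidate is taken to be a column with no empty and no repeated values, and overlap is measured as the share of distinct child values that also appear in the parent column. Both definitions and both function names are illustrative, not taken from the workflow.

```python
def primary_key_candidates(columns: dict[str, list[str]]) -> list[str]:
    """Return columns whose values are all non-empty and all distinct."""
    candidates = []
    for name, values in columns.items():
        if values and all(values) and len(set(values)) == len(values):
            candidates.append(name)
    return candidates


def overlap_ratio(child: list[str], parent: list[str]) -> float:
    """Fraction of distinct non-empty child values that exist in parent."""
    child_values = {v for v in child if v}
    if not child_values:
        return 0.0
    return len(child_values & set(parent)) / len(child_values)


customers = {"id": ["1", "2", "3"], "country": ["DE", "DE", "FR"]}
orders = {"order_id": ["10", "11", "12"], "customer_id": ["1", "2", "2"]}

print(primary_key_candidates(customers))
print(overlap_ratio(orders["customer_id"], customers["id"]))
```

A high ratio between `orders.customer_id` and `customers.id` is what would make the pair a foreign key suggestion.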

AI Schema Generation
Uses an AI agent to design normalized tables
Assigns SQL data types based on the real data
Defines primary keys, foreign keys, constraints, and indexes

Validation Layer
Ensures the schema matches the actual data
Validates data types, primary key uniqueness, foreign key overlap (>70%), and constraint consistency
Detects circular dependencies
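
Two of these checks, primary key uniqueness and the 70% foreign key overlap rule, might be implemented along the lines below. The schema and data shapes (`primary_key`, `foreign_keys`, `ref_table`, `ref_column`) are hypothetical and chosen only for this sketch; the 0.7 default mirrors the threshold named above.

```python
def validate_schema(schema: dict, data: dict, fk_threshold: float = 0.7) -> list[str]:
    """Return human-readable errors; an empty list means the checks passed."""
    errors = []
    for table, spec in schema.items():
        pk = spec["primary_key"]
        pk_values = data[table][pk]
        # A primary key must have no empty and no repeated values.
        if any(v == "" for v in pk_values) or len(set(pk_values)) != len(pk_values):
            errors.append(f"{table}.{pk}: primary key is not unique and non-empty")
        for fk in spec.get("foreign_keys", []):
            child = {v for v in data[table][fk["column"]] if v}
            parent = set(data[fk["ref_table"]][fk["ref_column"]])
            # An all-empty child column counts as zero overlap in this sketch.
            ratio = len(child & parent) / len(child) if child else 0.0
            if ratio <= fk_threshold:
                errors.append(
                    f"{table}.{fk['column']}: only {ratio:.0%} of values "
                    f"found in {fk['ref_table']}.{fk['ref_column']}"
                )
    return errors


schema = {
    "customers": {"primary_key": "id"},
    "orders": {
        "primary_key": "order_id",
        "foreign_keys": [
            {"column": "customer_id", "ref_table": "customers", "ref_column": "id"}
        ],
    },
}
data = {
    "customers": {"id": ["1", "2", "3"]},
    "orders": {"order_id": ["10", "11", "12", "13"], "customer_id": ["1", "2", "2", "9"]},
}
errors = validate_schema(schema, data)
print(errors)
```

Returning messages rather than raising lets the same list be passed back to the AI agent as revision feedback.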

Revision Loop
If validation fails:
Sends feedback to the AI agent
Regenerates the schema
Retries up to the configured limit
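
The control flow of such a loop can be sketched as below. The AI call is replaced by a stub, and `generate_with_revisions`, `fake_generate`, and `fake_validate` are hypothetical names. Whether the configured limit counts the first attempt is not stated in the description, so this sketch assumes it does.

```python
from typing import Callable


def generate_with_revisions(
    generate: Callable[[list[str]], dict],
    validate: Callable[[dict], list[str]],
    max_attempts: int = 3,
) -> tuple[dict, int]:
    """Regenerate until validation returns no errors or attempts run out."""
    feedback: list[str] = []
    for attempt in range(1, max_attempts + 1):
        schema = generate(feedback)   # feedback is empty on the first attempt
        feedback = validate(schema)   # errors become feedback for the next try
        if not feedback:
            return schema, attempt
    raise RuntimeError(f"schema still invalid after {max_attempts} attempts: {feedback}")


def fake_generate(feedback: list[str]) -> dict:
    """Stand-in for the AI agent: omits the key until it receives feedback."""
    return {"orders": {"primary_key": "order_id" if feedback else None}}


def fake_validate(schema: dict) -> list[str]:
    """Stand-in validator: flags tables without a primary key."""
    return [f"{t}: missing primary key" for t, s in schema.items() if not s["primary_key"]]


schema, attempts = generate_with_revisions(fake_generate, fake_validate)
print(schema, attempts)
```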

Schema Output Generation
Generates:
SQL DDL scripts
ERD (Mermaid format)
Data dictionary
Load plan with dependency graph
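
As an illustration of the first two outputs, the sketch below renders a schema dictionary into a CREATE TABLE statement and a Mermaid `erDiagram`. The schema shape and function names are assumptions for this example, identifiers are not quoted or escaped, and the relationship label "has" is a placeholder.

```python
def to_ddl(table: str, spec: dict) -> str:
    """Render one CREATE TABLE statement from a table spec."""
    lines = [f"  {col} {sql_type}" for col, sql_type in spec["columns"].items()]
    lines.append(f"  PRIMARY KEY ({spec['primary_key']})")
    for fk in spec.get("foreign_keys", []):
        lines.append(
            f"  FOREIGN KEY ({fk['column']}) "
            f"REFERENCES {fk['ref_table']} ({fk['ref_column']})"
        )
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


def to_mermaid(schema: dict) -> str:
    """Render one-to-many relationships as a Mermaid erDiagram."""
    out = ["erDiagram"]
    for table, spec in schema.items():
        for fk in spec.get("foreign_keys", []):
            # Doubled braces emit a literal "{" inside an f-string.
            out.append(f"    {fk['ref_table']} ||--o{{ {table} : has")
    return "\n".join(out)


schema = {
    "customers": {"columns": {"id": "INTEGER", "name": "VARCHAR(100)"}, "primary_key": "id"},
    "orders": {
        "columns": {"order_id": "INTEGER", "customer_id": "INTEGER"},
        "primary_key": "order_id",
        "foreign_keys": [
            {"column": "customer_id", "ref_table": "customers", "ref_column": "id"}
        ],
    },
}
ddl = "\n\n".join(to_ddl(name, spec) for name, spec in schema.items())
mermaid = to_mermaid(schema)
print(ddl)
print(mermaid)
```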

Load Plan Engine
Computes the optimal table insertion order
Detects circular dependencies
Suggests a batching strategy
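
Insertion order and cycle detection are a topological sort over foreign key dependencies. The sketch below uses Python's standard `graphlib` module (Python 3.9+). Treating each batch as a group of tables whose parents are already loaded is one possible reading of "batching strategy", and the function names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def load_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tables into batches; each batch depends only on earlier ones.

    `dependencies` maps a table to the set of tables it references.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def find_cycle(dependencies: dict[str, set[str]]):
    """Return the tables forming a cycle, or None if the graph is acyclic."""
    try:
        TopologicalSorter(dependencies).prepare()
        return None
    except CycleError as err:
        return err.args[1]  # cycle as a list; first and last node are equal


deps = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}
print(load_batches(deps))
print(find_cycle({"a": {"b"}, "b": {"a"}}))
```

Tables within one batch have no dependencies on each other, so they could be loaded in any order or in parallel.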

Combine & Explain
Merges all outputs
Optionally adds an AI explanation of schema decisions

Response Output
Returns structured JSON via webhook:
SQL schema
ERD summary
Data dictionary
Load plan
Optional explanation

Setup Instructions

Configure OpenAI credentials (used by the AI agent)
Adjust thresholds if needed (FK overlap, retries, confidence)
Activate the workflow and copy the webhook URL
Send a POST request with a CSV or XLSX file to the webhook URL
Review the generated outputs in the response
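
The upload request could be built with the Python standard library as sketched below. The URL is a placeholder, and the multipart field name `file` is an assumption; check the Webhook node's settings for the field it actually expects. The final call that sends the request is commented out so the snippet runs without a live endpoint.

```python
import urllib.request
import uuid


def build_upload_request(
    url: str, filename: str, payload: bytes, field: str = "file"
) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying a single file."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        payload,
        f"--{boundary}--".encode(),
        b"",
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder URL: replace with the webhook URL copied from the workflow.
request = build_upload_request(
    "https://n8n.example.com/webhook/your-path",
    "customers.csv",
    b"id,name\n1,Ada\n",
)
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```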

Use Cases

Auto-generate database schema from CSV/Excel files
Data migration and onboarding pipelines
Rapid database prototyping
Reverse engineering datasets
AI-assisted data modeling

Requirements

n8n (latest version recommended)
OpenAI API credentials
LangChain nodes enabled
CSV or XLSX input file

Complexity: beginner
Author: Rajeet Nair
Created: 3/29/2026
Updated: 3/29/2026
