Generate production database schemas from Excel and CSV with OpenAI and LangChain
Overview
This workflow automatically converts CSV or Excel files into a production-ready database schema using AI and rule-based validation.
It analyzes the uploaded data, detects column types and relationships, assesses data quality, and then generates a normalized schema. The output includes SQL DDL scripts, an ERD in Mermaid format, a data dictionary, and a load plan.
This eliminates manual schema design and accelerates database setup from raw data.
How It Works
File Upload (Webhook)
  Accepts CSV or XLSX files via a webhook endpoint
  Initializes workflow configuration (thresholds, retry limits)
File Extraction
  Detects file format (CSV or Excel)
  Extracts rows into structured JSON
  Merges extracted datasets
Data Cleaning & Profiling
  Removes duplicates and normalizes values
  Detects data types (integer, float, date, boolean, string)
  Computes column statistics (nulls, uniqueness, distributions)
  Generates file hash and sample dataset
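The type detection step can be illustrated with a minimal sketch. It assumes every cell arrives as a string, that dates use the ISO `YYYY-MM-DD` format, and that booleans are spelled true/false/yes/no. The function names and these rules are hypothetical and are not taken from the workflow itself.

```python
from datetime import datetime

# Assumption: these spellings count as booleans; the real workflow may differ.
BOOLEAN_TOKENS = {"true", "false", "yes", "no"}


def infer_value_type(value: str) -> str:
    """Classify a single cell as boolean, integer, float, date, or string."""
    text = value.strip()
    if text.lower() in BOOLEAN_TOKENS:
        return "boolean"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        pass
    try:
        # Assumption: only ISO dates are recognized.
        datetime.strptime(text, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "string"


def infer_column_type(values) -> str:
    """Pick one type for a column, ignoring empty cells."""
    non_null = [v for v in values if v is not None and v.strip() != ""]
    if not non_null:
        return "string"
    kinds = {infer_value_type(v) for v in non_null}
    if len(kinds) == 1:
        return kinds.pop()
    if kinds == {"integer", "float"}:
        return "float"  # widen mixed numeric columns
    return "string"  # any other mix falls back to text
```

Falling back to string for mixed columns is a conservative choice: a wider type never rejects a row at load time, whereas a narrower one might.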
Column Profiling Engine
  Identifies potential primary keys
  Detects cardinality and uniqueness levels
  Suggests foreign key relationships based on value overlap
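One plausible way to compute these profiling signals is shown below. This is a sketch under stated assumptions: columns are plain Python lists, `None` and the empty string count as null, and all three function names are hypothetical.

```python
def uniqueness_ratio(values) -> float:
    """Distinct non-null values divided by non-null values (1.0 means all unique)."""
    non_null = [v for v in values if v not in (None, "")]
    if not non_null:
        return 0.0
    return len(set(non_null)) / len(non_null)


def is_primary_key_candidate(values) -> bool:
    """A column qualifies when it has no nulls and no repeated values."""
    if not values or any(v in (None, "") for v in values):
        return False
    return len(set(values)) == len(values)


def overlap_ratio(child_values, parent_values) -> float:
    """Share of distinct non-null child values that also appear in the parent column."""
    child = {v for v in child_values if v not in (None, "")}
    if not child:
        return 0.0
    return len(child & set(parent_values)) / len(child)


customer_ids = [1, 2, 3, 4]
order_customer_ids = [1, 2, 2, 3, 9]

print(is_primary_key_candidate(customer_ids))           # True
print(overlap_ratio(order_customer_ids, customer_ids))  # 0.75
```

In the example, three of the four distinct values in the child column exist in the parent column, so the pair would be a reasonable foreign key suggestion.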
AI Schema Generation
  Uses an AI agent to design normalized tables
  Assigns SQL data types based on real data
  Defines primary keys, foreign keys, constraints, and indexes
Validation Layer
  Ensures schema matches actual data
  Validates:
    Data types
    Primary key uniqueness
    Foreign key overlap (>70%)
    Constraint consistency
  Detects circular dependencies
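As a rough illustration of two of these checks, the sketch below validates primary key uniqueness and the 70% foreign key overlap rule against the actual data. The schema and data layouts, the single-column primary key, and the `validate_schema` name are assumptions made for this example, not the workflow's real structures.

```python
def validate_schema(schema, data, fk_overlap_threshold=0.70):
    """Return a list of human-readable issues; an empty list means the checks passed.

    schema: {table: {"primary_key": column, "foreign_keys": [{...}]}}  (hypothetical layout)
    data:   {table: {column: [values]}}                                (hypothetical layout)
    """
    issues = []
    for table, spec in schema.items():
        pk = spec.get("primary_key")
        if pk is not None:
            values = data[table][pk]
            if None in values or len(set(values)) != len(values):
                issues.append(f"{table}.{pk}: primary key has null or duplicate values")
        for fk in spec.get("foreign_keys", []):
            child = {v for v in data[table][fk["column"]] if v is not None}
            parent = set(data[fk["ref_table"]][fk["ref_column"]])
            # An all-null child column yields 0.0 and is therefore flagged.
            overlap = len(child & parent) / len(child) if child else 0.0
            if overlap <= fk_overlap_threshold:
                issues.append(
                    f"{table}.{fk['column']} -> {fk['ref_table']}.{fk['ref_column']}: "
                    f"overlap {overlap:.0%} is not above {fk_overlap_threshold:.0%}"
                )
    return issues
```

Returning messages rather than a boolean keeps the result usable as feedback text for a later regeneration attempt.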
Revision Loop
  If validation fails:
    Sends feedback to AI agent
    Regenerates schema
    Retries up to configured limit
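The control flow of such a loop might look like the following sketch. The `generate` and `validate` callables stand in for the AI agent and the validation layer, and `max_retries` stands in for the configured limit; all three names are hypothetical.

```python
def generate_with_revisions(generate, validate, max_retries=3):
    """Call generate(feedback) until validate(schema) reports no issues.

    One initial attempt is made, followed by at most max_retries regenerations.
    """
    feedback = []
    schema = {}
    for attempt in range(1 + max_retries):
        schema = generate(feedback)   # feedback is empty on the first attempt
        feedback = validate(schema)   # list of issue strings, empty when valid
        if not feedback:
            return {"schema": schema, "valid": True, "attempts": attempt + 1}
    # Limit reached: return the last schema together with its open issues.
    return {"schema": schema, "valid": False, "attempts": 1 + max_retries, "issues": feedback}
```

Returning the last schema with its unresolved issues, instead of raising, lets the caller decide whether a partially valid result is still worth reporting.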
Schema Output Generation
  Generates:
    SQL DDL scripts
    ERD (Mermaid format)
    Data dictionary
    Load plan with dependency graph
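To make two of these outputs concrete, here is a small sketch that renders a `CREATE TABLE` statement and a Mermaid `erDiagram` from a schema dictionary. The dictionary layout and both function names are assumptions for illustration. Identifiers are emitted unquoted, and every foreign key is drawn as a one-to-many relationship, which a real generator would need to refine.

```python
def render_create_table(name, table):
    """Render one CREATE TABLE statement (identifiers are not quoted or escaped)."""
    lines = [f"  {column} {sql_type}" for column, sql_type in table["columns"].items()]
    if table.get("primary_key"):
        lines.append(f"  PRIMARY KEY ({', '.join(table['primary_key'])})")
    for fk in table.get("foreign_keys", []):
        lines.append(
            f"  FOREIGN KEY ({fk['column']}) REFERENCES {fk['ref_table']} ({fk['ref_column']})"
        )
    return f"CREATE TABLE {name} (\n" + ",\n".join(lines) + "\n);"


def render_mermaid_erd(schema):
    """Render relationships only; tables without foreign key links are omitted."""
    lines = ["erDiagram"]
    for name, table in schema.items():
        for fk in table.get("foreign_keys", []):
            # Assumption: each foreign key is a one-to-many link from the parent table.
            lines.append(f'    {fk["ref_table"]} ||--o{{ {name} : "{fk["column"]}"')
    return "\n".join(lines)


schema = {
    "customers": {
        "columns": {"id": "INTEGER", "email": "VARCHAR(255)"},
        "primary_key": ["id"],
        "foreign_keys": [],
    },
    "orders": {
        "columns": {"id": "INTEGER", "customer_id": "INTEGER"},
        "primary_key": ["id"],
        "foreign_keys": [{"column": "customer_id", "ref_table": "customers", "ref_column": "id"}],
    },
}

print(render_create_table("orders", schema["orders"]))
print(render_mermaid_erd(schema))
```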
Load Plan Engine
  Computes optimal table insertion order
  Detects circular dependencies
  Suggests batching strategy
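Insertion order and cycle detection are a topological sort problem, so one way to sketch the load plan is with `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). The `build_load_plan` name and the shape of its return value are hypothetical; the workflow's own implementation may differ.

```python
from graphlib import CycleError, TopologicalSorter


def build_load_plan(dependencies):
    """Group tables into batches that can be loaded in order.

    dependencies maps each table to the set of tables it references,
    so referenced (parent) tables always land in an earlier batch.
    """
    sorter = TopologicalSorter(dependencies)
    try:
        sorter.prepare()  # raises CycleError on circular dependencies
    except CycleError as exc:
        # exc.args[1] lists the tables forming the cycle.
        return {"batches": [], "cycle": list(exc.args[1])}
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tables whose parents are all loaded
        batches.append(ready)
        sorter.done(*ready)
    return {"batches": batches, "cycle": None}


plan = build_load_plan({
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
})
print(plan["batches"])  # [['customers', 'products'], ['orders'], ['order_items']]
```

Tables within one batch do not depend on each other, so they could be loaded in parallel, which is one reading of the batching strategy mentioned above.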
Combine & Explain
  Merges all outputs
  Optional AI explanation of schema decisions
Response Output
  Returns structured JSON via webhook:
    SQL schema
    ERD summary
    Data dictionary
    Load plan
    Optional explanation
Setup Instructions
Configure OpenAI credentials (used by the AI agent)
Adjust thresholds if needed (FK overlap, retries, confidence)
Activate the workflow and copy the webhook URL
Send a POST request with a CSV or XLSX file to the webhook URL
Review the generated outputs returned in the response
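For the upload request, a minimal sketch using only the Python standard library is shown below. The URL is a placeholder, and the multipart form field name `file` is an assumption; check how the Webhook node in your copy of the workflow expects to receive binary data. The actual send is left commented out so the snippet runs without network access.

```python
import urllib.request
import uuid


def build_upload_request(webhook_url, filename, content, field_name="file"):
    """Build a multipart/form-data POST request carrying one file."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder URL: replace it with the webhook URL copied from n8n.
request = build_upload_request(
    "https://example.invalid/webhook/your-path",
    "customers.csv",
    b"id,email\n1,a@example.com\n",
)

# Uncomment to send the file and print the JSON response:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```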
Use Cases
Auto-generate database schema from CSV/Excel files
Data migration and onboarding pipelines
Rapid database prototyping
Reverse engineering datasets
AI-assisted data modeling
Requirements
n8n (latest version recommended)
OpenAI API credentials
LangChain nodes enabled
CSV or XLSX input file