Vectorize Medical Procedures for Semantic Search with TUSS, Gemini & pgVector
Description
This workflow vectorizes the TUSS (Terminologia Unificada da Saúde Suplementar) table by transforming medical procedures into vector embeddings ready for semantic search.
It automates the import of TUSS data, performs text preprocessing, and uses Google Gemini to generate vector embeddings. The resulting vectors can be stored in a vector database, such as PostgreSQL with pgvector, enabling efficient semantic queries across healthcare data.
What Problem Does This Solve?
Searching for medical procedures using traditional keyword matching is often imprecise. This workflow enhances the search experience by enabling semantic similarity search, which can retrieve more relevant results based on the meaning of the query instead of exact word matches.
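As a rough illustration of what "similarity" means here, the sketch below ranks a few items by cosine similarity to a query vector. The three-dimensional vectors and the labels are invented for the example; real embeddings have hundreds or thousands of dimensions and come from the embedding model, not from hand-written numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, items):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query_vector, vec)) for label, vec in items]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors, invented for illustration only.
toy_items = [
    ("knee arthroscopy", [0.9, 0.1, 0.0]),
    ("blood glucose test", [0.0, 0.2, 0.9]),
    ("knee joint surgery", [0.8, 0.3, 0.1]),
]
toy_query = [1.0, 0.2, 0.0]  # stands in for an embedded search query

ranking = rank_by_similarity(toy_query, toy_items)
```

With these toy numbers the two knee-related items score far above the unrelated one, even though they share only one word with each other.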
How It Works
Import TUSS data: Load medical procedure entries from the TUSS table.
Preprocess text: Clean and prepare the text for embedding.
Generate embeddings: Use Google Gemini to convert each procedure into a semantic vector.
Store vectors: Save the output in a PostgreSQL database with the pgvector extension.
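The first three steps can be sketched outside n8n as plain Python. This is a minimal sketch under stated assumptions, not the workflow's actual nodes: the cleaning rules in preprocess are guesses at what "clean and prepare" might involve, the sample row is made up, and fake_embed is a stand-in that returns toy numbers where the workflow would call Google Gemini. The CD_ITEM and DS_ITEM field names are the ones listed in the setup instructions.

```python
import csv
import io
import re
import unicodedata


def preprocess(text):
    """Normalize Unicode and collapse whitespace (assumed cleaning steps)."""
    normalized = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", normalized).strip()


def load_procedures(csv_text):
    """Parse CSV text, keeping rows that have both a code and a description."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        code = (row.get("CD_ITEM") or "").strip()
        description = preprocess(row.get("DS_ITEM") or "")
        if code and description:
            rows.append({"code": code, "description": description})
    return rows


def vectorize(rows, embed):
    """Attach an embedding to each row using the supplied embed callable."""
    return [{**row, "embedding": embed(row["description"])} for row in rows]


def fake_embed(text):
    """Placeholder embedder: NOT Gemini, just deterministic toy numbers."""
    return [float(len(text)), float(text.count(" "))]


# Made-up sample data; the second row is dropped because it has no code.
sample_csv = "CD_ITEM,DS_ITEM\n00000001,  Sample   procedure  description \n,row without a code\n"
records = vectorize(load_procedures(sample_csv), fake_embed)
```

Passing the embedder in as a callable keeps the sketch independent of any particular provider, which also matches the customization tip about swapping the embedding model.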
Prerequisites
An n8n instance (self-hosted).
A PostgreSQL database with the pgvector extension enabled.
Access to the Google Gemini API.
TUSS data in a structured format (CSV, database, or API source).
Customization Tips
You can adapt the preprocessing logic to your own language or domain-specific terms.
Swap Google Gemini with another embedding model, such as OpenAI or Cohere.
Adjust the chunking logic to control the granularity of semantic representation.
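One way to think about chunk granularity is a sliding window over words. The sketch below is a hypothetical helper, not the chunking used by the workflow; the function name, the word-based splitting, and the default sizes are all assumptions. TUSS descriptions are usually short, so in many cases each description would end up as a single chunk.

```python
def chunk_words(text, max_words=50, overlap=10):
    """Split text into windows of up to max_words words that share `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be at least 0 and less than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


example_chunks = chunk_words("a b c d e f g", max_words=4, overlap=2)
```

Smaller windows give finer-grained vectors at the cost of more embedding calls; a larger overlap reduces the chance that a phrase is split across two chunks.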
Setup Instructions
Prepare a source (database or CSV) with TUSS data. You need at least two fields:
CD_ITEM (Medical procedure code)
DS_ITEM (Medical procedure description)
Configure your Oracle or PostgreSQL database credentials in the Credentials section of n8n.
Make sure the pgvector extension is installed and enabled in your PostgreSQL database.
Replace the placeholder table and column names with your actual TUSS table.
Connect your Google Gemini credentials (via an OpenAI proxy or the official connector).
Run the workflow to vectorize all medical procedure descriptions.
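For readers who want to see what the storage side might look like in SQL, the sketch below builds the statements as Python strings. Everything named here is an assumption for illustration: the table name tuss_embeddings, the lowercase column names, the 768-dimension setting (it must match whatever your embedding model returns), and the %s placeholders, which follow the psycopg parameter style. The vector type and the <=> cosine distance operator come from pgvector itself.

```python
import math

EMBEDDING_DIM = 768        # assumption: set this to your model's output size
TABLE = "tuss_embeddings"  # hypothetical table name


def to_pgvector_literal(values):
    """Format numbers in pgvector's text input form, e.g. [0.1,2.0,-3.5]."""
    floats = [float(v) for v in values]
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("vector components must be finite numbers")
    return "[" + ",".join(repr(v) for v in floats) + "]"


# One-time setup: enable the extension and create the target table.
SETUP_SQL = [
    "CREATE EXTENSION IF NOT EXISTS vector;",
    f"CREATE TABLE IF NOT EXISTS {TABLE} ("
    "cd_item TEXT PRIMARY KEY, "
    "ds_item TEXT NOT NULL, "
    f"embedding vector({EMBEDDING_DIM}) NOT NULL);",
]

# Upsert one procedure; the third parameter is a pgvector text literal.
INSERT_SQL = (
    f"INSERT INTO {TABLE} (cd_item, ds_item, embedding) "
    "VALUES (%s, %s, %s::vector) "
    "ON CONFLICT (cd_item) DO UPDATE "
    "SET ds_item = EXCLUDED.ds_item, embedding = EXCLUDED.embedding;"
)

# Nearest procedures to a query embedding, by cosine distance (smaller is closer).
SEARCH_SQL = (
    "SELECT cd_item, ds_item, embedding <=> %s::vector AS distance "
    f"FROM {TABLE} ORDER BY distance LIMIT %s;"
)

example_literal = to_pgvector_literal([0.1, 2, -3.5])
```

These strings would be passed, along with the code, the description, and the formatted vector literal, to whichever PostgreSQL client you use; inside n8n the equivalent work is done by the workflow's database nodes.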