Evaluation metric example: String similarity
AI evaluation in n8n
This is a template for n8n's evaluation feature.
Evaluation is a technique for building confidence that your AI workflow performs reliably: you run a test dataset containing different inputs through the workflow.
By calculating a metric (score) for each input, you can see where the workflow is performing well and where it isn't.
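To make the idea concrete, here is a minimal Python sketch of per-input scoring, written outside n8n. Everything in it is an assumption made for illustration: the dataset rows, the `run_workflow` stand-in with its canned answers, and the exact-match score are hypothetical and are not part of this template.

```python
from typing import Callable

# Hypothetical test dataset: each row pairs an input with its expected output.
DATASET = [
    {"input": "img_001", "expected": "A7K9"},
    {"input": "img_002", "expected": "ZZ41"},
    {"input": "img_003", "expected": "Q0B8"},
]


def run_workflow(item: str) -> str:
    """Stand-in for the AI workflow under test (assumption: canned answers)."""
    canned = {"img_001": "A7K9", "img_002": "ZZ4I", "img_003": "Q0B8"}
    return canned[item]


def exact_match(expected: str, actual: str) -> float:
    """Simplest possible metric: 1.0 for a perfect match, 0.0 otherwise."""
    return 1.0 if expected == actual else 0.0


def evaluate(
    dataset: list[dict],
    workflow: Callable[[str], str],
    metric: Callable[[str, str], float],
) -> list[dict]:
    """Run every dataset row through the workflow and attach a score to it."""
    results = []
    for row in dataset:
        actual = workflow(row["input"])
        score = metric(row["expected"], actual)
        results.append({**row, "actual": actual, "score": score})
    return results


results = evaluate(DATASET, run_workflow, exact_match)

# Rows with an imperfect score show where the workflow is underperforming.
failures = [r["input"] for r in results if r["score"] < 1.0]
```

Listing the low-scoring rows, rather than only averaging the scores, is what shows which inputs the workflow struggles with.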
How it works
This template shows how to calculate a workflow evaluation metric: text similarity, measured character-by-character.
The workflow takes images of hand-written codes, extracts the code and compares it with the expected answer from the dataset.
The workflow runs as follows:
1. We use an evaluation trigger to read in our dataset. It is wired up in parallel with the regular trigger so that the workflow can be started from either one.
2. We download the image and use AI to extract the code.
3. If we’re evaluating (i.e. the execution started from the evaluation trigger), we calculate the string distance metric.
4. We pass this information back to n8n as a metric.
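The metric step can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses Levenshtein edit distance, one common character-level measure, which may differ from what the template's own node computes. The function names and the `is_evaluating` flag are hypothetical.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(expected: str, actual: str) -> float:
    """Scale the distance to a 0..1 score, where 1.0 means identical."""
    longest = max(len(expected), len(actual))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(expected, actual) / longest


def metric_for_execution(
    is_evaluating: bool, expected: str, actual: str
) -> Optional[dict]:
    """Only compute the metric when the run came from the evaluation trigger
    (hypothetical flag); regular runs have no expected answer to compare."""
    if not is_evaluating:
        return None
    return {"similarity": similarity(expected, actual)}
```

For example, `metric_for_execution(True, "ZZ41", "ZZ4I")` scores one wrong character out of four, while the same call with `is_evaluating=False` skips the calculation and returns `None`.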