Evaluations Metric: Answer Similarity
This n8n template demonstrates how to calculate the evaluation metric "Similarity", which in this scenario measures the consistency of the agent.
The scoring approach is adapted from the open-source evaluation project RAGAS. The source is available here: https://github.com/explodinggradients/ragas/blob/main/ragas/src/ragas/metrics/_answer_similarity.py
How it works
This evaluation works best where questions are closed-ended or about facts, where the answer can have little to no deviation. For our scoring, we generate embeddings for both the AI's response and the ground truth, then calculate the cosine similarity between them. A high score indicates that the LLM is consistent with the expected results, whereas a low score could signal model hallucination.
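The scoring step can be sketched as follows. This is a minimal illustration in plain Python, not the template's own code or the RAGAS implementation: the function name cosine_similarity and the two toy vectors are assumptions made up for the example, and in practice the vectors would come from whichever embeddings model the workflow uses.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embedding vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Cosine similarity is undefined for a zero vector;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
response_embedding = [0.2, 0.7, 0.1]
ground_truth_embedding = [0.25, 0.65, 0.05]

score = cosine_similarity(response_embedding, ground_truth_embedding)
print(f"similarity = {score:.3f}")
```

Vectors pointing in nearly the same direction score close to 1, while unrelated (orthogonal) vectors score close to 0, which is why a low score is treated as a warning sign.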
Requirements
n8n version 1.94+
Check out this Google Sheet for sample data: https://docs.google.com/spreadsheets/d/1YOnu2JJjlxd787AuYcg-wKbkjyjyZFgASYVV0jsij5Y/edit?usp=sharing