Benchmark LLM Performance on Legal Documents with Google Sheets and OpenRouter
This workflow demonstrates a simple way to run evals on a set of test cases stored in a Google Sheet.
The example we are using comes from an info extraction task dataset, where we tested 6 different LLMs on 18 different test cases.
This workflow extends the functionality of my simple eval for benchmarking legal tasks here.
Rather than running executions sequentially (waiting for each one to respond before making another request), we use parallel processing to fire 2 requests every second.
You can see our sample data in this spreadsheet here to get started.
Once you have this working for our dataset, you can plug in your own test cases matching different LLMs to see how it works with your own data.
How it works Pull our test cases from Google Sheets. For each case, fire off an HTTP request to a webhook. That webhook grabs the relevant source file from Google Drive and converts it to text. The text gets sent to an LLM via Open Router (so we can easily swap out models). Results come back and are logged in Google Sheets.
Set up steps: Add your credentials for Google Sheets, Google Drive, and OpenRouter. Make a copy of the original data spreadsheet so that you can edit it yourself. You will need to plug your version in the Update Results node to see the spreadsheet update on each run of the loop.
Related Templates
Convert Tour PDFs to Vector Database using Google Drive, LangChain & OpenAI
🧩 Workflow: Process Tour PDF from Google Drive to Pinecone Vector DB with OpenAI Embeddings Overview This workflow au...
Provide latest euro exchange rates from European Central Bank via Webhook
What is this workflow doing? This simple workflow is pulling the latest Euro foreign exchange reference rates from the E...
Automate Daily Keyword Research with Google Sheets, Suggest API & Custom Search
Who's it for This workflow is perfect for SEO specialists, marketers, bloggers, and content creators who want to automa...
🔒 Please log in to import templates to n8n and favorite templates
Workflow Visualization
Loading...
Preparing workflow renderer
Comments (0)
Login to post comments