LLM Evaluations & Reinforcement Learning for Shopify Sidekick on Rails

Andrew Mcnamara and Charlie Lee • Amsterdam, Netherlands • Talk

Date: September 05, 2025
Published: not published
Announced: May 20, 2025

This talk explores building production LLM systems through Shopify Sidekick's Rails architecture, covering orchestration patterns and tool integration strategies. We'll establish statistically rigorous LLM-based evaluation frameworks that move beyond subjective 'vibe testing.' Finally, we'll demonstrate how robust evaluation systems become critical infrastructure for reinforcement learning pipelines, and examine how RL can learn to hack those evaluations, along with strategies to mitigate this.

Rails World 2025
