Summarized using AI

Chime Presents: Getting More Out of LLMs for Ruby Developers

Justin Wienckowski and Jake Gribschaw • July 09, 2025 • Philadelphia, PA • Talk

Introduction

This lightning talk, delivered at RailsConf 2025 by Justin Wienckowski and Jake Gribschaw from Chime, focuses on how Ruby developers can effectively leverage Large Language Model (LLM) tools to improve productivity, code quality, and developer satisfaction. The speakers share lessons from Chime's AI tool adoption journey, practical advice for getting started, and insights from building an AI code review bot.

Key Points

  • The Power of AI Coding Agents for Rails Developers:

    • AI coding agents offer powerful assistance by automating repetitive tasks, aiding in understanding codebases, and letting developers focus more on design, problem decomposition, and code review.
    • The Rails ecosystem is particularly well-suited for AI tool adoption due to its open-source culture, comprehensive tutorials, and consistent coding patterns.
  • Choosing and Setting Up AI Tools:

    • Developers should not overthink tool selection, as the ecosystem is rapidly evolving. Hands-on experimentation is recommended.
    • Chime's preferred tool is Cursor, a VS Code fork, which requires setup with Ruby/Rails-specific extensions and proper configuration, particularly for extensions like Ruby LSP.
    • Practical troubleshooting tips and configuration guidance are discussed to help developers get started smoothly.
  • Using Coding Agents Effectively:

    • Overview of Cursor’s agent capabilities, including modes for discussion and code editing, context management, slash commands, and integrating with project files.
    • Prompt engineering is critical: developers should iterate on prompts and build a “spellbook” of reusable prompts and permanent agent rules.
    • Different workflows exist on a spectrum from simple tab completion through pair programming with the agent, to fully autonomous code generation for well-defined tasks.
    • Choice of workflow should depend on task complexity and the need for human judgment versus automation.
  • Lessons from Building Chime’s AI Code Review Bot (Beacon):

    • The intent is to reduce reviewer workload by prompting developers to articulate clear intent and by surfacing stylistic and quality suggestions beyond traditional linters like RuboCop.
    • Beacon helps developers learn and iterate independently, producing more robust contributions.
    • Key insight: LLMs, like human brains, respond best to focused, precise prompts. Excessive or imprecise context can dilute the quality of the generated responses.
    • For cutting-edge or niche technologies (e.g., Rails 8), supplement LLMs with Retrieval Augmented Generation (RAG) by injecting up-to-date documentation.
    • Evaluate whether advanced approaches (like RAG or fine-tuning) are necessary, or if simpler methods will suffice based on your goals.
  • Sustainable AI Integration:

    • Avoid chasing every trending tool; focus on sustainable progress, thoughtful integration, and avoiding vendor lock-in by abstracting AI interfaces and capturing relevant data for reproducibility and analysis.
    • Iterative development and careful prompt/version tracking improve both technical outcomes and learning.

Conclusions and Takeaways

  • AI tools are transforming software development, especially for Ruby and Rails developers.
  • Success requires practical adoption strategies, effective prompt engineering, and matching workflows to the right level of agent autonomy.
  • Supplement LLMs' limitations with targeted context when facing novel technologies, but avoid unnecessary complexity.
  • Sustainable, iterative adoption, careful data management, and focus on enhancement over replacement maximize both productivity and learning.

Chime Presents: Getting More Out of LLMs for Ruby Developers
Justin Wienckowski and Jake Gribschaw • Philadelphia, PA • Talk

Date: July 09, 2025
Published: July 23, 2025

We're all hearing about LLM tools that are changing on a daily, if not hourly, basis. How can you navigate these tools as a developer to improve your code and your impact?

Justin Wienckowski will share lessons and advice from Chime's programs piloting tools for developer use, and Jake Gribschaw will talk about lessons learned building an AI code review bot.

RailsConf 2025

00:00:16.720 Nice to be here everyone. My name is
00:00:18.080 Justin Wienckowski. I'm a senior software
00:00:19.760 engineer at Chime. Uh and this is going
00:00:22.720 to be a turbo mode lightning talk. This
00:00:25.840 talk is a compressed version of the
00:00:28.560 training that we have started to offer
00:00:30.480 to our Rails engineers to learn how to
00:00:32.320 use AI coding tools. So by the end of
00:00:34.719 this talk, you'll know any everything
00:00:36.000 that you need to get started with AI
00:00:37.440 coding and take that knowledge back to
00:00:39.680 your teams, hopefully, and help
00:00:41.760 encourage adoption if this is a tool
00:00:43.760 that you think will help you. So why
00:00:46.399 are we doing this? Uh we
00:00:48.640 believe that AI coding agents are a new
00:00:50.239 tool that can make our lives better. Uh
00:00:52.559 at its core, programming is about
00:00:54.480 juggling a lot of detailed specific
00:00:56.480 knowledge.
00:00:58.000 AI assisted coding is like working with
00:01:00.079 someone who has memorized the internet.
00:01:03.199 This can be incredibly powerful, but of
00:01:06.000 course, like any tool, it has strengths
00:01:08.080 and weaknesses.
00:01:10.320 Rails development plays to the strengths
00:01:12.240 of AI. The Rails community has a great
00:01:15.040 culture of sharing. We have good quality
00:01:17.600 open- source tools and code. We have
00:01:20.080 lots of tutorials and documentation and
00:01:22.720 we have this strong culture around using
00:01:24.560 common patterns and all of these things
00:01:27.200 are things that AI can help us leverage.
00:01:30.240 AI is also great at automating the
00:01:32.400 boring stuff. This can actually make our
00:01:35.040 jobs more fun as programmers. So it's
00:01:37.680 not just about productivity. It can
00:01:39.759 actually be about our enjoyment and our
00:01:42.560 satisfaction and our growth in learning.
00:01:44.400 Also,
00:01:47.920 agents reduce the cost of writing code.
00:01:51.840 This is their real superpower, as I hope
00:01:54.159 you'll see.
00:01:56.640 Reducing the cost of writing code
00:01:58.159 changes how we work. It lets us focus
00:02:01.200 more time and effort on the things that
00:02:02.880 we can do better than computers like
00:02:06.719 decomposing tasks into small safe steps,
00:02:10.160 object-oriented design, innovating to
00:02:13.200 solve domain-specific problems,
00:02:16.000 evaluating the trade-offs in different
00:02:17.599 implementations
00:02:19.120 and reviewing code and improving code
00:02:21.200 quality.
00:02:24.160 So to get started with AI coding, the
00:02:26.080 first thing you need to do is to select
00:02:27.680 a tool. Don't spend too much time on
00:02:29.840 this. Don't overthink it. This is the
00:02:31.599 gold rush era of tools right now. There
00:02:33.760 is a ton of them. They're evolving
00:02:35.840 constantly. Keeping track of this stuff
00:02:38.080 is a full-time job by itself, and it's
00:02:40.800 probably not worth your time. Experience
00:02:44.000 is going to be better than research. A
00:02:46.160 day or two of experimenting with a tool
00:02:49.360 will teach you everything you need to
00:02:50.879 know. Now, tools are also improving
00:02:53.760 quickly. So even a tool that wasn't very
00:02:56.560 good a number of months ago might be
00:02:58.879 reasonable today. Don't be afraid to
00:03:01.040 experiment and don't be afraid to repeat
00:03:04.000 your evaluations in the future.
00:03:07.120 This is a list of the AI coding tools
00:03:09.040 that we currently use at Chime. Cursor
00:03:11.920 is our most widely used tool and so I'm
00:03:14.080 going to quickly show you how to get set
00:03:15.840 up and make Cursor a reasonable
00:03:18.000 environment for Rails development.
00:03:20.879 Let's get started.
00:03:23.040 First, you got to download Cursor. Sign
00:03:24.800 up for an account. Now, Cursor is a fork
00:03:27.360 of VS Code. Now, a lot of Rails
00:03:30.239 developers are rightly hesitant about
00:03:32.959 using a VS Code environment for Ruby
00:03:35.280 and Rails development because for a long
00:03:37.360 time it was really awful. Um, the
00:03:40.560 general sentiment at Chime is that this
00:03:42.640 is still not as good as Ruby- and Rails-
00:03:45.280 specific IDEs, but it's very
00:03:47.440 serviceable now. You can actually be
00:03:49.680 productive with it. So once you get
00:03:52.080 installed, first you have to install
00:03:53.599 extensions. You can open the user
00:03:56.159 interface. You click here. You click
00:03:58.799 here. And then you can search the
00:04:00.720 marketplace for extensions.
00:04:03.200 Our friends at Shopify have an excellent
00:04:05.200 Ruby extension which is actually an
00:04:06.799 extension pack that bundles a number of
00:04:09.360 extensions together along with some
00:04:10.959 custom configuration. It is, we believe,
00:04:13.920 the best way to get started and it's
00:04:15.519 super easy.
00:04:18.320 You also might want to install some of
00:04:19.840 these extensions. They can be helpful.
00:04:22.400 One note is that you do not need a
00:04:24.479 separate RuboCop extension. And in fact,
00:04:26.560 installing multiple of them can create
00:04:28.960 conflicts and odd behavior. The Ruby LSP
00:04:31.600 extension actually integrates with
00:04:33.680 RuboCop directly. So, it's usually not
00:04:35.520 necessary.
00:04:38.639 So, even though Cursor is much improved,
00:04:41.280 sometimes it's going to need some
00:04:42.320 troubleshooting. So if things aren't
00:04:44.400 working right, if things seem weird,
00:04:46.560 first click here and open the bottom
00:04:49.360 panel, click the output tab, and then
00:04:52.400 use the selector to look at different
00:04:54.479 log files. This is the first
00:04:57.360 opportunity for troubleshooting
00:04:58.960 and debugging. Now, some of these errors,
00:05:01.759 like this one that Ruby LSP
00:05:04.080 is showing, don't necessarily affect
00:05:06.400 the functionality, so you don't necessarily
00:05:08.400 have to fix them. But if things are weird
00:05:10.800 in the UI or things like code navigation
00:05:13.039 aren't working, this is your first stop.
00:05:16.800 Now remember that Ruby LSP is a separate
00:05:19.199 external process. That means that if
00:05:21.680 it encounters errors, it'll output
00:05:24.080 logs into a separate directory in your
00:05:25.919 project root. So also check here.
00:05:29.039 Now once all the extensions are working
00:05:30.560 and your environment is reasonably
00:05:32.320 stable, the next step is to customize
00:05:34.320 your settings.
VS Code, and Cursor by extension, have
00:05:39.039 multiple levels of settings. The easiest
00:05:40.880 way to access them is the searchable
00:05:42.800 command palette. You can either
00:05:44.880 open the JSON
00:05:46.960 settings file directly, which is great:
00:05:49.360 it has great autocomplete and it's really
00:05:50.800 easy to use. Or there's a GUI interface.
00:05:53.840 Also, one of the most common problems
00:05:56.639 that people run into with Cursor and
00:05:58.479 Ruby LSP is your version manager. So
00:06:01.840 Ruby LSP will attempt to autodetect your
00:06:04.080 version manager and use it
00:06:05.520 automatically. But if you have multiple
00:06:07.759 installed, it might not guess correctly.
00:06:10.400 So you can override it with this setting
00:06:11.919 here.
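As a hedged sketch, that override in Cursor's settings.json might look like the fragment below. The exact key and accepted identifiers depend on your Ruby LSP extension version, and "rbenv" here is just an example; check the extension's own documentation for your setup.

```json
{
  "rubyLsp.rubyVersionManager": {
    "identifier": "rbenv"
  }
}
```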
00:06:14.720 Another troubleshooting tip is to start
00:06:16.639 Cursor from the command line within your
00:06:18.800 Rails project. This helps ensure
00:06:21.199 that it loads the environment correctly
00:06:23.120 and eliminates a lot of problems.
00:06:26.080 Okay, so that's the Cursor-specific
00:06:28.000 part. Now let's talk about how to
00:06:30.479 use Cursor's agent. A lot of this
00:06:32.639 is specific to Cursor, but some of it
00:06:36.800 can be generalized.
00:06:38.560 So, to open Cursor's agent, you click
00:06:40.560 here to open the right-hand panel.
00:06:43.759 The agent has two main modes, ask and
00:06:46.319 agent. Ask is just for discussion,
00:06:49.440 investigation,
00:06:50.960 searching,
00:06:52.479 uh discussing ideas. You'll likely spend
00:06:55.360 most of your time in agent mode. Now,
00:06:57.520 agent mode is pretty eager about writing
00:06:59.599 and editing code, but any code change
00:07:02.319 that it makes will be highlighted in the
00:07:04.160 editor, and you have to manually accept
00:07:06.560 or reject each change.
00:07:09.039 The agent's pretty smart, and it can
00:07:10.880 also run tests. It can run external
00:07:12.639 tools, and it will by default prompt you
00:07:15.440 before doing anything like that. So,
00:07:17.680 it's a great way to get started.
00:07:20.720 Another trick is that there's some slash
00:07:22.400 commands available that allow you to
00:07:24.479 quickly get to common functions
00:07:27.360 in Cursor. Uh a lot of these are related
00:07:29.680 to context. So context is primarily
00:07:32.639 about uh directing the agent to the
00:07:35.360 source code that it should work on and
00:07:37.039 pay attention to. So the agent can read
00:07:40.319 all of the code in your project. But
00:07:43.039 this is how you get it to focus on the
00:07:44.880 task at hand.
00:07:47.599 You can add files and folders to the
00:07:49.440 agent's context through this menu here.
00:07:52.400 And you can also use the at symbol to
00:07:54.800 mention specific files in the prompt
00:07:56.800 itself. You can also drag files from
00:07:59.360 tabs or from the explorer view into the
00:08:01.520 prompt.
00:08:03.520 So that's a basic introduction to
00:08:05.199 Cursor. It's really pretty easy to use
00:08:08.000 and it's come a long way in the last few
00:08:09.919 years. So now let's talk more generally
00:08:13.599 about how you can actually use coding
00:08:15.199 agents to get work done.
00:08:18.240 This section applies to all coding
00:08:19.599 agents, not just Cursor. Different
00:08:21.520 agents have different strengths and
00:08:23.440 weaknesses, but we're finding that a lot
00:08:25.680 of their overall features and
00:08:26.879 intelligence is converging pretty
00:08:28.400 quickly. We're often seeing multiple
00:08:30.720 updates and releases from these various
00:08:32.880 different tools every week.
00:08:35.919 Like any new programming tool, coding
00:08:37.599 agents require us to build some new
00:08:39.279 skills. One of those skills is prompt
00:08:41.599 engineering.
00:08:43.360 Now, prompt writing is not a science.
00:08:46.000 It's more like magic. You have to create
00:08:48.800 the right setting, invoke the right
00:08:51.360 spirits, and find the right incantation.
00:08:54.640 And you're not going to get it right the
00:08:56.080 first time.
00:08:57.839 Uh, you really want to iterate and learn
00:09:00.240 as you go.
00:09:02.800 As you learn how to effectively prompt
00:09:04.560 the agent and get it to do what you
00:09:06.399 want, you're building a spellbook.
00:09:10.080 And this is not just in a general sense.
00:09:12.480 Just like most programming, there are going
00:09:14.240 to be a lot of tasks where
00:09:16.640 we do the same things over and over again,
00:09:19.120 and you're going to find yourself
00:09:20.399 repeating prompts. So you can actually
00:09:23.279 build up composable and reusable prompts
00:09:27.120 and so this spellbook is a valuable new
00:09:29.120 tool for us.
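One lightweight way to make prompts composable can be sketched in Ruby. The snippet names and wording below are purely illustrative, not anything Chime described; the point is just that reusable fragments plus a task-specific request compose into a full prompt.

```ruby
# Sketch of a prompt "spellbook": reusable snippets composed into a full
# prompt. Names and wording are illustrative assumptions, not a standard.

SPELLBOOK = {
  rails_context: "You are working in a Ruby on Rails application.",
  small_steps:   "Work in small, safe steps and explain each change before making it.",
  write_tests:   "Write or update RSpec tests for any behavior you change."
}.freeze

# Compose the chosen snippets with the task-specific request.
def cast(task, *spells)
  (spells.map { |name| SPELLBOOK.fetch(name) } + [task]).join("\n\n")
end

puts cast("Refactor UsersController#index to avoid the N+1 query.",
          :rails_context, :small_steps, :write_tests)
```

You could keep snippets like these in files checked into the repo, so the whole team shares and iterates on the same spellbook.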
00:09:31.600 Now most agents support some kind of
00:09:33.440 rules which are essentially permanent
00:09:36.080 prompts. Cursor's feature is called
00:09:38.560 Rules for AI. Um, and Claude Code
00:09:41.920 will automatically read
00:09:44.080 the CLAUDE.md file if it exists.
00:09:48.160 Uh, so these features allow you to give
00:09:50.560 your agent permanent instructions and
00:09:52.720 it's a great way to provide context and
00:09:54.720 information to provide your style guide
00:09:56.640 to the agent because remember that
00:09:58.399 agents are really good at understanding
00:10:00.959 plain English. So these don't
00:10:03.600 necessarily have to be specific
00:10:05.200 instructions; they can be more general
00:10:09.120 agent rules.
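As a purely illustrative example (not the rules Chime uses), a project rules file, whether Cursor rules or a CLAUDE.md, might contain general guidance like this:

```markdown
# Agent rules (illustrative example)

- This is a Ruby on Rails application; follow the existing patterns in app/services.
- Prefer `includes` over `joins` when associated records are used, to avoid N+1 queries.
- Run RuboCop on any file you change and fix any new offenses.
- Write or update RSpec tests for any behavior you change.
```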
00:10:14.320 So those are the basic tools to use an
00:10:17.680 agent. But how can you use them
00:10:19.120 effectively? So one thing that
00:10:21.519 everybody's thinking about these days is
00:10:22.959 coding workflows with agents. And this
00:10:25.200 is a huge area of experimentation.
00:10:27.920 But there are some ideas that you can
00:10:29.360 use as landmarks to help you find a path
00:10:31.360 and get started.
00:10:33.279 One way to think about coding workflows
00:10:35.120 is this spectrum of agent agency or
00:10:37.519 autonomy or independence. Uh how
00:10:40.640 independently is the agent acting? Um
00:10:43.360 our friends at ThoughtBot have a great
00:10:44.800 blog post that talks about this in terms of
00:10:46.800 human-led programming versus agent-led
00:10:49.440 programming.
00:10:51.600 The simplest form is tab completion. Um
00:10:54.399 sometimes people knock on this or
00:10:57.120 kind of skip over it, but I think it can
00:10:59.040 actually be really valuable and it can
00:11:00.399 be a perfectly fine place to start,
00:11:02.560 especially if you're writing
00:11:03.680 boilerplate code or glue code. This can
00:11:06.560 actually save you a lot of keystrokes.
00:11:10.079 Another option is pair programming.
00:11:12.560 Agents are probably going to be the most
00:11:14.640 convenient pair that most of us have an
00:11:17.120 opportunity to work with. It's a great
00:11:19.839 way to discuss design ideas, try out
00:11:22.399 different implementations,
00:11:24.240 and this is where you'll start to save a
00:11:25.920 lot of time by having the agent write
00:11:27.680 the code for you. The trade-off, of
00:11:30.240 course, is that you have to review their
00:11:31.519 work.
00:11:33.120 So, the pairing approach is great for
00:11:34.880 things like scaffolding and glue code,
00:11:37.519 repetitive and boring refactorings, or
00:11:40.079 trying different design options, trying
00:11:41.600 different implementation approaches.
00:11:45.440 So, when you're pairing, you're going to
00:11:47.279 be having this continuous conversation,
00:11:49.040 this back and forth with the agent.
00:11:51.920 What if you want the agent to work more
00:11:53.279 independently? So one option is to
00:11:56.000 automate what you would do. So take a
00:11:59.279 task, write a detailed plan, and let the
00:12:02.480 agent execute it. You're automating your
00:12:04.880 own workflow using the agent.
00:12:08.240 Now you can develop this workflow with
00:12:10.399 the agent's help. You don't have to do
00:12:11.760 it all on your own, but the important
00:12:13.200 part is that your judgment is what's
00:12:15.440 going into structuring this task.
00:12:18.880 So when you're doing this, the more uh
00:12:21.440 independence that you give the agent,
00:12:23.279 you do have to think about guard rails,
00:12:25.440 checkpoints. How do you make sure that
00:12:27.360 you are supervising the agents work? And
00:12:30.079 that's one thing that that prompt
00:12:31.519 spellbook will help you build up.
00:12:34.800 Agents are starting to incorporate
00:12:36.320 planning features. Um, but I still like
00:12:38.320 to write my plan down in files. Um, this
00:12:40.720 approach is uh adapted from uh one of
00:12:43.360 Harper Reed's blog posts.
00:12:45.600 uh you can develop a plan in a file uh
00:12:48.240 create a to-do list so that the agent
00:12:50.000 can track its own progress and this is
00:12:52.480 very useful because sometimes agents get
00:12:54.639 stuck in loops, or sometimes agents,
00:12:58.079 when they start to compact the context
00:12:59.839 window, begin to generate odd results and
00:13:02.720 you basically have to go
00:13:05.120 back to the Windows 95 era and reboot.
00:13:08.240 So these files give you some state so
00:13:10.880 that the agent can pick up where it left
00:13:12.560 off.
00:13:14.880 There's another benefit to this too,
00:13:16.320 which is that by storing your thought
00:13:18.880 process in these files, you can begin to
00:13:21.600 start to recognize the common pieces and
00:13:24.320 identify prompts that are reusable or
00:13:26.160 that could be generalized.
00:13:28.800 Uh some people using Claude Code find
00:13:30.639 that fewer files or even a single file
00:13:32.800 works better. So this is also an area to
00:13:34.959 experiment on.
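A plan file of the kind described above might look like the hypothetical example below; the exact format matters less than giving the agent durable state and a checklist it can update as it works.

```markdown
# Plan: extract notification logic into a service object

- [x] Identify all callers of `User#notify!`
- [x] Write characterization specs covering current behavior
- [ ] Create `NotificationService` with the same public interface
- [ ] Migrate callers one at a time, running the specs after each change
- [ ] Remove the old method and update documentation
```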
00:13:37.200 A fourth approach gives the agent
00:13:38.800 maximum independence. Uh, and you can
00:13:41.440 even have agents work on multiple tasks
00:13:43.600 in the background. This is where
00:13:46.000 you treat the agent like a separate
00:13:47.360 programmer where you just assign it a
00:13:48.880 task and it goes off and it works and it
00:13:52.320 comes back with maybe a completed branch
00:13:54.160 or even a pull request.
00:13:56.480 And it only is going to bug you if it
00:13:58.320 thinks it's stuck or if you notice that
00:14:00.399 it's stuck.
00:14:02.800 So, this is an exciting idea. Uh, but
00:14:05.279 there's a lot of details that go into
00:14:06.880 making it work effectively. And this is
00:14:08.560 one of the areas that's really in rapid
00:14:10.399 development right now. This is best for
00:14:12.880 things where you have a very specific
00:14:14.639 scope, a clear completion criteria.
00:14:18.320 Um, and remember that agents are best at
00:14:22.000 essentially generating the code that is
00:14:23.680 the average of the entire internet.
00:14:26.399 So this is best for simple problems with
00:14:29.120 clear solutions.
00:14:31.920 This mode is what Kent Beck refers to as
00:14:33.839 genie mode, because it'll do its best to
00:14:36.639 fulfill your wish, but it often
00:14:39.600 does so imperfectly until you learn to
00:14:42.720 express your wish in just the right way.
00:14:48.560 Now, even though we've gone through
00:14:50.480 these kind of in order, these are not
00:14:52.560 steps on a path. These are options that
00:14:55.600 you can pick for the task at hand.
00:14:58.240 So you might choose the pairing approach
00:14:59.839 for design explorations,
00:15:02.240 scaffolding and simple things,
00:15:03.839 double-checking your test or writing
00:15:05.839 tests or writing an implementation based
00:15:08.320 on a test. Uh pairing mode is also great
00:15:11.440 for learning.
00:15:13.760 Using a detailed plan is good for
00:15:15.600 larger, more complex tasks where the
00:15:17.600 human element is important, where you're
00:15:20.160 innovating solutions to unique or domain
00:15:22.480 specific problems, and where you need to
00:15:25.120 bring your knowledge to bear.
00:15:28.160 And the coding genie can be great for
00:15:30.079 automating relatively simple tasks,
00:15:32.320 small to medium complexity that follow
00:15:35.040 standard patterns and have standard
00:15:36.639 solutions.
00:15:40.160 And as always, we're at the very
00:15:42.240 beginning of this experiment. Uh, and I
00:15:44.720 hope that this will be an exciting way
00:15:46.800 to explore different
00:15:49.199 ways of working and to make programming
00:15:51.440 fun in new ways.
00:15:55.279 So, pick a tool, almost any tool. Build
00:15:58.240 your prompt spellbook, start with
00:16:00.240 pairing, get to know your agent,
00:16:03.040 automate the boring stuff, make your
00:16:04.800 life easier,
00:16:06.560 and experiment with workflows.
00:16:09.920 There's some resources here. Don't worry
00:16:11.440 about copying these down. We'll post
00:16:12.639 these slides in the Slack channel.
00:16:17.519 And that is my lightning talk on AI
00:16:20.240 coding.
00:16:28.959 All right, thanks everyone. Uh, my name
00:16:31.360 is Jake Gribschaw. I've been a Chime
00:16:33.680 engineer now for about six years and
00:16:36.160 interested in machine learning and AI
00:16:38.480 for around eight. Uh, today I'm going to
00:16:41.120 talk to you a little bit about some of
00:16:42.399 the valuable lessons I've learned when
00:16:44.079 coding up Beacon, which is Chime's AI
00:16:46.639 powered code review bot. Uh just before
00:16:49.040 we jump in though, I'd like to quickly
00:16:50.720 share a little bit of my philosophy
00:16:51.920 around uh integrating AI into our
00:16:54.000 software development life cycle. So I
00:16:57.040 believe that you know AI is truly
00:16:59.519 transformative for us. It's an enhancing
00:17:01.600 power and embracing AI tooling means
00:17:04.319 actively cultivating a relationship
00:17:06.000 between us and it, leveraging AI's
00:17:09.360 efficiency, knowledge, and scalability
00:17:12.079 without letting it supplant human
00:17:13.839 insight, creativity, and wisdom. When AI
00:17:16.880 fully automates our tasks, it
00:17:19.120 erodes our skills, dulling our lives
00:17:21.760 and ultimately reducing the quality of
00:17:23.360 our work. We don't merely want to press
00:17:26.000 run when prompted.
00:17:28.240 Uh instead we want to thoughtfully
00:17:30.240 integrate AI into our workflow using it
00:17:32.720 to amplify our capabilities and freeing
00:17:35.039 ourselves to tackle the complex
00:17:36.400 challenges that truly require human
00:17:37.919 knowledge and judgment, like when to
00:17:40.080 use includes versus joins in
00:17:43.280 Active Record, and that lovely N+1
00:17:45.360 that we talked about earlier.
00:17:48.799 So why did we build beacon in the first
00:17:51.440 place? Today, code is being written
00:17:53.840 faster than ever, meaning maintainers
00:17:55.760 are facing pressure to review, provide
00:17:58.320 feedback, and make decisions on incoming
00:18:00.240 requests. With Beacon, we saw an
00:18:02.480 opportunity to shift some of that burden
00:18:04.240 off of maintainers by prompting
00:18:06.400 developers to clearly articulate their
00:18:08.240 intentions and educate them on common
00:18:10.160 pitfalls.
00:18:12.400 Asking if they have, for instance,
00:18:14.720 considered different approaches,
00:18:19.440 developers can iterate
00:18:22.240 independently and present
00:18:24.320 more polished and robust products to
00:18:26.480 these maintainers. We like to think of
00:18:29.120 it as a less deterministic RuboCop. Its
00:18:31.840 suggestions are able to go beyond
00:18:33.280 finding just issues, but also look for
00:18:35.600 style and existing patterns in our codebase.
00:18:38.080 And like RuboCop, sometimes the
00:18:40.000 suggestions aren't appropriate for the
00:18:42.400 situation. It gives our developers
00:18:44.640 confidence that they're on the right
00:18:45.840 track though and informs them when
00:18:47.760 they're using outdated practices.
00:18:51.600 In the end, we're able to build a
00:18:53.280 Chime-ified version of our review bot
00:18:55.440 that our engineers felt outperformed
00:18:57.440 commercial applications. But I'm here to
00:19:00.799 talk mostly about the key lessons we
00:19:02.960 learned in creating an AI agent. So,
00:19:06.080 let's take a little step back,
00:19:08.000 right?
00:19:09.600 Uh I'd like to have you imagine for a
00:19:12.480 moment a pink elephant.
00:19:15.440 Now, hopefully you started picturing
00:19:17.679 some sort of a brightly colored
00:19:19.039 pachyderm. Uh but perhaps other ideas
00:19:21.520 started popping up. Things like
00:19:23.039 childhood memories, uh Dumbo from Disney
00:19:25.520 or a joke. Perhaps even something
00:19:27.840 completely unrelated, like Baldur's
00:19:30.320 Gate. Why does this happen?
00:19:33.679 Well, it's because our brains, much like
00:19:36.400 LLMs, encode information in a rich
00:19:38.799 multi-dimensional space, right? We can
00:19:41.200 think about a graph, or a cone, in
00:19:45.840 this dimensional space when we add in
00:19:48.080 prompting. So mentioning an idea
00:19:50.480 suddenly activates many related ideas
00:19:53.120 shaping our perceptions, thoughts and
00:19:55.840 ultimately our actions.
00:19:58.880 Similarly, we can guide an
00:20:01.600 LLM by creating these contextual
00:20:03.440 shortcuts. For instance, if you identify
00:20:06.160 that an LM was likely trained on a
00:20:09.039 certain common resource, you can
00:20:11.120 reference that concept in that resource
00:20:13.679 by including recognizable fingerprints
00:20:16.000 like specific URLs or key phrases in
00:20:18.799 your prompt.
00:20:21.280 By embedding these identifiers into your
00:20:23.120 prompts, you've effectively shaped the
00:20:25.760 way that the LLM, shall we say, thinks about it
00:20:29.120 subconsciously, right? Nudging it
00:20:31.600 towards desired outputs without
00:20:33.919 excessively increasing the prompt
00:20:36.080 complexity. This helps with a bit of
00:20:38.640 complication that we often run into,
00:20:41.840 which is that more context isn't always
00:20:44.640 better.
00:20:46.159 Early on in building Beacon, we assumed
00:20:48.480 providing lots of background information
00:20:50.720 would lead to richer insights, but we
00:20:52.960 quickly learned that wasn't really true.
00:20:55.600 In fact, huge prompt instructions or
00:20:57.600 overly broad context often dilutes the
00:21:00.559 clarity of a request, expanding the
00:21:03.120 potential outcomes into a confusing
00:21:05.120 array of possibilities.
00:21:08.159 The more precise your request, the more
00:21:10.159 precise and actionable the responses
00:21:11.679 will be. So, one of the more valuable
00:21:14.080 lessons we learned was to keep the scope
00:21:16.000 of each interaction, aka an LLM call,
00:21:19.760 uh, very narrow and clear.
00:21:22.400 Critical checks like security reviews,
00:21:24.799 database changes often require their own
00:21:27.760 dedicated request to produce uh
00:21:30.960 actionable feedback.
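The "narrow scope per LLM call" lesson can be sketched in Ruby. Everything below is a hypothetical illustration, not Beacon's actual code: `call_llm` is a stubbed stand-in for whatever LLM client you use, and the concern names and wording are invented. The point is that each concern gets its own small, dedicated request rather than one giant prompt.

```ruby
# Sketch of one focused LLM call per review concern. `call_llm` is a
# hypothetical stand-in for a real LLM client; Beacon's implementation
# is not shown in the talk.

REVIEW_CONCERNS = {
  security: "Review this diff ONLY for security issues (injection, leaked secrets, authorization).",
  database: "Review this diff ONLY for schema and query problems (missing indexes, N+1 queries).",
  style:    "Review this diff ONLY for deviations from our Ruby style guide."
}.freeze

# Stub: a real version would send `instruction` as the system prompt
# and `diff` as the user message to an LLM API.
def call_llm(instruction, diff)
  "(stubbed response to: #{instruction[0, 30]}...) for a #{diff.lines.count}-line diff"
end

# Each concern becomes its own small, focused request instead of one
# broad prompt that dilutes the clarity of the ask.
def review(diff)
  REVIEW_CONCERNS.transform_values { |instruction| call_llm(instruction, diff) }
end

feedback = review("def index\n  User.all.map(&:posts)\nend\n")
feedback.each { |concern, note| puts "#{concern}: #{note}" }
```

Splitting the review this way also makes it easy to version and iterate on each concern's prompt independently.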
00:21:33.840 By thinking multi-dimensionally and
00:21:36.000 keeping our context clean, we're able to
00:21:37.919 successfully steer the LLM and get
00:21:40.240 great results on things it's seen
00:21:42.080 before. But this kind of goes out the
00:21:45.440 window when we start dealing with newer
00:21:46.960 and more niche technologies.
00:21:49.280 Right, now, I know, I know: you said,
00:21:51.919 Jake, just before, that more context
00:21:54.799 doesn't get better results.
00:21:57.120 But did you see that little
00:21:58.640 asterisk right there? Yeah. Well, it
00:22:02.000 turns out, uh, we ran into a touch of
00:22:04.000 trouble with that. Uh, our
00:22:06.320 colleague Null was building a
00:22:08.080 service with Rails 8 earlier this
00:22:10.480 year, which had just been released. And
00:22:12.799 when the AI reviewed his code, it
00:22:14.960 repeatedly flagged valid changes as
00:22:16.799 incorrect because the LLM was trained on
00:22:19.919 Rails 7. Null was unsurprised that the
00:22:22.400 AI was lackluster and I was crushed.
00:22:25.760 But this isn't just a Rails issue. It's
00:22:27.760 a universal challenge with LLMs. When
00:22:30.400 dealing with cutting edge technology,
00:22:31.840 they just haven't been trained on it.
00:22:33.440 And that's where we have to help them
00:22:34.880 out and figure out what is happening. So
00:22:37.840 here we realized we need another tool in our AI
00:22:39.840 tool belt: everyone's favorite, RAG.
00:22:42.880 We specifically loaded up a RAG with
00:22:45.919 correct documentation, changelogs,
00:22:48.080 things like that, that an LLM agent can
00:22:50.240 search through in order to augment its
00:22:52.240 own knowledge base.
00:22:54.480 So, we're able to provide this
00:22:55.840 authoritative source to our LLMs on
00:22:58.960 new and more niche technologies,
00:23:00.880 increasing their knowledge of
00:23:02.080 state-of-the-art systems.
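As a rough sketch of the retrieval half of that setup, here's a keyword-overlap retriever that prepends the best-matching documentation chunks to the prompt. Real pipelines use embeddings; every name below is ours, not from any particular RAG library.

```ruby
# Score documentation chunks against the question by shared keywords,
# then prepend the best matches as authoritative context for the LLM.
def retrieve(question, chunks, k: 2)
  terms = question.downcase.scan(/\w+/)
  chunks.max_by(k) { |chunk| (chunk.downcase.scan(/\w+/) & terms).size }
end

def augmented_prompt(question, chunks)
  context = retrieve(question, chunks).join("\n---\n")
  "Answer using ONLY this documentation:\n#{context}\n\nQuestion: #{question}"
end
```

The point is the shape, not the scoring: the agent searches an authoritative corpus (Rails 8 release notes, changelogs) and the hits ride along in the prompt.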
00:23:05.840 So, time and time again when I've been
00:23:09.039 working in the AI space, I hear my
00:23:10.960 colleagues say things like, you know,
00:23:12.720 "Jake, I want to build a RAG and load
00:23:14.480 Slack into it," or "I need to fine-tune an
00:23:16.880 LLM agent to help me refactor all my
00:23:19.360 Rails controllers."
00:23:21.600 In our example, creating a RAG or
00:23:24.320 even fine-tuning a foundation model kind
00:23:27.200 of makes sense, where we need to
00:23:29.360 bring in our state-of-the-art
00:23:30.720 technology. But consider your own case.
00:23:34.559 Is it really worth the effort? Are you
00:23:37.679 working on something so cutting edge
00:23:39.840 that you can't use existing knowledge?
00:23:43.360 We've all been there. The hype
00:23:45.600 is real, and so are the capabilities.
00:23:48.080 But focusing first on a specific tool or
00:23:50.159 technique can easily lock you into an
00:23:52.559 overcomplicated solution. If your real
00:23:55.360 goal is simply to analyze how often you
00:23:57.520 fix someone's slow Rails performance
00:24:00.080 issues by swapping from joins to
00:24:02.960 includes, setting up an entire RAG
00:24:05.440 system just to ingest and analyze git
00:24:07.280 data is not only overkill but the wrong
00:24:10.159 approach. Simply using a
00:24:14.159 built-in search API will probably get
00:24:16.240 your job done, and faster.
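For that particular example, a plain-text pass over `git log -p` output gets the answer without any AI infrastructure at all. The parsing below is our own sketch and deliberately naive: it only catches the adjacent removed/added pair case.

```ruby
# Count spots where a diff removed a `.joins(` call and the very next
# line added a `.includes(` call, given the text of `git log -p`.
def count_joins_to_includes(log_text)
  log_text.each_line.each_cons(2).count do |removed, added|
    removed.start_with?("-") && removed.include?(".joins(") &&
      added.start_with?("+") && added.include?(".includes(")
  end
end
```

You might feed it the output of something like `git log -p -S'.joins('`; either way, a few lines of string matching beat standing up an ingestion pipeline for a one-off question.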
00:24:19.760 AI is great at finding and applying
00:24:21.200 patterns to problem spaces, condensing
00:24:23.679 information, and, again, augmenting you by
00:24:27.200 providing different perspectives, not
00:24:29.360 the final one.
00:24:31.600 So what does it mean when it
00:24:33.760 turns out that, hey, a foundation
00:24:36.880 model with some extra context and a good
00:24:39.760 prompt is all your project really needs?
00:24:42.320 Well, it's going to happen. It seems
00:24:45.919 like, you know, there's a new
00:24:46.799 breakthrough in AI every few weeks. You
00:24:49.760 start on a project, and then someone else
00:24:51.679 posts a link about something similar
00:24:53.600 launching.
00:24:55.120 You know, it's tempting to keep pivoting
00:24:56.559 away, thinking someone else has
00:24:58.799 gotten there first. But our OG Rubyists
00:25:02.000 know that when something is fun and easy
00:25:03.520 to work on, there's lots of innovation
00:25:05.279 around it. Your project's value doesn't
00:25:08.400 vanish just because another cool idea
00:25:11.520 has gotten there already.
00:25:14.240 You're learning.
00:25:16.320 You're driving us forward. Stick with
00:25:18.400 it. Explore. The space is very, very fun,
00:25:22.640 and as Justin mentioned, it's very close
00:25:25.279 to magic to work in. AI moves fast, yes,
00:25:29.120 but sustainable progress comes from
00:25:30.880 consistent effort, iteration, and your
00:25:33.360 dedication to a core idea. Finish what
00:25:36.480 you start, learn from the journey, and
00:25:38.480 only then adopt and integrate new
00:25:40.240 technologies as they truly benefit your
00:25:42.320 solution.
00:25:44.720 So when you do adopt a new technology,
00:25:48.640 try your best to write your own
00:25:50.000 interfaces for it. The world of AI, as
00:25:52.960 we've mentioned, is moving so fast and
00:25:54.799 has so many different tools with very,
00:25:57.520 very different options. It can be easy
00:26:00.559 to over-integrate these into your core
00:26:03.200 solution and then become dependent on
00:26:05.520 them.
00:26:07.120 That gives you FOMO when new
00:26:10.000 tools come out, increases your
00:26:11.679 maintenance costs, and
00:26:15.679 gets you pulled around with every
00:26:17.039 new release.
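One way to keep that seam thin is a small adapter layer: the app talks only to an interface of your own invention, and each vendor SDK hides behind an adapter, so swapping tools touches one class. The class names here are illustrative, not from any real library.

```ruby
# The app depends only on this interface; vendor SDKs live in adapters.
class ChatClient
  def initialize(adapter)
    @adapter = adapter
  end

  def complete(prompt)
    @adapter.complete(prompt)
  end
end

# A fake adapter for tests; a real one would wrap ruby-openai,
# a Bedrock client, or whatever tool you adopt next.
class EchoAdapter
  def complete(prompt)
    "stubbed reply to: #{prompt}"
  end
end
```

When the next hot tool ships, you write one new adapter instead of hunting SDK calls through your whole codebase.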
00:26:19.200 So finally, when we're testing with
00:26:22.880 LLMs, it can be really, really
00:26:25.120 challenging to do. Even a single
00:26:28.080 changed word in a prompt, "senior" to "staff,"
00:26:31.600 we'll say, can significantly impact the
00:26:34.480 results. On our own codebase, I
00:26:37.279 changed that exact word, and suddenly
00:26:40.720 the comments we started getting back
00:26:41.919 were a lot more grumpy.
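A sketch of what catching that kind of regression implies (the speakers describe the idea, not this code): persist each interaction, including the intermediate steps, so a new prompt can be re-run against old inputs and the outputs diffed. Storage is an in-memory array here; a real system would use a table or log store.

```ruby
require "time"

RECORDS = [] # swap for a database table or log store in real use

# Capture not just input and output, but the "thinking" stages between.
def record_interaction(prompt:, steps:, response:)
  RECORDS << { at: Time.now.utc.iso8601, prompt: prompt,
               steps: steps, response: response }
end

# Re-run every stored input through a new prompt builder so old and new
# results can be compared side by side.
def replay(records, &new_prompt_builder)
  records.map { |r| { was: r[:response], now: new_prompt_builder.call(r[:prompt]) } }
end
```

With this in place, swapping "senior" for "staff" becomes a replay run you can eyeball, not a surprise in production review comments.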
00:26:45.600 Clearly understand the effects of your
00:26:47.440 changes. Whether you're adding more
00:26:49.440 context, splitting the prompt into
00:26:50.880 multiple steps, or simply
00:26:52.880 switching out the underlying model, you
00:26:54.640 need the ability to replay and compare
00:26:56.480 past inputs to your new results. So it's
00:26:59.919 crucial to think carefully about data
00:27:01.600 storage strategies from the start and
00:27:03.600 decide not just how to store data but
00:27:06.000 what to store as well. Tools like Git
00:27:08.320 and Slack easily capture inputs and
00:27:10.240 outputs, but oftentimes miss the thinking
00:27:13.360 stages in between. Capturing these
00:27:16.000 invaluable steps
00:27:18.000 will provide insights into
00:27:19.520 diagnosing issues and evaluating changes
00:27:21.840 later on. So, wrapping up here, a
00:27:26.159 couple of core lessons I'd love for you to
00:27:27.520 take away. First, remember the AI thinks
00:27:30.080 multi-dimensionally:
00:27:31.840 gently guide it with familiar references
00:27:33.600 that it knows. Keep your prompts
00:27:36.240 precise; clarity and unambiguous
00:27:38.720 statements are critical. Bridge
00:27:41.279 knowledge gaps with tools like RAG.
00:27:43.679 Don't chase every new trend; focus on
00:27:46.080 steady, sustainable progress, even if
00:27:48.320 someone else is working on it too. Avoid
00:27:50.960 vendor lock-in, and keep as
00:27:52.880 much of your data as possible. By
00:27:54.960 applying these lessons to leverage AI to
00:27:57.520 enhance our workflows, we'll empower our
00:27:59.919 teams and deliver truly impactful
00:28:01.679 results. Thank you.