Summarized using AI

Chime Presents: Getting More Out of LLMs for Ruby Developers

Justin Wienckowski and Jake Gribschaw • July 09, 2025 • Philadelphia, PA • Talk

Introduction

This lightning talk, delivered at RailsConf 2025 by Justin Wienckowski and Jake Gribschaw from Chime, focuses on how Ruby developers can effectively leverage Large Language Model (LLM) tools to improve productivity, code quality, and developer satisfaction. The speakers share lessons from Chime's AI tool adoption journey, practical advice for getting started, and insights from building an AI code review bot.

Key Points

  • The Power of AI Coding Agents for Rails Developers:

    • AI coding agents offer powerful assistance by automating repetitive tasks, aiding in understanding codebases, and letting developers focus more on design, problem decomposition, and code review.
    • The Rails ecosystem is particularly well-suited for AI tool adoption due to its open-source culture, comprehensive tutorials, and consistent coding patterns.
  • Choosing and Setting Up AI Tools:

    • Developers should not overthink tool selection, as the ecosystem is rapidly evolving. Hands-on experimentation is recommended.
    • Chime's preferred tool is Cursor, a VS Code fork, which requires setup with Ruby/Rails-specific extensions and proper configuration, particularly for extensions like Ruby LSP.
    • Practical troubleshooting tips and configuration guidance are discussed to help developers get started smoothly.
  • Using Coding Agents Effectively:

    • Overview of Cursor’s agent capabilities, including modes for discussion and code editing, context management, slash commands, and integrating with project files.
    • Prompt engineering is critical: developers should iterate on prompts and build a “spellbook” of reusable prompts and permanent agent rules.
    • Different workflows exist on a spectrum from simple tab completion through pair programming with the agent, to fully autonomous code generation for well-defined tasks.
    • Choice of workflow should depend on task complexity and the need for human judgment versus automation.
  • Lessons from Building Chime’s AI Code Review Bot (Beacon):

    • The intent is to reduce reviewer workload by prompting developers to articulate clear intent and by surfacing stylistic and quality suggestions beyond traditional linters like RuboCop.
    • Beacon helps developers learn and iterate independently, producing more robust contributions.
    • Key insight: LLMs, like human brains, respond best to focused, precise prompts. Excessive or imprecise context can dilute the quality of the generated responses.
    • For cutting-edge or niche technologies (e.g., Rails 8), supplement LLMs with Retrieval Augmented Generation (RAG) by injecting up-to-date documentation.
    • Evaluate whether advanced approaches (like RAG or fine-tuning) are necessary, or if simpler methods will suffice based on your goals.
  • Sustainable AI Integration:

    • Avoid chasing every trending tool; focus on sustainable progress, thoughtful integration, and avoiding vendor lock-in by abstracting AI interfaces and capturing relevant data for reproducibility and analysis.
    • Iterative development and careful prompt/version tracking improve both technical outcomes and learning.

Conclusions and Takeaways

  • AI tools are transforming software development, especially for Ruby and Rails developers.
  • Success requires practical adoption strategies, effective prompt engineering, and matching workflows to the right level of agent autonomy.
  • Supplement LLMs' limitations with targeted context when facing novel technologies, but avoid unnecessary complexity.
  • Sustainable, iterative adoption, careful data management, and focus on enhancement over replacement maximize both productivity and learning.

Chime Presents: Getting More Out of LLMs for Ruby Developers
Justin Wienckowski and Jake Gribschaw • Philadelphia, PA • Talk

Date: July 09, 2025
Published: July 23, 2025

We're all hearing about LLM tools that are changing on a daily, if not hourly, basis. How can you navigate these tools as a developer to improve your code and your impact?

Justin Wienckowski will share lessons and advice from Chime's programs piloting tools for developer use, and Jake Gribschaw will talk about lessons learned building an AI code review bot.

RailsConf 2025

00:00:16.720 Nice to be here everyone. My name is
00:00:18.080 Justin Wienckowski. I'm a senior software
00:00:19.760 engineer at Chime. Uh and this is going
00:00:22.720 to be a turbo mode lightning talk. This
00:00:25.840 talk is a compressed version of the
00:00:28.560 training that we have started to offer
00:00:30.480 to our Rails engineers to learn how to
00:00:32.320 use AI coding tools. So by the end of
00:00:34.719 this talk, you'll know any everything
00:00:36.000 that you need to get started with AI
00:00:37.440 coding and take that knowledge back to
00:00:39.680 your teams, hopefully, and help
00:00:41.760 encourage adoption if this is a tool
00:00:43.760 that you think will help you. So why
00:00:46.399 are we doing this? Uh we
00:00:48.640 believe that AI coding agents are a new
00:00:50.239 tool that can make our lives better. Uh
00:00:52.559 at its core, programming is about
00:00:54.480 juggling a lot of detailed specific
00:00:56.480 knowledge.
00:00:58.000 AI assisted coding is like working with
00:01:00.079 someone who has memorized the internet.
00:01:03.199 This can be incredibly powerful, but of
00:01:06.000 course, like any tool, it has strengths
00:01:08.080 and weaknesses.
00:01:10.320 Rails development plays to the strengths
00:01:12.240 of AI. The Rails community has a great
00:01:15.040 culture of sharing. We have good quality
00:01:17.600 open- source tools and code. We have
00:01:20.080 lots of tutorials and documentation and
00:01:22.720 we have this strong culture around using
00:01:24.560 common patterns and all of these things
00:01:27.200 are things that AI can help us leverage.
00:01:30.240 AI is also great at automating the
00:01:32.400 boring stuff. This can actually make our
00:01:35.040 jobs more fun as programmers. So it's
00:01:37.680 not just about productivity. It can
00:01:39.759 actually be about our enjoyment and our
00:01:42.560 satisfaction and our growth in learning.
00:01:44.400 Also,
00:01:47.920 agents reduce the cost of writing code.
00:01:51.840 This is their real superpower, as I hope
00:01:54.159 you'll see.
00:01:56.640 Reducing the cost of writing code
00:01:58.159 changes how we work. It lets us focus
00:02:01.200 more time and effort on the things that
00:02:02.880 we can do better than computers like
00:02:06.719 decomposing tasks into small safe steps,
00:02:10.160 object-oriented design, innovating to
00:02:13.200 solve domain-specific problems,
00:02:16.000 evaluating the trade-offs in different
00:02:17.599 implementations
00:02:19.120 and reviewing code and improving code
00:02:21.200 quality.
00:02:24.160 So to get started with AI coding, the
00:02:26.080 first thing you need to do is to select
00:02:27.680 a tool. Don't spend too much time on
00:02:29.840 this. Don't overthink it. This is the
00:02:31.599 gold rush era of tools right now. There
00:02:33.760 is a ton of them. They're evolving
00:02:35.840 constantly. Keeping track of this stuff
00:02:38.080 is a full-time job by itself, and it's
00:02:40.800 probably not worth your time. Experience
00:02:44.000 is going to be better than research. A
00:02:46.160 day or two of experimenting with a tool
00:02:49.360 will teach you everything you need to
00:02:50.879 know. Now, tools are also improving
00:02:53.760 quickly. So even a tool that wasn't very
00:02:56.560 good a number of months ago might be
00:02:58.879 reasonable today. Don't be afraid to
00:03:01.040 experiment and don't be afraid to repeat
00:03:04.000 your evaluations in the future.
00:03:07.120 This is a list of the AI coding tools
00:03:09.040 that we currently use at Chime. Cursor
00:03:11.920 is our most widely used tool and so I'm
00:03:14.080 going to quickly show you how to get set
00:03:15.840 up and make Cursor a reasonable
00:03:18.000 environment for Rails development.
00:03:20.879 Let's get started.
00:03:23.040 First, you got to download Cursor. Sign
00:03:24.800 up for an account. Now, Cursor is a fork
00:03:27.360 of VS Code. Now, a lot of Rails
00:03:30.239 developers are rightly hesitant about
00:03:32.959 using a VS Code environment for Ruby
00:03:35.280 and Rails development because for a long
00:03:37.360 time it was really awful. Um, the
00:03:40.560 general sentiment at Chime is that this
00:03:42.640 is still not as good as Ruby- and Rails-
00:03:45.280 specific IDEs, but it's very
00:03:47.440 serviceable now. You can actually be
00:03:49.680 productive with it. So once you get
00:03:52.080 installed, first you have to install
00:03:53.599 extensions. You can open the user
00:03:56.159 interface. You click here. You click
00:03:58.799 here. And then you can search the
00:04:00.720 marketplace for extensions.
00:04:03.200 Our friends at Shopify have an excellent
00:04:05.200 Ruby extension which is actually an
00:04:06.799 extension pack that bundles a number of
00:04:09.360 extensions together along with some
00:04:10.959 custom configuration. It is, we believe,
00:04:13.920 the best way to get started and it's
00:04:15.519 super easy.
00:04:18.320 You also might want to install some of
00:04:19.840 these extensions. They can be helpful.
00:04:22.400 One note is that you do not need a
00:04:24.479 separate RuboCop extension. And in fact,
00:04:26.560 installing multiple of them can create
00:04:28.960 conflicts and odd behavior. The Ruby LSP
00:04:31.600 extension actually integrates with
00:04:33.680 RuboCop directly. So, it's usually not
00:04:35.520 necessary.
00:04:38.639 So, even though Cursor is much improved,
00:04:41.280 sometimes it's going to need some
00:04:42.320 troubleshooting. So if things aren't
00:04:44.400 working right, if things seem weird,
00:04:46.560 first click here and open the bottom
00:04:49.360 panel, click the output tab, and then
00:04:52.400 use the selector to look at different
00:04:54.479 log files. This is the first
00:04:57.360 opportunity for troubleshooting
00:04:58.960 and debugging. Now, some of these errors,
00:05:01.759 like this one that Ruby LSP
00:05:04.080 is showing, don't necessarily affect
00:05:06.400 the functionality, so you don't necessarily
00:05:08.400 have to fix them. But if things are weird
00:05:10.800 in the UI or things like code navigation
00:05:13.039 aren't working, this is your first stop.
00:05:16.800 Now remember that Ruby LSP is a separate
00:05:19.199 external process. That means that if
00:05:21.680 it encounters errors, it'll output
00:05:24.080 logs into a separate directory in your
00:05:25.919 project root. So also check here.
00:05:29.039 Now once all the extensions are working
00:05:30.560 and your environment is reasonably
00:05:32.320 stable, the next step is to customize
00:05:34.320 your settings.
VS Code, and Cursor by extension, have
00:05:39.039 multiple levels of settings. The easiest
00:05:40.880 way to access them is the searchable
00:05:42.800 command palette. You can either
00:05:44.880 open the JSON
00:05:46.960 settings file directly, which is great:
00:05:49.360 it has great autocomplete and it's really
00:05:50.800 easy to use. Or there's a GUI interface.
00:05:53.840 Also, one of the most common problems
00:05:56.639 that people run into with Cursor and
00:05:58.479 Ruby LSP is your version manager. So
00:06:01.840 Ruby LSP will attempt to autodetect your
00:06:04.080 version manager and use it
00:06:05.520 automatically. But if you have multiple
00:06:07.759 installed, it might not guess correctly.
00:06:10.400 So you can override it with this setting
00:06:11.919 here.
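As a hedged sketch, that override in Cursor's settings.json might look like the fragment below. The exact key and accepted identifiers depend on your Ruby LSP extension version, and "rbenv" here is just an example; check the extension's own documentation for your setup.

```json
{
  "rubyLsp.rubyVersionManager": {
    "identifier": "rbenv"
  }
}
```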
00:06:14.720 Another troubleshooting tip is to start
00:06:16.639 Cursor from the command line within your
00:06:18.800 Rails project. This helps ensure
00:06:21.199 that it loads the environment correctly
00:06:23.120 and eliminates a lot of problems.
00:06:26.080 Okay, so that's the Cursor-specific
00:06:28.000 part. Now let's talk about how to
00:06:30.479 use Cursor's agent. A lot of this
00:06:32.639 is specific to Cursor, but some of it
00:06:36.800 can be generalized.
00:06:38.560 So, to open Cursor's agent, you click
00:06:40.560 here to open the right-hand panel.
00:06:43.759 The agent has two main modes, ask and
00:06:46.319 agent. Ask is just for discussion,
00:06:49.440 investigation,
00:06:50.960 searching,
00:06:52.479 uh discussing ideas. You'll likely spend
00:06:55.360 most of your time in agent mode. Now,
00:06:57.520 agent mode is pretty eager about writing
00:06:59.599 and editing code, but any code change
00:07:02.319 that it makes will be highlighted in the
00:07:04.160 editor, and you have to manually accept
00:07:06.560 or reject each change.
00:07:09.039 The agent's pretty smart, and it can
00:07:10.880 also run tests. It can run external
00:07:12.639 tools, and it will by default prompt you
00:07:15.440 before doing anything like that. So,
00:07:17.680 it's a great way to get started.
00:07:20.720 Another trick is that there's some slash
00:07:22.400 commands available that allow you to
00:07:24.479 quickly get to common functions
00:07:27.360 in Cursor. Uh a lot of these are related
00:07:29.680 to context. So context is primarily
00:07:32.639 about uh directing the agent to the
00:07:35.360 source code that it should work on and
00:07:37.039 pay attention to. So the agent can read
00:07:40.319 all of the code in your project. But
00:07:43.039 this is how you get it to focus on the
00:07:44.880 task at hand.
00:07:47.599 You can add files and folders to the
00:07:49.440 agent's context through this menu here.
00:07:52.400 And you can also use the at symbol to
00:07:54.800 mention specific files in the prompt
00:07:56.800 itself. You can also drag files from
00:07:59.360 tabs or from the explorer view into the
00:08:01.520 prompt.
00:08:03.520 So that's a basic introduction to
00:08:05.199 Cursor. It's really pretty easy to use
00:08:08.000 and it's come a long way in the last few
00:08:09.919 years. So now let's talk more generally
00:08:13.599 about how you can actually use coding
00:08:15.199 agents to get work done.
00:08:18.240 This section applies to all coding
00:08:19.599 agents, not just Cursor. Different
00:08:21.520 agents have different strengths and
00:08:23.440 weaknesses, but we're finding that a lot
00:08:25.680 of their overall features and
00:08:26.879 intelligence is converging pretty
00:08:28.400 quickly. We're often seeing multiple
00:08:30.720 updates and releases from these various
00:08:32.880 different tools every week.
00:08:35.919 Like any new programming tool, coding
00:08:37.599 agents require us to build some new
00:08:39.279 skills. One of those skills is prompt
00:08:41.599 engineering.
00:08:43.360 Now, prompt writing is not a science.
00:08:46.000 It's more like magic. You have to create
00:08:48.800 the right setting, invoke the right
00:08:51.360 spirits, and find the right incantation.
00:08:54.640 And you're not going to get it right the
00:08:56.080 first time.
00:08:57.839 Uh, you really want to iterate and learn
00:09:00.240 as you go.
00:09:02.800 As you learn how to effectively prompt
00:09:04.560 the agent and get it to do what you
00:09:06.399 want, you're building a spellbook.
00:09:10.080 And this is not just in a general sense.
00:09:12.480 Just like most programming, there are going
00:09:14.240 to be a lot of tasks where
00:09:16.640 we do the same things over and over again,
00:09:19.120 and you're going to find yourself
00:09:20.399 repeating prompts. So you can actually
00:09:23.279 build up composable and reusable prompts
00:09:27.120 and so this spellbook is a valuable new
00:09:29.120 tool for us.
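One lightweight way to make prompts composable can be sketched in Ruby. The snippet names and wording below are purely illustrative, not anything Chime described; the point is just that reusable fragments plus a task-specific request compose into a full prompt.

```ruby
# Sketch of a prompt "spellbook": reusable snippets composed into a full
# prompt. Names and wording are illustrative assumptions, not a standard.

SPELLBOOK = {
  rails_context: "You are working in a Ruby on Rails application.",
  small_steps:   "Work in small, safe steps and explain each change before making it.",
  write_tests:   "Write or update RSpec tests for any behavior you change."
}.freeze

# Compose the chosen snippets with the task-specific request.
def cast(task, *spells)
  (spells.map { |name| SPELLBOOK.fetch(name) } + [task]).join("\n\n")
end

puts cast("Refactor UsersController#index to avoid the N+1 query.",
          :rails_context, :small_steps, :write_tests)
```

You could keep snippets like these in files checked into the repo, so the whole team shares and iterates on the same spellbook.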
00:09:31.600 Now most agents support some kind of
00:09:33.440 rules which are essentially permanent
00:09:36.080 prompts. Cursor's feature is called
00:09:38.560 Rules for AI. Um, and Claude Code
00:09:41.920 will automatically read
00:09:44.080 the CLAUDE.md file if it exists.
00:09:48.160 Uh, so these features allow you to give
00:09:50.560 your agent permanent instructions and
00:09:52.720 it's a great way to provide context and
00:09:54.720 information to provide your style guide
00:09:56.640 to the agent because remember that
00:09:58.399 agents are really good at understanding
00:10:00.959 plain English. So these don't
00:10:03.600 necessarily have to be specific
00:10:05.200 instructions; they can be more general
00:10:09.120 agent rules.
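As a purely illustrative example (not the rules Chime uses), a project rules file, whether Cursor rules or a CLAUDE.md, might contain general guidance like this:

```markdown
# Agent rules (illustrative example)

- This is a Ruby on Rails application; follow the existing patterns in app/services.
- Prefer `includes` over `joins` when associated records are used, to avoid N+1 queries.
- Run RuboCop on any file you change and fix any new offenses.
- Write or update RSpec tests for any behavior you change.
```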
00:10:14.320 So those are the basic tools to use an
00:10:17.680 agent. But how can you use them
00:10:19.120 effectively? So one thing that
00:10:21.519 everybody's thinking about these days is
00:10:22.959 coding workflows with agents. And this
00:10:25.200 is a huge area of experimentation.
00:10:27.920 But there are some ideas that you can
00:10:29.360 use as landmarks to help you find a path
00:10:31.360 and get started.
00:10:33.279 One way to think about coding workflows
00:10:35.120 is this spectrum of agent agency or
00:10:37.519 autonomy or independence. Uh how
00:10:40.640 independently is the agent acting? Um
00:10:43.360 our friends at ThoughtBot have a great
00:10:44.800 blog post that talks about this in terms of
00:10:46.800 human-led programming versus agent-led
00:10:49.440 programming.
00:10:51.600 The simplest form is tab completion. Um
00:10:54.399 sometimes people knock on this or
00:10:57.120 kind of skip over it, but I think it can
00:10:59.040 actually be really valuable and it can
00:11:00.399 be a perfectly fine place to start,
00:11:02.560 especially if you're writing
00:11:03.680 boilerplate code or glue code. This can
00:11:06.560 actually save you a lot of keystrokes.
00:11:10.079 Another option is pair programming.
00:11:12.560 Agents are probably going to be the most
00:11:14.640 convenient pair that most of us have an
00:11:17.120 opportunity to work with. It's a great
00:11:19.839 way to discuss design ideas, try out
00:11:22.399 different implementations,
00:11:24.240 and this is where you'll start to save a
00:11:25.920 lot of time by having the agent write
00:11:27.680 the code for you. The trade-off, of
00:11:30.240 course, is that you have to review their
00:11:31.519 work.
00:11:33.120 So, the pairing approach is great for
00:11:34.880 things like scaffolding and glue code,
00:11:37.519 repetitive and boring refactorings, or
00:11:40.079 trying different design options, trying
00:11:41.600 different implementation approaches.
00:11:45.440 So, when you're pairing, you're going to
00:11:47.279 be having this continuous conversation,
00:11:49.040 this back and forth with the agent.
00:11:51.920 What if you want the agent to work more
00:11:53.279 independently? So one option is to
00:11:56.000 automate what you would do. So take a
00:11:59.279 task, write a detailed plan, and let the
00:12:02.480 agent execute it. You're automating your
00:12:04.880 own workflow using the agent.
00:12:08.240 Now you can develop this workflow with
00:12:10.399 the agent's help. You don't have to do
00:12:11.760 it all on your own, but the important
00:12:13.200 part is that your judgment is what's
00:12:15.440 going into structuring this task.
00:12:18.880 So when you're doing this, the more uh
00:12:21.440 independence that you give the agent,
00:12:23.279 you do have to think about guard rails,
00:12:25.440 checkpoints. How do you make sure that
00:12:27.360 you are supervising the agents work? And
00:12:30.079 that's one thing that that prompt
00:12:31.519 spellbook will help you build up.
00:12:34.800 Agents are starting to incorporate
00:12:36.320 planning features. Um, but I still like
00:12:38.320 to write my plan down in files. Um, this
00:12:40.720 approach is uh adapted from uh one of
00:12:43.360 Harper Reed's blog posts.
00:12:45.600 uh you can develop a plan in a file uh
00:12:48.240 create a to-do list so that the agent
00:12:50.000 can track its own progress and this is
00:12:52.480 very useful because sometimes agents get
00:12:54.639 stuck in loops, or sometimes agents,
00:12:58.079 when they start to compact the context
00:12:59.839 window, begin to generate odd results and
00:13:02.720 you basically have to go
00:13:05.120 back to the Windows 95 era and reboot.
00:13:08.240 So these files give you some state so
00:13:10.880 that the agent can pick up where it left
00:13:12.560 off.
00:13:14.880 There's another benefit to this too,
00:13:16.320 which is that by storing your thought
00:13:18.880 process in these files, you can begin to
00:13:21.600 start to recognize the common pieces and
00:13:24.320 identify prompts that are reusable or
00:13:26.160 that could be generalized.
00:13:28.800 Uh some people using Claude Code find
00:13:30.639 that fewer files or even a single file
00:13:32.800 works better. So this is also an area to
00:13:34.959 experiment on.
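A plan file of the kind described above might look like the hypothetical example below; the exact format matters less than giving the agent durable state and a checklist it can update as it works.

```markdown
# Plan: extract notification logic into a service object

- [x] Identify all callers of `User#notify!`
- [x] Write characterization specs covering current behavior
- [ ] Create `NotificationService` with the same public interface
- [ ] Migrate callers one at a time, running the specs after each change
- [ ] Remove the old method and update documentation
```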
00:13:37.200 A fourth approach gives the agent
00:13:38.800 maximum independence. Uh, and you can
00:13:41.440 even have agents work on multiple tasks
00:13:43.600 in the background. This is where
00:13:46.000 you treat the agent like a separate
00:13:47.360 programmer where you just assign it a
00:13:48.880 task and it goes off and it works and it
00:13:52.320 comes back with maybe a completed branch
00:13:54.160 or even a pull request.
00:13:56.480 And it only is going to bug you if it
00:13:58.320 thinks it's stuck or if you notice that
00:14:00.399 it's stuck.
00:14:02.800 So, this is an exciting idea. Uh, but
00:14:05.279 there's a lot of details that go into
00:14:06.880 making it work effectively. And this is
00:14:08.560 one of the areas that's really in rapid
00:14:10.399 development right now. This is best for
00:14:12.880 things where you have a very specific
00:14:14.639 scope, a clear completion criteria.
00:14:18.320 Um, and remember that agents are best at
00:14:22.000 essentially generating the code that is
00:14:23.680 the average of the entire internet.
00:14:26.399 So this is best for simple problems with
00:14:29.120 clear solutions.
00:14:31.920 This mode is what Kent Beck refers to as
00:14:33.839 genie mode, because it'll do its best to
00:14:36.639 fulfill your wish, but it often
00:14:39.600 does so imperfectly until you learn to
00:14:42.720 express your wish in just the right way.
00:14:48.560 Now, even though we've gone through
00:14:50.480 these kind of in order, these are not
00:14:52.560 steps on a path. These are options that
00:14:55.600 you can pick for the task at hand.
00:14:58.240 So you might choose the pairing approach
00:14:59.839 for design explorations,
00:15:02.240 scaffolding and simple things,
00:15:03.839 double-checking your test or writing
00:15:05.839 tests or writing an implementation based
00:15:08.320 on a test. Uh pairing mode is also great
00:15:11.440 for learning.
00:15:13.760 Using a detailed plan is good for
00:15:15.600 larger, more complex tasks where the
00:15:17.600 human element is important, where you're
00:15:20.160 innovating solutions to unique or domain
00:15:22.480 specific problems, and where you need to
00:15:25.120 bring your knowledge to bear.
00:15:28.160 And the coding genie can be great for
00:15:30.079 automating relatively simple tasks,
00:15:32.320 small to medium complexity that follow
00:15:35.040 standard patterns and have standard
00:15:36.639 solutions.
00:15:40.160 And as always, we're at the very
00:15:42.240 beginning of this experiment. Uh, and I
00:15:44.720 hope that this will be an exciting way
00:15:46.800 to explore different
00:15:49.199 ways of working and to make programming
00:15:51.440 fun in new ways.
00:15:55.279 So, pick a tool, almost any tool. Build
00:15:58.240 your prompt spellbook, start with
00:16:00.240 pairing, get to know your agent,
00:16:03.040 automate the boring stuff, make your
00:16:04.800 life easier,
00:16:06.560 and experiment with workflows.
00:16:09.920 There's some resources here. Don't worry
00:16:11.440 about copying these down. We'll post
00:16:12.639 these slides in the Slack channel.
00:16:17.519 And that is my lightning talk on AI
00:16:20.240 coding.
00:16:28.959 All right, thanks everyone. Uh, my name
00:16:31.360 is Jake Gribschaw. I've been a Chime
00:16:33.680 engineer now for about six years and
00:16:36.160 interested in machine learning and AI
00:16:38.480 for around eight. Uh, today I'm going to
00:16:41.120 talk to you a little bit about some of
00:16:42.399 the valuable lessons I've learned when
00:16:44.079 coding up Beacon, which is Chime's AI
00:16:46.639 powered code review bot. Uh just before
00:16:49.040 we jump in though, I'd like to quickly
00:16:50.720 share a little bit of my philosophy
00:16:51.920 around uh integrating AI into our
00:16:54.000 software development life cycle. So I
00:16:57.040 believe that you know AI is truly
00:16:59.519 transformative for us. It's an enhancing
00:17:01.600 power and embracing AI tooling means
00:17:04.319 actively cultivating a relationship
00:17:06.000 between us and it, leveraging AI's
00:17:09.360 efficiency, knowledge, and scalability
00:17:12.079 without letting it supplant human
00:17:13.839 insight, creativity, and wisdom. When AI
00:17:16.880 fully automates our tasks, it
00:17:19.120 erodes our skills, dulling our lives
00:17:21.760 and ultimately reducing the quality of
00:17:23.360 our work. We don't merely want to press
00:17:26.000 run when prompted.
00:17:28.240 Uh instead we want to thoughtfully
00:17:30.240 integrate AI into our workflow using it
00:17:32.720 to amplify our capabilities and freeing
00:17:35.039 ourselves to tackle the complex
00:17:36.400 challenges that truly require human
00:17:37.919 knowledge and judgment, like when to
00:17:40.080 use includes versus joins in
00:17:43.280 Active Record, and that lovely N+1
00:17:45.360 that we talked about earlier.
00:17:48.799 So why did we build beacon in the first
00:17:51.440 place? Today, code is being written
00:17:53.840 faster than ever, meaning maintainers
00:17:55.760 are facing pressure to review, provide
00:17:58.320 feedback, and make decisions on incoming
00:18:00.240 requests. With Beacon, we saw an
00:18:02.480 opportunity to shift some of that burden
00:18:04.240 off of maintainers by prompting
00:18:06.400 developers to clearly articulate their
00:18:08.240 intentions and educate them on common
00:18:10.160 pitfalls.
00:18:12.400 Asking if they have, for instance,
00:18:14.720 considered different approaches,
00:18:19.440 developers can iterate
00:18:22.240 independently and present
00:18:24.320 more polished and robust products to
00:18:26.480 these maintainers. We like to think of
00:18:29.120 it as a less deterministic RuboCop. Its
00:18:31.840 suggestions are able to go beyond
00:18:33.280 finding just issues, but also look for
00:18:35.600 style and existing patterns in our codebase.
00:18:38.080 And like RuboCop, sometimes the
00:18:40.000 suggestions aren't appropriate for the
00:18:42.400 situation. It gives our developers
00:18:44.640 confidence that they're on the right
00:18:45.840 track though and informs them when
00:18:47.760 they're using outdated practices.
00:18:51.600 In the end, we're able to build a
00:18:53.280 Chime-ified version of our review bot
00:18:55.440 that our engineers felt outperformed
00:18:57.440 commercial applications. But I'm here to
00:19:00.799 talk mostly about the key lessons we
00:19:02.960 learned in creating an AI agent. So,
00:19:06.080 let's take a little step back,
00:19:08.000 right?
00:19:09.600 Uh I'd like to have you imagine for a
00:19:12.480 moment a pink elephant.
00:19:15.440 Now, hopefully you started picturing
00:19:17.679 some sort of a brightly colored
00:19:19.039 pachyderm. Uh but perhaps other ideas
00:19:21.520 started popping up. Things like
00:19:23.039 childhood memories, uh Dumbo from Disney
00:19:25.520 or a joke. Perhaps even something
00:19:27.840 completely unrelated, like Baldur's
00:19:30.320 Gate. Why does this happen?
00:19:33.679 Well, it's because our brains, much like
00:19:36.400 LLMs, encode information in a rich
00:19:38.799 multi-dimensional space, right? We can
00:19:41.200 think about a graph, or a cone, in
00:19:45.840 this dimensional space when we add in
00:19:48.080 prompting. So mentioning an idea
00:19:50.480 suddenly activates many related ideas
00:19:53.120 shaping our perceptions, thoughts and
00:19:55.840 ultimately our actions.
00:19:58.880 Similarly, we can guide an
00:20:01.600 LLM by creating these contextual
00:20:03.440 shortcuts. For instance, if you identify
00:20:06.160 that an LM was likely trained on a
00:20:09.039 certain common resource, you can
00:20:11.120 reference that concept in that resource
00:20:13.679 by including recognizable fingerprints
00:20:16.000 like specific URLs or key phrases in
00:20:18.799 your prompt.
00:20:21.280 By embedding these identifiers into your
00:20:23.120 prompts, you've effectively shaped the
00:20:25.760 way that the LLM, shall we say, thinks about it
00:20:29.120 subconsciously, right? Nudging it
00:20:31.600 towards desired outputs without
00:20:33.919 excessively increasing the prompt
00:20:36.080 complexity. This helps with a bit of
00:20:38.640 complication that we often run into,
00:20:41.840 which is that more context isn't always
00:20:44.640 better.
00:20:46.159 Early on in building Beacon, we assumed
00:20:48.480 providing lots of background information
00:20:50.720 would lead to richer insights, but we
00:20:52.960 quickly learned that wasn't really true.
00:20:55.600 In fact, huge prompt instructions or
00:20:57.600 overly broad context often dilutes the
00:21:00.559 clarity of a request, expanding the
00:21:03.120 potential outcomes into a confusing
00:21:05.120 array of possibilities.
00:21:08.159 The more precise your request, the more
00:21:10.159 precise and actionable the responses
00:21:11.679 will be. So, one of the more valuable
00:21:14.080 lessons we learned was to keep the scope
00:21:16.000 of each interaction, aka an LLM call,
00:21:19.760 uh, very narrow and clear.
00:21:22.400 Critical checks like security reviews,
00:21:24.799 database changes often require their own
00:21:27.760 dedicated request to produce uh
00:21:30.960 actionable feedback.
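The "narrow scope per LLM call" lesson can be sketched in Ruby. Everything below is a hypothetical illustration, not Beacon's actual code: `call_llm` is a stubbed stand-in for whatever LLM client you use, and the concern names and wording are invented. The point is that each concern gets its own small, dedicated request rather than one giant prompt.

```ruby
# Sketch of one focused LLM call per review concern. `call_llm` is a
# hypothetical stand-in for a real LLM client; Beacon's implementation
# is not shown in the talk.

REVIEW_CONCERNS = {
  security: "Review this diff ONLY for security issues (injection, leaked secrets, authorization).",
  database: "Review this diff ONLY for schema and query problems (missing indexes, N+1 queries).",
  style:    "Review this diff ONLY for deviations from our Ruby style guide."
}.freeze

# Stub: a real version would send `instruction` as the system prompt
# and `diff` as the user message to an LLM API.
def call_llm(instruction, diff)
  "(stubbed response to: #{instruction[0, 30]}...) for a #{diff.lines.count}-line diff"
end

# Each concern becomes its own small, focused request instead of one
# broad prompt that dilutes the clarity of the ask.
def review(diff)
  REVIEW_CONCERNS.transform_values { |instruction| call_llm(instruction, diff) }
end

feedback = review("def index\n  User.all.map(&:posts)\nend\n")
feedback.each { |concern, note| puts "#{concern}: #{note}" }
```

Splitting the review this way also makes it easy to version and iterate on each concern's prompt independently.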
00:21:33.840 By thinking multi-dimensionally and
00:21:36.000 keeping our context clean, we're able to
00:21:37.919 successfully steer the LLM and get
00:21:40.240 great results on things it's seen
00:21:42.080 before. But this kind of goes out the
00:21:45.440 window when we start dealing with newer
00:21:46.960 and more niche technologies.
00:21:49.280 Right, now, I know, I know: you said,
00:21:51.919 Jake, just before, that more context
00:21:54.799 doesn't get better results.
00:21:57.120 But did you see that little
00:21:58.640 asterisk right there? Yeah. Well, it
00:22:02.000 turns out, uh, we ran into a touch of
00:22:04.000 trouble with that. Uh, our
00:22:06.320 colleague Null was building a
00:22:08.080 service with Rails 8 earlier this
00:22:10.480 year, which had just been released. And
00:22:12.799 when the AI reviewed his code, it
00:22:14.960 repeatedly flagged valid changes as
00:22:16.799 incorrect because the LLM was trained on
00:22:19.919 Rails 7. Null was unsurprised that the
00:22:22.400 AI was lackluster and I was crushed.
00:22:25.760 But this isn't just a Rails issue. It's
00:22:27.760 a universal challenge with LLMs. When
00:22:30.400 dealing with cutting edge technology,
00:22:31.840 they just haven't been trained on it.
00:22:33.440 And that's where we have to help them
00:22:34.880 out and figure out what is happening. So
00:22:37.840 here we realized we need another tool in our AI
00:22:39.840 tool belt: everyone's favorite, RAG.
00:22:42.880 We specifically loaded up a RAG with
00:22:45.919 correct documentation, changelogs,
00:22:48.080 things like that, that an LLM agent can
00:22:50.240 search through in order to augment its
00:22:52.240 own knowledge base.
00:22:54.480 So, we're able to provide this
00:22:55.840 authoritative source to our LLMs on
00:22:58.960 new and more niche technologies,
00:23:00.880 increasing their knowledge of
00:23:02.080 state-of-the-art systems.
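As a rough sketch of the retrieval half of that setup, here's a keyword-overlap retriever that prepends the best-matching documentation chunks to the prompt. Real pipelines use embeddings; every name below is ours, not from any particular RAG library.

```ruby
# Score documentation chunks against the question by shared keywords,
# then prepend the best matches as authoritative context for the LLM.
def retrieve(question, chunks, k: 2)
  terms = question.downcase.scan(/\w+/)
  chunks.max_by(k) { |chunk| (chunk.downcase.scan(/\w+/) & terms).size }
end

def augmented_prompt(question, chunks)
  context = retrieve(question, chunks).join("\n---\n")
  "Answer using ONLY this documentation:\n#{context}\n\nQuestion: #{question}"
end
```

The point is the shape, not the scoring: the agent searches an authoritative corpus (Rails 8 release notes, changelogs) and the hits ride along in the prompt.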
00:23:05.840 So, time and time again when I've been
00:23:09.039 working in the AI space, I hear my
00:23:10.960 colleagues say things like, you know,
00:23:12.720 "Jake, I want to build a RAG and load
00:23:14.480 Slack into it," or "I need to fine-tune an
00:23:16.880 LLM agent to help me refactor all my
00:23:19.360 Rails controllers."
00:23:21.600 In our example, creating a RAG or
00:23:24.320 even fine-tuning a foundation model kind
00:23:27.200 of makes sense, where we need to
00:23:29.360 bring in our state-of-the-art
00:23:30.720 technology. But consider your own case.
00:23:34.559 Is it really worth the effort? Are you
00:23:37.679 working on something so cutting edge
00:23:39.840 that you can't use existing knowledge?
00:23:43.360 We've all been there. The hype
00:23:45.600 is real, and so are the capabilities.
00:23:48.080 But focusing first on a specific tool or
00:23:50.159 technique can easily lock you into an
00:23:52.559 overcomplicated solution. If your real
00:23:55.360 goal is simply to analyze how often you
00:23:57.520 fix someone's slow Rails performance
00:24:00.080 issues by swapping from joins to
00:24:02.960 includes, setting up an entire RAG
00:24:05.440 system just to ingest and analyze git
00:24:07.280 data is not only overkill but the wrong
00:24:10.159 approach. Simply using a
00:24:14.159 built-in search API will probably get
00:24:16.240 your job done, and faster.
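For that particular example, a plain-text pass over `git log -p` output gets the answer without any AI infrastructure at all. The parsing below is our own sketch and deliberately naive: it only catches the adjacent removed/added pair case.

```ruby
# Count spots where a diff removed a `.joins(` call and the very next
# line added a `.includes(` call, given the text of `git log -p`.
def count_joins_to_includes(log_text)
  log_text.each_line.each_cons(2).count do |removed, added|
    removed.start_with?("-") && removed.include?(".joins(") &&
      added.start_with?("+") && added.include?(".includes(")
  end
end
```

You might feed it the output of something like `git log -p -S'.joins('`; either way, a few lines of string matching beat standing up an ingestion pipeline for a one-off question.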
00:24:19.760 AI is great at finding and applying
00:24:21.200 patterns to problem spaces, condensing
00:24:23.679 information, and, again, augmenting you by
00:24:27.200 providing different perspectives, not
00:24:29.360 the final one.
00:24:31.600 So what does it mean when it
00:24:33.760 turns out that, hey, a foundation
00:24:36.880 model with some extra context and a good
00:24:39.760 prompt is all your project really needs?
00:24:42.320 Well, it's going to happen. It seems
00:24:45.919 like, you know, there's a new
00:24:46.799 breakthrough in AI every few weeks. You
00:24:49.760 start on a project, and then someone else
00:24:51.679 posts a link about something similar
00:24:53.600 launching.
00:24:55.120 You know, it's tempting to keep pivoting
00:24:56.559 away, thinking someone else has
00:24:58.799 gotten there first. But our OG Rubyists
00:25:02.000 know that when something is fun and easy
00:25:03.520 to work on, there's lots of innovation
00:25:05.279 around it. Your project's value doesn't
00:25:08.400 vanish just because another cool idea
00:25:11.520 has gotten there already.
00:25:14.240 You're learning.
00:25:16.320 You're driving us forward. Stick with
00:25:18.400 it. Explore. The space is very, very fun,
00:25:22.640 and as Justin mentioned, it's very close
00:25:25.279 to magic to work in. AI moves fast, yes,
00:25:29.120 but sustainable progress comes from
00:25:30.880 consistent effort, iteration, and your
00:25:33.360 dedication to a core idea. Finish what
00:25:36.480 you start, learn from the journey, and
00:25:38.480 only then adopt and integrate new
00:25:40.240 technologies as they truly benefit your
00:25:42.320 solution.
00:25:44.720 So when you do adopt a new technology,
00:25:48.640 try your best to write your own
00:25:50.000 interfaces for it. The world of AI, as
00:25:52.960 we've mentioned, is moving so fast and
00:25:54.799 has so many different tools with very,
00:25:57.520 very different options. It can be easy
00:26:00.559 to over-integrate these into your core
00:26:03.200 solution and then become dependent on
00:26:05.520 them.
00:26:07.120 That gives you FOMO when new
00:26:10.000 tools come out, increases your
00:26:11.679 maintenance costs, and
00:26:15.679 gets you pulled around with every
00:26:17.039 new release.
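One way to keep that seam thin is a small adapter layer: the app talks only to an interface of your own invention, and each vendor SDK hides behind an adapter, so swapping tools touches one class. The class names here are illustrative, not from any real library.

```ruby
# The app depends only on this interface; vendor SDKs live in adapters.
class ChatClient
  def initialize(adapter)
    @adapter = adapter
  end

  def complete(prompt)
    @adapter.complete(prompt)
  end
end

# A fake adapter for tests; a real one would wrap ruby-openai,
# a Bedrock client, or whatever tool you adopt next.
class EchoAdapter
  def complete(prompt)
    "stubbed reply to: #{prompt}"
  end
end
```

When the next hot tool ships, you write one new adapter instead of hunting SDK calls through your whole codebase.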
00:26:19.200 So finally, when we're testing with
00:26:22.880 LLMs, it can be really, really
00:26:25.120 challenging to do. Even a single
00:26:28.080 changed word in a prompt, "senior" to "staff,"
00:26:31.600 we'll say, can significantly impact the
00:26:34.480 results. On our own codebase, I
00:26:37.279 changed that exact word, and suddenly
00:26:40.720 the comments we started getting back
00:26:41.919 were a lot more grumpy.
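A sketch of what catching that kind of regression implies (the speakers describe the idea, not this code): persist each interaction, including the intermediate steps, so a new prompt can be re-run against old inputs and the outputs diffed. Storage is an in-memory array here; a real system would use a table or log store.

```ruby
require "time"

RECORDS = [] # swap for a database table or log store in real use

# Capture not just input and output, but the "thinking" stages between.
def record_interaction(prompt:, steps:, response:)
  RECORDS << { at: Time.now.utc.iso8601, prompt: prompt,
               steps: steps, response: response }
end

# Re-run every stored input through a new prompt builder so old and new
# results can be compared side by side.
def replay(records, &new_prompt_builder)
  records.map { |r| { was: r[:response], now: new_prompt_builder.call(r[:prompt]) } }
end
```

With this in place, swapping "senior" for "staff" becomes a replay run you can eyeball, not a surprise in production review comments.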
00:26:45.600 Clearly understand the effects of your
00:26:47.440 changes. Whether you're adding more
00:26:49.440 context, splitting the prompt into
00:26:50.880 multiple steps, or simply
00:26:52.880 switching out the underlying model, you
00:26:54.640 need the ability to replay and compare
00:26:56.480 past inputs to your new results. So it's
00:26:59.919 crucial to think carefully about data
00:27:01.600 storage strategies from the start and
00:27:03.600 decide not just how to store data but
00:27:06.000 what to store as well. Tools like Git
00:27:08.320 and Slack easily capture inputs and
00:27:10.240 outputs, but oftentimes miss the thinking
00:27:13.360 stages in between. Capturing these
00:27:16.000 invaluable steps
00:27:18.000 will provide insights into
00:27:19.520 diagnosing issues and evaluating changes
00:27:21.840 later on. So, wrapping up here, a
00:27:26.159 couple of core lessons I'd love for you to
00:27:27.520 take away. First, remember the AI thinks
00:27:30.080 multi-dimensionally:
00:27:31.840 gently guide it with familiar references
00:27:33.600 that it knows. Keep your prompts
00:27:36.240 precise; clarity and unambiguous
00:27:38.720 statements are critical. Bridge
00:27:41.279 knowledge gaps with tools like RAG.
00:27:43.679 Don't chase every new trend; focus on
00:27:46.080 steady, sustainable progress, even if
00:27:48.320 someone else is working on it too. Avoid
00:27:50.960 vendor lock-in, and keep as
00:27:52.880 much of your data as possible. By
00:27:54.960 applying these lessons to leverage AI to
00:27:57.520 enhance our workflows, we'll empower our
00:27:59.919 teams and deliver truly impactful
00:28:01.679 results. Thank you.