Summarized using AI

The Rise of the Agent: Rails in the AI Era

Kinsey Durham Grace • September 04, 2025 • Amsterdam, Netherlands • Talk

Introduction

"The Rise of the Agent: Rails in the AI Era" is a talk by Kinsey Durham Grace presented at Rails World 2025. The session explores the growing significance of intelligent agents in software development, with a particular focus on integrating these agentic systems within the Ruby on Rails ecosystem.

Main Theme

The main topic is the transition from basic AI prompting toward more autonomous, agent-based systems in software, highlighting how these agents can be built, integrated, and utilized effectively in production—especially using Rails.

Key Points

  • Defining Agentic AI and Agents:

    • Agentic AI refers to AI systems with the ability to act autonomously, setting and executing goals without continuous human prompting.
    • Agents differ from traditional workflows; they adapt, make decisions, and chain tasks together, providing more flexibility than rigid, rule-based workflows.
  • Workflows vs. Agents:

    • Workflows excel at regimented, predefined tasks but lack adaptability.
    • Agents are model-based, capable of handling open-ended, unpredictable problems through learning and adaptation.
  • Types and Use Cases for Agents:

    • Various agent types: learning agents, utility-based, model-based, goal-based, and simple reflex agents.
    • Real-world applications include customer support, security threat detection, travel concierges, and especially coding agents, such as those used at GitHub.
  • Building Agents in Rails:

    • Essential building blocks discussed:
    • Tooling Layer: Recommends MCP (Model Context Protocol) for secure interaction with external tools, illustrated with an example integrating Zendesk via MCP.
    • Memory: Separating short-term (state store) and long-term memory (timeline of agent thoughts), aiding context-awareness and resumability.
    • Agent Orchestration: Planning, executing, and supervising agent actions, implementing guardrails, and handling business logic, retries, and observability.
    • Planning: Techniques like subgoal decomposition, reflection, self-critique, and chain-of-thought are crucial for agent decision-making.
  • Best Practices for Agent Development:

    • Modular and maintainable design for rapid iteration.
    • Implementation of gates (guardrails) and validation, often with tools like dry-schema.
    • Keep prompt changes minimal to avoid unpredictable agent behavior.
    • Extensive testing using evaluations and test harnesses to maintain reliability.
    • Observability and explainability—tracking every action/decision for accountability and debugging.
  • Future Trends:

    • Agents are expected to become more human-like, with improved memory, personality, and the capacity to form user relationships.
    • Emergence of sub-agents (hierarchical responsibility) and workflow-native orchestration, blending traditional workflow tools with AI-driven steps.
    • Emphasis on using agents for augmentation rather than replacement of human developers.
  • Ethical Considerations:

    • Need for transparency, observability, and regular audits.
    • Addressing bias in LLMs and promoting ethical usage.
    • Implementation of security and retention policies, plus thorough dogfooding.

Takeaways

  • Intelligent agents in Rails represent a paradigm shift from procedural, prompt-based AI toward goal-oriented, context-aware systems.
  • Successful agent development requires robust architecture, best practices for reliability, and strong ethical safeguards.
  • Rails and Ruby ecosystems need to evolve and collaborate to remain relevant in the age of AI agents.
  • Agents should augment, not replace, human capability—enhancing productivity and allowing more focus on high-value activities.

The Rise of the Agent: Rails in the AI Era
Kinsey Durham Grace • Amsterdam, Netherlands • Talk

Date: September 04, 2025
Published: Mon, 15 Sep 2025 00:00:00 +0000
Announced: Tue, 20 May 2025 00:00:00 +0000

Prompting is a start, but agents are the future. In this talk, we'll explore how intelligent agents are changing how we build software — and how we can bring them to life in the Rails ecosystem. From early CLI bots to multi-tool agents powered by LLMs, we'll trace the evolution, show what's working in production, and give you practical patterns for building agents with a Rails mindset.

Rails World 2025

00:00:07.359 Hi everyone.
00:00:10.080 All right, welcome to the rise of the
00:00:12.160 agent. We're going to be talking about
00:00:14.080 agents today, specifically in Rails. Uh
00:00:17.359 I'm Kinsey. It's nice to meet you all.
00:00:19.600 I'm from Denver, Colorado. I am on the
00:00:22.480 coding agent core team at GitHub. I'm
00:00:25.359 also the VP of the board of directors at
00:00:27.920 Ruby Central and I'm also a mom. Uh my
00:00:31.199 two kids are here today, watching me
00:00:33.440 talk for the first time. So, if you
00:00:35.440 see them, say hi.
00:00:42.480 Thank you. Yeah, a little bit more about
00:00:44.640 Ruby Central. We're behind Ruby Gems and
00:00:47.039 many open source initiatives. Marty just
00:00:49.280 gave a lightning talk. He's our head of
00:00:50.879 open source about CRA, uh which was
00:00:53.680 great. And we also put on RubyConf.
00:00:56.239 We're about to announce where that is in
00:00:57.840 the US and when. We would love to see
00:00:59.440 you there. And also behind Exo Ruby that
00:01:03.280 Jim is doing, small local conferences
00:01:05.680 across the US. So be sure to check those
00:01:07.520 out too if you're in the US.
00:01:10.479 So today we're going to talk about what
00:01:12.320 agents are, how to build them, best
00:01:14.960 practices when building them, the future
00:01:16.960 of agents, and ethical considerations
00:01:19.600 when building agents as well. So when I
00:01:23.040 first thought about AI and co-pilot at
00:01:25.840 GitHub, I was on a different team. I was
00:01:27.360 on the deploys team at GitHub. I thought
00:01:29.200 I was going to be tuning LLMs. I wasn't
00:01:31.680 really sure what I was getting myself
00:01:33.360 into. Uh and when I joined the agent
00:01:36.079 team, it was a lot different than what I
00:01:37.759 was expecting. I was not tuning LLMs.
00:01:40.240 That's someone else. But Agentic AI is
00:01:43.360 really the new frontier. It's where
00:01:45.040 tooling and AI companies are headed. But
00:01:47.920 what are agents? What is Agentic AI?
00:01:50.320 What are workflows? Before we go any
00:01:52.479 further, we're going to do a little bit
00:01:55.439 of a vocab lesson so we're all on the
00:01:58.640 same page.
00:02:00.320 Agent AI at a high level is Oops, sorry,
00:02:03.759 I went too far. There we go. Uh, Agentic
00:02:07.119 AI, this is high level. The notion that
00:02:09.119 AI has agency, right? It's able to act
00:02:11.280 without being prompted. It takes steps
00:02:13.440 to achieve a goal, and it acts in
00:02:15.760 a more humanlike way. Agents
00:02:18.560 specifically are the software systems
00:02:20.319 behind this. They can act autonomously.
00:02:22.720 They can make decisions. They can
00:02:24.480 understand context. And they can chain
00:02:26.879 tasks without being hardcoded to do so.
00:02:29.360 So a modern agent typically has a goal
00:02:32.160 or objective, tools, a planner or
00:02:34.959 orchestrator, and long-term and short-term
00:02:37.200 memory. But we'll get into those
00:02:38.480 building blocks more once we chat about
00:02:40.160 how to build these agents.
00:02:42.800 So what are workflows? We hear this term
00:02:44.879 a lot too. A lot of times workflows get
00:02:47.120 confused with agents. They're a key
00:02:49.120 concept in agents, but they're not the
00:02:50.720 same thing. They are similar and
00:02:52.879 workflows can be a part of an agent.
00:02:55.280 This is why agents just aren't chat bots
00:02:58.080 and it really turns them into structured
00:03:00.400 and more reliable processes.
00:03:03.519 So, this is a quick chart of what the
00:03:05.840 difference is between a workflow and an
00:03:07.360 agent. A workflow isn't as good at
00:03:09.920 learning or being flexible where an
00:03:11.760 agent is. Workflows are very regimented
00:03:14.720 and rule-based whereas agents are model
00:03:16.959 based. So workflows are really good when
00:03:19.360 you know exactly what steps you need to
00:03:21.120 take and agents are better at like
00:03:22.800 adapting and problem solving open-ended
00:03:25.040 problems.
00:03:26.640 So here is a high-level overview of what
00:03:28.720 a workflow could look like. So you can
00:03:30.640 see we're just chaining LLM calls here
00:03:33.760 versus an agent. You can really see that
00:03:36.080 it's a circular action here. You know
00:03:38.959 human input, that sort of thing, action-
00:03:41.040 feedback loop. But we'll get more into
00:03:42.720 this later. Just wanted to show you kind
00:03:44.400 of a high level the difference. So
00:03:47.040 workflows plus agents really make up
00:03:49.200 Agentic systems, which is another term
00:03:51.200 that you may have heard. So I could talk
00:03:53.760 about workflows all day, too. There are
00:03:55.360 best practices for those, but we only
00:03:57.200 have 30 minutes. So we're going to focus
00:03:58.799 on agents for open-ended problems where
00:04:01.519 it's really hard to predict the number
00:04:03.360 of steps that you're going to take. And
00:04:06.239 you know there's going to be a lot of
00:04:07.439 turns, and you have to have some
00:04:09.760 level of trust in these agents because
00:04:11.760 they have a lot of autonomy. Um but
00:04:14.000 they're really ideal for scaling tasks
00:04:16.160 in trusted environments.
00:04:18.959 So because they're autonomous, higher
00:04:21.840 costs, right? Potential for compounding
00:04:24.160 errors. So extensive testing in sandbox
00:04:26.800 environments is essential. But again,
00:04:28.800 we'll get into that more when we talk
00:04:30.160 about best practices. But agents are
00:04:32.880 honestly really really exciting. There
00:04:35.600 really is a big paradigm shift from
00:04:37.840 imperative programming where we tell
00:04:40.160 software exactly what to do to
00:04:42.400 declarative goal setting where we define
00:04:44.560 objectives. And there are different
00:04:47.280 types of agents which you may have heard
00:04:48.880 of. There are learning agents and they
00:04:51.680 kind of improve over time learning from
00:04:53.840 experiences, learning from what you put
00:04:56.240 in, learning from tools and adapting
00:04:58.000 their behavior. We also have utility
00:05:00.720 based, which improve their performance
00:05:03.120 over time as well. And then
00:05:06.479 we also have model based, which are...
00:05:10.560 I'm sorry, I'm behind on my slides. Um, so
00:05:13.919 utility based agents optimize actions based on
00:05:17.039 a goal where we use tradeoffs to
00:05:19.759 maximize overall happiness or
00:05:21.680 performance.
00:05:23.520 Now we have model based. We maintain an
00:05:25.759 internal model of the world to inform
00:05:27.840 decisions uh beyond just the immediate
00:05:30.000 inputs that we're getting. Goal based
00:05:33.440 which is where we set a defined goal and
00:05:36.080 work towards those. And then we also
00:05:38.960 have um simple reflex where we're just
00:05:42.160 getting the information as we go. It
00:05:45.280 doesn't rely on memory or anything like
00:05:47.120 that.
00:05:48.400 So we have a lot of different use cases
00:05:51.039 of how we use these agents in the real
00:05:52.960 world. The first one is customer
00:05:55.280 support, right? These make a really good
00:05:57.360 use case for agents. Same with security
00:06:00.000 threat detection. Travel concierge is
00:06:02.720 another good real world application for
00:06:04.400 an agent. And then of course coding
00:06:06.639 agents. Coding agents can really take a
00:06:09.199 mundane task and really do awesome
00:06:12.479 things for you. It can create a PR for
00:06:14.080 you. And I want to show you briefly what
00:06:16.880 we've been working on at GitHub on our
00:06:19.280 coding agent team.
00:06:24.400 I like it.
00:06:28.319 I love it. It's undeniable.
00:06:32.800 I see it. I want it. I get it.
00:07:07.440 I want it. I get it.
00:07:10.720 Okay, I promise that's as like yay,
00:07:12.160 GitHub as I'm going to get, but I just
00:07:14.080 want to show you really quick.
00:07:15.199 Basically, you know, you assign co-pilot
00:07:17.039 to a PR or to an issue and it
00:07:19.199 creates a PR for you and you review it.
00:07:21.360 And uh you can also see the difference
00:07:23.520 here of the agent actually writing the
00:07:25.520 code um versus you just kind of
00:07:27.840 prompting co-pilot like, hey, how do I
00:07:29.680 do this? So, really taking that next
00:07:31.680 step forward. Uh but now we're going to
00:07:33.440 talk about building agents, get into the
00:07:35.520 meat of the talk. Um, you can do a bunch
00:07:39.280 of different things with it, right? So,
00:07:41.039 you can do single- or multi-agent. Today,
00:07:43.680 we're only going to focus on building a
00:07:45.680 single agent, not multi-. And here are
00:07:49.280 the kind of essential building blocks or
00:07:51.599 common building blocks that we have when
00:07:53.840 building agents. And we're going to dive
00:07:55.759 into kind of each of these pieces. And
00:07:58.240 to really solidify these concepts, I um
00:08:02.400 built, in a Rails application, a
00:08:04.879 little dummy support bot or support
00:08:07.120 agent. I shouldn't say bot. It's more
00:08:08.960 than that. It's a support agent and the
00:08:10.879 example is all in Rails. So, we'll dive
00:08:12.720 into those and each of these examples.
00:08:16.479 So, we have the tooling layer. For this,
00:08:19.440 I definitely recommend using MCP. I'm
00:08:22.319 not sure if you've heard of it, but it
00:08:23.680 stands for model context protocol. It's
00:08:26.080 a standard that defines how uh AI models
00:08:29.120 securely discover and call external
00:08:31.360 tools, sources, services, and structured
00:08:34.640 JSON interfaces. This comes from
00:08:36.560 Anthropic. And tomorrow, same time, same
00:08:39.760 room. Paul is going to be giving a whole
00:08:41.919 talk on MCP. So, I highly recommend
00:08:44.720 going to this talk. So, I'm not really
00:08:46.480 going to go into it because he's going
00:08:48.000 to do a whole deep dive there. But I
00:08:49.920 briefly just want to show you what it
00:08:51.600 looks like in my application where I am
00:08:54.640 calling the Zendesk MCP server to get
00:08:58.080 context for my agent to know about the
00:09:00.640 support tickets that I'm getting. But
00:09:02.320 again, more on this tomorrow. You should
00:09:04.160 definitely check that talk out. Now,
00:09:06.399 diving into memory, we have both short
00:09:08.720 and long-term memory. But let's look at
00:09:11.440 long-term memory first. So agent memory
00:09:14.720 is long-term memory here. It's just a
00:09:16.959 database table with, uh, content and
00:09:20.480 run ID. And after every planner
00:09:22.720 response, tool call or user input, you
00:09:25.120 just drop a row in agent memories. This
00:09:27.760 creates a timeline of the agent's
00:09:29.519 thoughts and context. And before asking
00:09:31.920 the LLM planner for the next step, you
00:09:34.080 can call recall to fetch the recent
00:09:36.959 context. This prevents forgetfulness,
00:09:39.440 too, in long-running sessions.
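The long-term memory pattern just described can be sketched in plain Ruby. This is an illustrative stand-in, not the talk's actual code: in the Rails app this would be an ActiveRecord model over an agent_memories table (run_id, content), and the names here are assumptions.

```ruby
# Plain-Ruby sketch of the long-term memory idea: a table of rows keyed
# by run, appended to after every planner response, tool call, or user
# input, and read back before the next planner request.
class AgentMemory
  Row = Struct.new(:run_id, :content)

  def initialize
    @rows = []
  end

  # Drop a row after every planner response, tool call, or user input.
  def remember(run_id, content)
    @rows << Row.new(run_id, content)
  end

  # Fetch recent context for a run before asking the planner for the
  # next step, so long-running sessions don't forget earlier turns.
  def recall(run_id, limit: 10)
    @rows.select { |r| r.run_id == run_id }.last(limit).map(&:content)
  end
end

memory = AgentMemory.new
memory.remember(1, "user: my invoice is wrong")
memory.remember(1, "planner: use_tool get_ticket")
puts memory.recall(1).inspect
```

Calling `recall` before each planner request is what keeps the prompt grounded in the timeline of the agent's earlier thoughts.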
00:09:42.240 Next, uh we'll talk about state store.
00:09:44.320 This is short-term memory. It's
00:09:46.240 essentially the single source of truth
00:09:47.760 for an agent's run or current working
00:09:50.399 state. Think of it as the
00:09:51.920 orchestration layer's memory. Um, it's
00:09:54.800 not long-term memory, not logs. It
00:09:57.120 tracks everything the planner and the
00:09:58.720 agent needs to know to continue
00:10:00.640 execution. It has run metadata, current
00:10:03.680 plan, tool outputs, error retry, gate
00:10:06.720 decisions, things like that. This is
00:10:08.880 important for resumability if something
00:10:11.040 goes wrong, for concurrency, and also, um,
00:10:14.399 just knowing you're making the right
00:10:15.680 choice, right? Hopefully. So here we
00:10:18.880 have @run. Um, @run is our state
00:10:22.000 store. It is a JSONB column. It holds
00:10:25.600 all runcoped working data. We have
00:10:28.240 pointer and we have set pointer here to
00:10:30.079 give you a clean way to store simple
00:10:31.839 cursors like which step you're on or the
00:10:34.880 last tool used. And we also have
00:10:37.040 statuses to know if the agent should
00:10:38.800 keep processing, pause for human review
00:10:41.360 or stop.
00:10:43.519 Then we have agent orchestration, which
00:10:45.200 is a really big part of building agents,
00:10:46.959 right? It's the logic that plans,
00:10:49.279 executes, and supervises multi-step
00:10:51.519 agentic work. It calls our tools or the
00:10:54.640 LLM. It implements guard rails. It
00:10:57.519 persists state. It can also do retries
00:10:59.519 and roll backs. It observes and audits.
00:11:02.079 It's really the glue between LLMs and
00:11:04.240 the business outcomes.
00:11:06.240 And here is the code for this. We're
00:11:08.959 asking the planner, uh, for the next
00:11:11.680 action in strict JSON of course. Uh we
00:11:14.720 store the raw plan as a workflow
00:11:16.959 step. And then we gate the plan and do a
00:11:19.519 quick policy schema check. You know,
00:11:21.440 making sure: does the user have access
00:11:23.680 to this? Do they have any more, you
00:11:25.279 know, calls left to the LLM? Things like
00:11:27.680 that. We act again using MCP as our tool
00:11:30.399 and MCP server and persist that as a
00:11:32.959 step and accumulate state there. We then
00:11:35.600 merge the tool's JSON result back
00:11:37.600 into our run context JSON um so the next
00:11:41.200 plan can see it and then if the planner
00:11:44.720 returns finished we lightly gate the
00:11:46.399 final answer and return the run.
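The loop just described (plan in strict JSON, gate, act, merge the result back into the run context, finish) can be condensed into a plain-Ruby sketch. The planner and tool here are injected lambdas standing in for the LLM client and the MCP server call; all names are illustrative, not the talk's code.

```ruby
require "json"

# Minimal sketch of the orchestration loop: ask the planner for the next
# action in strict JSON, persist the raw plan, gate it, act via a tool,
# and merge the tool's result into the run context so the next plan can
# see it. Stops when the planner returns a "finish" action.
class Orchestrator
  def initialize(planner:, tool:, gate: ->(_plan) { true }, max_turns: 10)
    @planner, @tool, @gate, @max_turns = planner, tool, gate, max_turns
  end

  def run(goal)
    context = { "goal" => goal, "steps" => [] }
    @max_turns.times do
      plan = JSON.parse(@planner.call(context))          # strict JSON only
      context["steps"] << plan                           # persist the raw plan
      raise "gated: plan rejected" unless @gate.call(plan)

      return context.merge("answer" => plan["answer"]) if plan["action"] == "finish"

      result = @tool.call(plan["tool"], plan["args"])    # act (e.g. an MCP tool call)
      context.merge!(result)                             # the next plan sees it
    end
    raise "max turns exceeded"
  end
end
```

A real orchestrator would also handle retries, rollbacks, and observability hooks around each step, as described above; the turn cap stands in for the cost guardrail.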
00:11:49.440 So planning is another really
00:11:51.760 essential step for building agents. It's
00:11:53.680 the executive function of an agent. It
00:11:55.760 turns a goal into a sequence of steps
00:11:58.240 and verifiable actions. Um in modern
00:12:00.880 agentic AI we have four tactics usually
00:12:03.839 that we implement for this. We have
00:12:07.120 subgoal decomposition, which splits a big
00:12:09.519 goal into smaller steps exactly as it
00:12:11.440 sounds and the planner proposes the next
00:12:13.440 action. We also have reflection. It
00:12:16.320 takes a look at the latest output and
00:12:17.920 then decides how to improve or what to
00:12:19.680 try next. We also have self-critique. It
00:12:23.200 scores an intermediate or final result
00:12:25.120 against criteria and then revises it or
00:12:27.440 escalates it.
00:12:29.440 Um and then we have chain of thought,
00:12:32.079 which is internal reasoning of literally
00:12:34.720 what you're going to do next. Should you
00:12:36.240 persist? Should you not? Different
00:12:38.000 things like that. So those are the four
00:12:40.079 the four different um tactics that we
00:12:42.959 have for planning. And here in our
00:12:46.399 application, this is an example how the
00:12:48.720 planner returns strict JSON either use
00:12:51.760 tool or finish. We then persist the
00:12:54.480 decision um not the model's reasoning.
00:12:57.680 And here I'm using the OJ gem which is
00:13:00.560 the optimized JSON gem. In case you were
00:13:02.399 wondering what uh OJ is here, that's
00:13:04.320 what I was using.
00:13:06.800 And then the orchestrator uses it. Right
00:13:11.519 here it is kind of deciding do we use a
00:13:14.160 tool? Do we finish based on the plan's
00:13:16.240 action?
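Parsing the planner's strict-JSON reply might look like the sketch below, which accepts only the two actions and persists the decision fields, deliberately dropping any reasoning text. Stdlib JSON is used so the snippet is self-contained; `Oj.load` is a drop-in replacement for `JSON.parse`. The field names are assumptions.

```ruby
require "json"

# Sketch of enforcing the planner's strict-JSON contract: the action must
# be either "use_tool" or "finish", and only the decision fields survive.
ALLOWED_ACTIONS = %w[use_tool finish].freeze

def parse_plan(raw)
  plan = JSON.parse(raw)
  action = plan["action"]
  unless ALLOWED_ACTIONS.include?(action)
    raise ArgumentError, "unknown action: #{action.inspect}"
  end
  # Persist the decision, not the model's reasoning.
  plan.slice("action", "tool", "args", "answer")
end
```

Raising on an unknown action gives the orchestrator a clean failure to retry against, instead of silently executing something the contract never allowed.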
00:13:17.760 All right. Now we've talked about all
00:13:19.519 the essential building blocks and kind
00:13:21.040 of what that looks like in our
00:13:22.399 support agent application. We're
00:13:24.720 going to talk about best practices when
00:13:26.720 it comes to building these agents. And
00:13:28.399 these are all based on things that we
00:13:30.160 have learned on the uh agent team at
00:13:33.200 GitHub. So I want to share those with
00:13:34.639 you. The first one is modular and
00:13:36.880 maintainable. I feel like this is a
00:13:38.639 given because other types of
00:13:40.160 software should be like this. But I feel
00:13:41.920 like it's even more important when
00:13:43.360 building agents to be honest. They're
00:13:45.360 complex. They're tightly coupled all of
00:13:47.760 these building blocks and you don't want
00:13:49.600 to make one small change and everything
00:13:51.440 breaks. Also things are changing
00:13:53.680 rapidly. Uh tooling, strategies,
00:13:56.959 everything changes every few months. So
00:14:00.079 we want to make sure that we can keep
00:14:01.920 iterating constantly because it's
00:14:03.760 essential or else you're totally going
00:14:04.959 to lose the game, right? Everyone's
00:14:06.320 moving so fast. So it's really important
00:14:08.480 that you make these modular and
00:14:09.920 maintainable so you can continue to
00:14:11.760 iterate and kind of keep up with the
00:14:13.519 times, right? Also, gates. I've talked
00:14:16.959 about this a little bit as we've gone
00:14:18.560 through these building blocks. They're
00:14:20.720 essential. A gate and a guardrail,
00:14:22.880 whether it's policy checks, rate limit
00:14:25.360 checks, or even just asking the human,
00:14:27.519 for example, in the coding agent, it
00:14:30.399 asks you if you want to keep the
00:14:32.639 code that it has written. So you either
00:14:34.160 can say keep or undo. So it's just
00:14:36.079 really important that you're adding
00:14:37.760 these gates. Um, and here's an example
00:14:40.800 in our application of what that looks
00:14:42.720 like. And our common request schema is a
00:14:46.079 validation contract. I'm using the
00:14:47.920 dry-schema gem. Um, it ensures any
00:14:51.040 agent request to create a Zendesk ticket
00:14:53.600 includes certain things that I need.
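A dependency-free sketch of that kind of gate is below. The talk uses a dry-schema contract for this; the required field names here are hypothetical stand-ins, not the app's actual schema.

```ruby
# Sketch of a validation gate in the spirit of the dry-schema contract
# described above: before the agent may create a Zendesk ticket, the
# request must carry the required fields. Field names are hypothetical.
REQUIRED_TICKET_FIELDS = %w[requester_email subject body].freeze

def gate_create_ticket(params)
  missing = REQUIRED_TICKET_FIELDS.select { |f| params[f].to_s.strip.empty? }
  { ok: missing.empty?, missing: missing }
end
```

The orchestrator checks the `:ok` flag before acting, and the `:missing` list gives the planner something concrete to repair on the next turn.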
00:14:56.959 Also, prompt changes. Uh, we keep prompt
00:14:59.680 changes and guidance minimal. When I
00:15:02.079 actually first went to make a prompt
00:15:04.399 change in our runtime, I had this huge
00:15:07.199 paragraph of text and everyone laughed
00:15:08.880 at me and was like, wait, no, you need
00:15:10.399 like one sentence. Um, a really harmless
00:15:13.360 tweak can cause the planner to stop
00:15:15.680 using tools, hallucinate
00:15:18.160 new actions, or break JSON contracts.
00:15:21.199 So, it's really important that you keep
00:15:23.279 these as minimal as possible to guide
00:15:25.839 kind of the agent where you want it to
00:15:27.680 go.
00:15:29.279 Also, extensive testing. Uh we do
00:15:31.519 extensive testing with nightly
00:15:32.959 evaluations that run every night. Uh
00:15:35.600 it's really important and there's
00:15:37.040 actually going to be another whole talk
00:15:38.800 on this. This is tomorrow at 3:45.
00:15:41.839 Andrew and Charlie from Shopify are
00:15:43.760 going to be talking about LLM
00:15:45.360 evaluations and reinforcement learning.
00:15:47.600 So definitely recommend that you check
00:15:49.120 that talk out as well.
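One way to sketch such an evaluation run: replay fixture scenarios against an agent callable and check both the expected answer and a tool-call budget. Everything here, including the scenario shape, is an illustrative assumption rather than the team's actual harness.

```ruby
# Sketch of a tiny evaluation harness: replay fixture scenarios against
# an agent callable, counting tool calls so we can both lock in behavior
# and drive down the number of tool calls.
def evaluate(agent, scenarios)
  scenarios.map do |s|
    tool_calls = 0
    counting_tool = ->(*args) { tool_calls += 1; s[:tool].call(*args) }
    answer = agent.call(s[:input], counting_tool)
    { name: s[:name],
      pass: answer == s[:expected] && tool_calls <= s[:max_tool_calls] }
  end
end
```

Run nightly, a report like this catches both behavioral regressions and cost regressions (extra tool calls) from prompt tweaks.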
00:15:51.600 Test harnesses. These are really
00:15:53.199 important. They allow us to lock in
00:15:54.959 behavior, tweak prompts, and also drive
00:15:57.680 down our number of tool calls. And
00:15:59.920 observability.
00:16:01.519 Everything the
00:16:04.240 agent does needs to be tracked,
00:16:06.720 inspected, and explainable. A lot of the
00:16:09.440 building blocks we went into like memory
00:16:11.519 are essential for observability and
00:16:13.600 ensuring that the agent remains ethical
00:16:15.360 and dependable.
00:16:18.399 So uh here just really quick in the
00:16:20.639 co-pilot coding agent you can click view
00:16:22.959 session once co-pilot creates a PR for
00:16:25.279 you and you can see and track literally
00:16:28.560 everything that it did and thought and I
00:16:31.360 really think this is important for
00:16:32.639 accountability and things like that.
00:16:36.240 So, you may be wondering what the future
00:16:38.720 of agents look like. We have a really
00:16:40.240 good idea of where we are, what these
00:16:42.399 look like now, but I really think that
00:16:45.839 agents will continue to be more
00:16:47.839 humanlike. I don't know if anyone has
00:16:49.680 heard of the word anthropomorphization.
00:16:51.519 I can barely say it. Uh, but basically,
00:16:53.759 it's the attribution of human traits,
00:16:55.759 emotions, or intentions to
00:16:58.000 non-human entities. Um, and how those
00:17:01.440 apply to humans. So making things that
00:17:03.440 are not human more humanlike and I think
00:17:06.319 it will become more about how the agent
00:17:08.400 feels to you. What is the personality of
00:17:10.880 the agent? Um, building relationships
00:17:13.600 with these agents. As memory and
00:17:16.000 things like that improve, they're really
00:17:17.360 going to get to know you and how you
00:17:18.959 write code and things that you value
00:17:21.199 and, you know, become more ethical,
00:17:23.039 dependable, and align to a brand. And I
00:17:25.679 think voice, tone, values, consistency,
00:17:28.559 things like that will be just as
00:17:30.160 important as what the agent is doing
00:17:32.160 itself.
00:17:33.679 Also, sub-agents are a big thing. Uh
00:17:35.840 we're working on this currently at
00:17:37.520 GitHub on our coding agent, but
00:17:39.919 basically we're going to see hierarchies
00:17:42.240 of sub-agents, you know, kind of like SRP
00:17:44.320 that we all know and love from Sandy Metz.
00:17:46.480 Uh where each agent has its own
00:17:48.480 responsibility. So for example, our
00:17:50.960 support agent, um,
00:17:54.640 our support agent can look something
00:17:56.320 like this. The architecture here. So you
00:17:58.960 can see each has its own responsibility
00:18:01.440 instead of just one big monolithic
00:18:03.760 agent. So kind of breaking that up so
00:18:07.200 each does its own thing. Also workflow
00:18:09.840 native. Right now we built workflows
00:18:12.240 around an LLM. Future systems will
00:18:14.559 natively model workflows: branching,
00:18:17.760 retries, checkpoints, and rollbacks. This
00:18:19.919 is where agent orchestration will become
00:18:21.919 central. Instead of a chat loop,
00:18:24.240 it looks more like
00:18:26.000 workflow engines with LLM steps inside.
00:18:28.799 So I'm thinking maybe a blend of
00:18:30.559 traditional orchestration tools like
00:18:32.320 Airflow or Temporal, if you've ever used
00:18:34.320 any of those, with LLM-aware adapters.
00:18:39.280 So yes, agents and the future of agents
00:18:42.080 are really exciting, but we really have
00:18:44.080 to remember that an agent is
00:18:46.720 often an overengineered solution. A lot
00:18:49.600 of times I personally find myself
00:18:51.520 working or wanting to reach for the
00:18:53.280 shiny new toy. Yes. Um but instead using
00:18:56.960 that in a side project versus production
00:18:58.799 might be better. You know, building an
00:19:00.400 agent that you need in your life and
00:19:02.640 learning all of these concepts on your
00:19:04.320 own. So, it's really about choosing the
00:19:06.400 right tool that you need for your
00:19:08.000 problem at hand.
00:19:10.400 And then really quick, we're going to
00:19:12.320 talk about kind of ethical
00:19:14.080 considerations of agents. Transparency.
00:19:17.039 We've talked a lot about transparency,
00:19:18.799 observability, different things like
00:19:20.320 that that we really need to be
00:19:22.720 transparent with what our agents are
00:19:24.480 doing and make sure the user is aware
00:19:26.240 too and include them as you develop. Um,
00:19:29.280 you can also, you know, steer co-pilot
00:19:31.280 and different things like that. I think
00:19:32.400 that's important. Also, bias. We
00:19:34.960 really need to be questioning the
00:19:36.320 algorithms, or, I'm sorry, the LLMs
00:19:38.559 that we're using. Are they biased?
00:19:40.799 I'm not going to name names, you know,
00:19:43.440 the LLMs that I think have issues, but I
00:19:46.080 think we need to be aware of those. And
00:19:47.840 also dog fooding. So, we do a ton of dog
00:19:50.160 fooding on our team to use it because we
00:19:52.799 want to make sure that, you know, our
00:19:54.880 agent is doing what we expect it to do
00:19:56.559 and is not biased. Safeguards, gates,
00:19:59.679 and policy checks. It's really easy to
00:20:01.440 build an agent without these, but I
00:20:03.280 really think these are essential for
00:20:04.799 ensuring that they are on track. Also,
00:20:07.919 security and retention policies. Make
00:20:10.160 sure you have those set on your team if
00:20:12.000 you are building an agent easily visible
00:20:14.960 to developers, things like that. And
00:20:17.360 like I mentioned, audits, dog fooding is
00:20:19.520 great. Make sure that you are doing all
00:20:21.600 these things on a regular basis.
00:20:24.000 And as far as Ruby and Rails tooling,
00:20:26.480 we've been a little bit slower to adopt.
00:20:28.960 A lot of the major SDKs and tooling have
00:20:31.039 been in Typescript and Go and other
00:20:33.280 languages. And we need to ensure that
00:20:36.159 Rails remains relevant. So I would love
00:20:38.640 to talk with you all on how we do that
00:20:40.320 and how we continue to be a major player
00:20:42.880 and keep up in this space. Also, we want
00:20:46.240 societal impact to be positive and not
00:20:48.640 negative. I've heard a lot of talk of
00:20:50.640 people saying that they want to replace
00:20:53.120 developers, but I think it should really
00:20:55.360 we should really focus on augmentation
00:20:57.679 and accelerating versus replacing. We
00:20:59.919 shouldn't be using that word. And it's a
00:21:02.480 fundamental evolution on how we build
00:21:04.799 software systems. We can move beyond
00:21:07.039 traditional programming paradigms
00:21:08.799 towards systems that reason, learn, and
00:21:11.360 adapt to changing conditions. We now
00:21:13.919 have powerful capabilities that can
00:21:15.760 dramatically change our work. And for
00:21:17.919 me, that's really important because I
00:21:19.679 can focus on the thing that matters. I
00:21:21.440 can spend more time with my kids.
00:21:24.559 But how can we build agents for good?
00:21:27.600 How can we make these a powerful thing
00:21:30.080 in our society that are bringing
00:21:31.840 positive impact and not negative impact?
00:21:34.640 How can agents make our world better,
00:21:36.799 but also really important, how can it
00:21:39.120 make your world better?
00:21:41.760 Thank you.