Summarized using AI

The ActiveRecord Tapes

Tess Griffin • July 10, 2025 • Philadelphia, PA • Talk

Introduction

In the RailsConf 2025 talk "The ActiveRecord Tapes" by Tess Griffin, attendees are given a behind-the-scenes look at the development and refactoring of the Attributes API within Rails' ActiveRecord. The speaker, Tess, shares a detailed story that includes direct insights from Sage Griffin, the key contributor responsible for overhauling type casting within ActiveRecord, making it more stable, maintainable, and easier to work with for developers and gem authors alike.

Key Points

  • Background and Motivation:

    • Tess introduces the motivation for rewriting ActiveRecord’s type casting system, using the example of setting and retrieving a user’s ID attribute.
    • The core issue was that gem authors who needed custom type casting (e.g., for encrypted attributes) were forced to monkey-patch unstable and undocumented internals, resulting in unreliable integrations.
  • The Old System:

    • Type casting logic was spread across numerous modules, with overlapping methods (like write_attribute) defined in multiple places.
    • Debugging the spaghetti code in Rails 4.1 was extremely challenging, as the logic for attribute handling was difficult to trace and reason about.
  • The Refactoring Journey:

    • Sage Griffin’s initial attempts involved a massive pull request, quickly recognized as too large and complex to merge.
    • Through a series of smaller refactors—guided by maintainers and regular communication—the internals were redesigned to use objects (attribute and type objects) that encapsulated both data and behavior, leading to more Ruby-idiomatic code.
    • The new design centralized the type-casting logic, making it much easier to maintain, test, and extend, and transformed debugging from a daunting task to a straightforward process of tracing type objects.
  • Technical Tradeoffs and Challenges:

    • The move to an object-oriented design introduced a performance overhead, initially resulting in a high number of Ruby object allocations (e.g., one per column per record).
    • Performance was improved by lazy-instantiating objects only when necessary, reducing memory allocations significantly.
    • The refactor prioritized maintainability and code quality over raw performance, based on the principle that maintainable code is less likely to be buggy and easier for the whole community to contribute to.
  • Broader Impact and Lessons Learned:

    • The refactor eliminated classes of bugs and made upgrading Rails less painful for users.
    • Sage emphasized writing clear commit messages and leaving contextual breadcrumbs for future maintainers.
    • Practical advice for contributors includes respecting contribution guidelines, communicating with maintainers, and favoring incremental, readable changes.
    • Tools and communities like CodeTriage were highlighted as ways to get involved in open source.

Conclusion and Takeaways

  • The Attributes API rewrite made ActiveRecord more robust, maintainable, and adaptable, significantly improving the developer experience.
  • Key themes included the importance of code maintainability, clear communication in contributions, and the value of incremental changes in large open source projects.
  • Sage Griffin's work underscores how thoughtful refactoring and community-driven improvements can have lasting positive impacts on widely used software.


Date: July 10, 2025
Published: July 23, 2025

In this talk, we'll dig deep into the design decisions of the Attributes API. We'll talk to its author and hear their experience from "I think this would help with the current project I'm on!" to "Oh no, I'm rewriting all of ActiveRecord!".

This will be a peek under the hood to a foundational part of ActiveRecord. You'll learn about the difficult technical constraints and tradeoffs that were made in development. Even if you don't know anything about the Attributes API, if you've used ActiveRecord, you've used it! No prior knowledge needed.

RailsConf 2025

00:00:16.880 Um, hi everybody. Thank you so much for
00:00:19.199 coming today. My talk is an active
00:00:21.760 record rewrite, the story behind the
00:00:23.920 attributes API. My name is Tess.
00:00:27.920 Today we have a story about refactoring
00:00:30.640 a large part of active record to make it
00:00:33.280 more stable.
00:00:35.440 First we'll have this user class.
00:00:38.960 This is its table. It has an attribute
00:00:42.480 ID whose type is integer.
00:00:46.320 Now we'll have an instance of user. We
00:00:49.120 are setting its attribute to the string
00:00:51.840 one.
00:00:53.360 Now let's call ID on the instance of
00:00:55.520 user and see what's returned. You might
00:00:58.000 expect that calling it again would
00:00:59.520 return the same thing, the string one. I
00:01:02.000 mean, we just set it to be a string, but
00:01:04.559 we get back an integer. Its previous
00:01:07.040 type, string, was cast to integer.
00:01:11.200 This is type casting: turning the data
00:01:13.520 from one type to another and vice versa.
00:01:17.200 When we look back at our table we can
00:01:19.200 see that the type for the attribute ID
00:01:22.479 is integer. That's why when we pass it
00:01:24.960 the string, Rails casts its type to an
00:01:28.400 integer. Now, how does Rails handle type
00:01:31.759 casting?
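As a rough illustration of the ID example above, here is a minimal plain-Ruby sketch. `SketchUser` and `COLUMN_TYPES` are hypothetical stand-ins and are not ActiveRecord code; they only imitate the observable behavior of a value being cast to the column's type when it is written.

```ruby
# Hypothetical stand-in for a model whose table has an integer id column.
# This is NOT ActiveRecord; it only imitates casting on assignment.
class SketchUser
  # Assumed column definitions: column name => casting rule.
  COLUMN_TYPES = {
    "id" => ->(raw) { Integer(raw) }
  }.freeze

  attr_reader :id

  # Writing the attribute runs the raw value through the column's rule.
  def id=(raw)
    @id = COLUMN_TYPES.fetch("id").call(raw)
  end
end

user = SketchUser.new
user.id = "1"
p user.id        # => 1
p user.id.class  # => Integer
```

The string goes in, and the integer comes back out, because the column type decides what the reader returns.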
00:01:33.280 The attributes API handles
00:01:35.680 type casting in Rails. Today, the
00:01:38.240 attributes API has been extracted to
00:01:40.560 Active Model, but it started off in
00:01:42.799 Active Record. For today's dive into the
00:01:45.360 attributes API, I sat down for a
00:01:47.840 conversation with its author, Sage. Um,
00:01:51.759 well, they're actually here and they're
00:01:53.119 not a plant. So, uh, Sage Griffin is my
00:01:56.479 wife and during the early years of our
00:01:58.560 marriage, they got into open source
00:02:00.640 contributing and refactored so much of
00:02:02.960 active record that I wanted to talk
00:02:04.799 about it. For context, this was like 10
00:02:08.879 years ago. Um, took place in general
00:02:11.039 between 2014 and 2016. Um, the major
00:02:15.440 internals refactoring was in Rails 4.2
00:02:19.680 and then the public API was released in
00:02:22.080 Rails 5. So my first question to Sage
00:02:24.800 was how would you describe the
00:02:27.280 attributes API? And they said, "It's the
00:02:30.640 code that does the types."
00:02:33.920 Like, okay, pretty straightforward. Um, and
00:02:37.280 then I asked them well what is type
00:02:38.800 casting? And they said when active
00:02:41.599 record receives your data it's almost
00:02:43.760 certainly going to be as a string. You
00:02:46.080 as the developer don't want a string.
00:02:48.560 You want an integer or date or whatever
00:02:51.519 useful Ruby object that data is
00:02:53.840 representing. The attributes API and
00:02:56.640 active record will deal with converting
00:02:58.720 from your Ruby objects to those string
00:03:01.280 representations and vice versa.
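The two directions Sage describes (string representation to Ruby object, and back) can be sketched with a small type object. `SketchDateType` is a hypothetical name used only for illustration; the method names `cast` and `serialize` are chosen to echo the ones on Rails' type objects, but this is plain Ruby and not the Rails implementation.

```ruby
require "date"

# Hypothetical type object: knows how to turn the incoming string into a
# Ruby Date (cast) and how to turn a Date back into a string (serialize).
class SketchDateType
  def cast(raw)
    raw.is_a?(String) ? Date.iso8601(raw) : raw
  end

  def serialize(value)
    value.iso8601
  end
end

type = SketchDateType.new
date = type.cast("2025-07-10")
p date.class            # => Date
p date.year             # => 2025
p type.serialize(date)  # => "2025-07-10"
```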
00:03:04.480 So I asked them where did the
00:03:06.319 inspiration come from? And they said
00:03:09.040 well we were on a project and had an
00:03:11.519 attribute that needed to be encrypted.
00:03:14.000 We wanted it to be that whenever you
00:03:16.239 accessed it, you got the decrypted
00:03:18.400 version and vice versa. We were using a
00:03:21.280 gem and I looked at the gem's code to
00:03:23.519 see how it was doing that. The gem was
00:03:26.239 pretty buggy, but I discovered that
00:03:28.080 looking at the code, how much of the
00:03:30.159 Rails internals it had to monkey patch
00:03:32.560 to do this work. And that's not the
00:03:34.400 gem's fault.
00:03:36.560 The gem had to monkey patch all these
00:03:38.799 different code paths. These are Rails
00:03:41.519 internals, so they are undocumented and
00:03:44.319 interact in subtle ways. This is very
00:03:46.879 difficult to get right.
00:03:49.519 Supporting multiple versions of Rails
00:03:51.680 for gem authors becomes difficult or
00:03:54.560 impossible when you're monkey patching
00:03:56.799 internals that aren't stable. It's like
00:03:59.439 building on top of shifting sand.
00:04:01.680 Because they are internal, they can
00:04:04.000 change without deprecation warnings in
00:04:06.480 between versions. Sage said, "I thought
00:04:09.200 that there had to be a better way." I
00:04:11.360 said that they sounded like an
00:04:13.200 infomercial.
00:04:14.879 They said, "Yes, exactly. I was the
00:04:17.199 person in grayscale doing things the old
00:04:19.519 hard way. I was the person who had to..."
00:04:22.240 This is a person cutting up bread with
00:04:25.600 a doorstop. And, uh, um, yeah,
00:04:30.000 infomercial commercials are weird. Um
00:04:32.800 there had to be a better way.
00:04:35.919 Rails should make writing this gem
00:04:38.000 easier. Gem authors shouldn't have to
00:04:40.880 monkey patch all of these undocumented
00:04:43.759 Rails internals to get type casting of
00:04:47.120 encrypted attributes to work. So how did
00:04:50.160 type casting work before? Sage said well
00:04:54.560 poorly.
00:04:56.639 Um this was our example of type casting
00:04:59.600 taking the string one and casting it to
00:05:02.000 the integer one. Type casting has three
00:05:05.280 general areas of work that it does:
00:05:07.600 writing, reading, and saving. To show
00:05:10.720 how Active Record handled type casting
00:05:12.720 before Sage's refactor, we're going to
00:05:15.199 try to follow the code path for writing
00:05:17.360 the ID attribute. It's going to be
00:05:19.360 totally fine. Trust me.
00:05:22.479 Okay, so don't mind the module counter
00:05:26.080 here. Um, we're starting off in
00:05:28.080 write.rb, our first module. Um, write.rb takes
00:05:32.000 the attribute name and then the value,
00:05:36.639 which would be the ID and the string
00:05:38.800 one. Oh, wait, hang on. Sorry. Our write_attribute
00:05:41.440 gets overwritten immediately
00:05:43.600 in a different module, dirty. So in
00:05:46.720 dirty.rb, our second module, we have write_attribute
00:05:49.440 again. Uh, we do some things,
00:05:52.080 call super. Cool. So that takes us back
00:05:55.520 to write.rb.
00:05:57.600 Now we have to resolve
00:05:59.840 type_cast_attribute_for_write. Okay, let's see its
00:06:02.400 definition. Oh wait, sorry, hang on. Um,
00:06:05.520 type_cast_attribute_for_write
00:06:08.479 also gets overwritten in a
00:06:10.639 separate module, serialization.
00:06:13.280 Um, so this is just, we're running you
00:06:15.919 through three different modules and we
00:06:17.919 go back to write again and then just,
00:06:21.520 yeah, no, this is, it's not great. Um, to
00:06:24.960 be clear, we're not even close to being
00:06:26.639 done with writing yet, and we've already
00:06:29.039 gone through three different modules.
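The hop between modules described above can be reproduced in miniature. The sketch below is a heavily simplified, hypothetical reconstruction: the module names echo the files mentioned in the talk, but every method body is invented for illustration and none of it is the actual Rails 4.1 source. It only shows how the same method defined in several included modules, chained with `super`, makes the code path hard to follow.

```ruby
# Hypothetical modules; only the override-and-super pattern is the point.
module SketchWrite
  def write_attribute(name, value)
    trace << "Write#write_attribute"
    attributes[name] = type_cast_attribute_for_write(name, value)
  end

  def type_cast_attribute_for_write(_name, value)
    trace << "Write#type_cast_attribute_for_write"
    value
  end
end

module SketchSerialization
  # Overrides a method defined in SketchWrite, then hands back to it.
  def type_cast_attribute_for_write(name, value)
    trace << "Serialization#type_cast_attribute_for_write"
    super
  end
end

module SketchDirty
  # Overrides write_attribute, does "some things", then calls super.
  def write_attribute(name, value)
    trace << "Dirty#write_attribute"
    super
  end
end

class SketchRecord
  include SketchWrite
  include SketchSerialization
  include SketchDirty

  def attributes
    @attributes ||= {}
  end

  # Records which definitions ran, in order.
  def trace
    @trace ||= []
  end
end

record = SketchRecord.new
record.write_attribute("id", "1")
puts record.trace
# Dirty#write_attribute
# Write#write_attribute
# Serialization#type_cast_attribute_for_write
# Write#type_cast_attribute_for_write
```

Even with three modules and two methods, a single write bounces between definitions four times, which gives a feel for why tracing a bug across many more modules was so difficult.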
00:06:32.160 This is what debugging in 4.1 felt like
00:06:35.039 when trying to debug this type casting
00:06:39.039 behavior.
00:06:40.880 So, these are just all of the modules
00:06:43.600 that could possibly contain overrides
00:06:46.160 and methods we're working with.
00:06:48.560 The logic was so spread out in these
00:06:50.400 modules, the behavior leaked everywhere.
00:06:52.639 It was way too much to hold in your
00:06:54.160 head. Sage said, "Which of the six
00:06:57.199 places that define the same method is
00:06:59.440 the one that actually has the bug?"
00:07:02.880 These methods were a nightmare to deal
00:07:04.560 with. Sage said that they were all sort
00:07:06.720 of ad hoc and spread across the modules.
00:07:10.560 You would have like six different
00:07:12.560 modules all define the same method. It
00:07:16.000 was spaghetti code, if each noodle was a
00:07:19.440 module and they're all lumped together.
00:07:22.400 Eventually you resolved to two hashes:
00:07:27.120 before type cast and after type cast. You
00:07:29.919 can see that in the before type cast hash,
00:07:33.360 the key is the string one. Sage really
00:07:35.440 wanted me to let you know that the
00:07:37.280 keys are strings. Um, that's very
00:07:40.080 important, apparently. Uh, and the value
00:07:42.240 is the string one, and then after type
00:07:44.880 cast it's the integer one. So it boiled
00:07:48.560 down to just two hashes but the road to
00:07:50.319 get there was pretty hard to parse.
00:07:54.400 This is Sage's first pull request
00:07:56.560 to Rails in the spring of 2014. There were
00:08:00.160 37 files changed. Their intention was to
00:08:04.879 just
00:08:06.400 refactor type casting behavior. The
00:08:09.039 problem was another person was already
00:08:11.280 doing this work in Active Record. This
00:08:14.240 is what it felt like to review,
00:08:18.319 just a bit overwhelming.
00:08:21.440 Um, the maintainer of active record
00:08:23.280 wrote, "Please don't add more stuff to
00:08:25.039 this PR. Please don't." Um, Sage was
00:08:28.400 only trying to refactor how type casting
00:08:30.479 worked in Active Record, but like how we
00:08:32.959 talked about earlier, its behavior
00:08:34.880 leaked everywhere. This first pull
00:08:37.200 request was way too big and way too
00:08:39.200 complex.
00:08:41.039 Soon after, Sage was invited to Basecamp,
00:08:43.200 camp where the maintainers of Rails
00:08:44.880 chatted about working on the project.
00:08:46.959 Sage worked through that summer in 2014
00:08:49.600 and into the fall making small refactor
00:08:52.080 after small refactor.
00:08:54.800 And I asked Sage what was their
00:08:57.200 north star? What were they trying to
00:08:59.680 refactor all this leaky code into? and
00:09:02.959 they said that they were refactoring to
00:09:05.680 something that managed state internally
00:09:08.959 in a way that you would expect in Ruby.
00:09:11.920 This means that instead of code spread
00:09:14.160 out all over the place, they wanted
00:09:16.399 objects, instance variables and methods
00:09:18.959 that encapsulated the behavior.
00:09:22.000 So remember our user example.
00:09:25.200 In the refactor, you have the new
00:09:28.160 attribute object and new type objects.
00:09:31.440 You can see here in from_user we're
00:09:33.839 taking the value, which is the string one,
00:09:36.320 and this new type object, the type
00:09:40.080 ActiveModel::Type::Integer. The attribute object
00:09:43.920 holds the state and then it asks the
00:09:47.519 type object given to it how to do the
00:09:50.399 conversion to and from its type.
00:09:54.080 Put simply, the spaghetti code got put
00:09:56.240 into a box. The box being the attribute
00:09:59.120 object.
00:10:01.200 The attribute object can now ask, "Hey,
00:10:03.519 datetime, turn my string into one of
00:10:05.760 you." These type objects now hold the
00:10:08.640 logic for casting, and the
00:10:11.680 attribute object delegates to them.
00:10:15.120 Debugging went from digging through
00:10:17.120 spaghetti to oh, we probably just used
00:10:19.839 the wrong type object. Cool.
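The split described above, where an attribute object holds the state and a type object knows how to convert, can be sketched in plain Ruby. `SketchAttribute` and `SketchIntegerType` are hypothetical names; the `from_user` constructor name is borrowed from the slide mentioned in the talk, but the bodies here are an illustrative guess and not the Rails source.

```ruby
# Hypothetical type object: holds the casting logic for integers.
class SketchIntegerType
  def cast(raw)
    raw.nil? ? nil : Integer(raw)
  end
end

# Hypothetical attribute object: holds the state and asks the type
# object it was given how to do the conversion.
class SketchAttribute
  attr_reader :name, :value_before_type_cast

  def self.from_user(name, raw, type)
    new(name, raw, type)
  end

  def initialize(name, raw, type)
    @name = name
    @value_before_type_cast = raw
    @type = type
  end

  # Cast on first read and remember the result.
  def value
    return @value if defined?(@value)
    @value = @type.cast(@value_before_type_cast)
  end
end

attribute = SketchAttribute.from_user("id", "1", SketchIntegerType.new)
p attribute.value_before_type_cast  # => "1"
p attribute.value                   # => 1
```

With this shape, a casting bug has one obvious place to live: the type object that was handed to the attribute.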
00:10:22.880 Sage said that the code got
00:10:24.720 rearchitected into this more generic
00:10:27.120 way. Entire classes of bugs became just
00:10:30.000 impossible.
00:10:31.839 So for example, this one relatively
00:10:34.399 small PR just closed all of these issues
00:10:37.839 just by itself from this one refactor.
00:10:41.120 In another example of bug fixing, Sage
00:10:43.839 mentions how this feature just like
00:10:45.839 works now. So, if you were upgrading
00:10:49.200 your Rails project from 4.1 to 4.2 or
00:10:53.120 the beta for 5.0, you might have just
00:10:55.920 started having things work that you
00:10:57.360 didn't even know were broken.
00:11:00.160 I asked Sage, "What do you think was the
00:11:02.720 hardest trade-off you had to make in
00:11:04.160 this refactor?"
00:11:06.000 Initially, making the code maintainable
00:11:08.880 came at a performance cost.
00:11:11.839 So I previously mentioned how the
00:11:13.760 initial code resolved to two hashes. The
00:11:16.720 work in memory of allocating two hashes
00:11:19.200 is computationally small.
00:11:21.920 Depending on how many records you are
00:11:23.680 reading from the database, these
00:11:25.360 allocations can add up pretty quickly
00:11:27.440 when you're allocating Ruby objects
00:11:29.760 instead of just two hashes.
00:11:32.480 Allocating a lot of Ruby objects is
00:11:34.480 objectively going to do more work than
00:11:37.120 two hashes. So how many are we talking
00:11:40.000 about? So it's one attribute per column
00:11:43.839 per record. In the real world, your user
00:11:46.800 table is going to have a lot more than
00:11:48.480 one attribute. So let's say we have a
00:11:52.160 users table
00:11:54.480 and we have an admin page and we're
00:11:57.120 displaying 20 users and let's say you as
00:12:00.959 a developer are making a query and
00:12:02.880 you're showing three attributes but
00:12:05.040 maybe you don't select just those three.
00:12:07.279 Active record will grab all of them. Our
00:12:09.760 user has 40 columns.
00:12:12.560 So 20 users times 40 columns is 800 Ruby
00:12:16.639 allocations, which is objectively a lot
00:12:20.240 more than two hashes.
00:12:22.880 Um, multiply this by all of the
00:12:25.040 different records you're having to query
00:12:26.399 for. This adds up quickly.
00:12:30.720 So one of the low-hanging fruits for
00:12:32.800 performance gains was, instead of
00:12:34.399 allocating all the objects, changing the
00:12:37.040 code to only instantiate the objects
00:12:39.600 as needed.
00:12:41.519 For example, this would reduce our
00:12:43.440 example from 800 Ruby allocations to 60,
00:12:47.600 which is a lot better. This is that PR.
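The allocation arithmetic above can be checked with a small counting sketch. Everything here is hypothetical (`CountedAttribute`, `build_eager`, `LazyAttributes`), and it assumes the figure of 60 comes from 20 users times the 3 attributes the page actually reads; it is a toy model of lazy instantiation, not the change made in Rails.

```ruby
# Hypothetical attribute object that counts how many times it is built.
class CountedAttribute
  @allocations = 0
  class << self
    attr_accessor :allocations
  end

  attr_reader :value

  def initialize(raw)
    self.class.allocations += 1
    @value = raw
  end
end

# Eager: one attribute object per column, built up front.
def build_eager(row)
  row.transform_values { |raw| CountedAttribute.new(raw) }
end

# Lazy: keep the raw row and build an attribute object only on first access.
class LazyAttributes
  def initialize(row)
    @row = row
    @built = {}
  end

  def [](name)
    @built[name] ||= CountedAttribute.new(@row.fetch(name))
  end
end

columns = (1..40).map { |i| "col#{i}" }
rows = Array.new(20) { columns.to_h { |c| [c, "raw"] } }

CountedAttribute.allocations = 0
rows.each { |row| build_eager(row) }
eager_count = CountedAttribute.allocations # 20 users * 40 columns

CountedAttribute.allocations = 0
rows.each do |row|
  attrs = LazyAttributes.new(row)
  columns.first(3).each { |c| attrs[c].value } # the page shows 3 attributes
end
lazy_count = CountedAttribute.allocations # 20 users * 3 columns read

puts "eager: #{eager_count}, lazy: #{lazy_count}"
# eager: 800, lazy: 60
```

The count only tracks attribute objects; the lazy wrapper itself still costs one small object and one hash per record, which is the kind of tradeoff the talk describes.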
00:12:50.959 Sage said that they remember spending
00:12:53.120 several weeks just working on
00:12:54.480 performance, and they clawed it back to the
00:12:56.720 point where they don't think that 4.2
00:12:58.720 had significant performance
00:13:00.959 regressions. And they said that
00:13:04.959 working on open source projects was hard
00:13:07.680 because Rails is depended on by a lot
00:13:10.720 of different,
00:13:12.720 a lot of different users, and it's tricky
00:13:15.040 because there are various benchmarks
00:13:16.560 that they can run, but every application
00:13:18.720 is going to be different, so covering all
00:13:21.360 of those is really hard. And so I'm
00:13:24.399 pretty sure at one of the RailsConfs
00:13:27.200 before this was released, they had the
00:13:29.120 beta out and Sage was literally
00:13:30.720 running around: "Everybody, please try the
00:13:32.800 beta. I will pair with you if your app
00:13:35.920 gets slower." And they ended up pairing
00:13:37.920 with like a lot of people.
00:13:41.920 Um, I asked Sage for their thoughts on
00:13:44.000 the importance of maintainability.
00:13:46.560 They said, "I'm of the opinion that
00:13:49.120 if the code is buggy and unmaintainable,
00:13:51.760 it doesn't matter how performant it is.
00:13:54.160 If you're doing the wrong thing fast,
00:13:56.160 you're still doing the wrong thing."
00:13:59.600 Code that is hard to follow will attract
00:14:02.160 more bugs. When you make the code easy
00:14:04.880 to reason about, you make it more
00:14:06.639 stable. In open source projects, you
00:14:09.519 have a lot of hands touching the
00:14:11.680 project. And the more hands, the more
00:14:14.720 bugs it'll attract. So in
00:14:17.600 general,
00:14:19.519 Sage said you should try to leave the code
00:14:21.680 better than how you found it, for
00:14:24.000 yourself and for other people. And other
00:14:26.480 people includes you six months from now,
00:14:29.279 when you do git blame and
00:14:31.279 you're like, "Who wrote this? Oh, it
00:14:33.360 was me. I'm the problem, it's me."
00:14:37.920 Um
00:14:40.240 Sage is a big proponent of commit
00:14:42.560 messages as documentation.
00:14:45.040 Sage did this a lot by making
00:14:47.920 their commit messages have all of the
00:14:50.079 context that they were thinking of at
00:14:51.760 the time that they wrote their pull
00:14:53.360 requests.
00:14:54.880 Um, this isn't, like, an example of that.
00:14:57.920 This is like way too large to fit on a
00:15:00.560 screen. Uh, but famously they had a code
00:15:04.000 change that was like two lines and it
00:15:06.399 was like 19 paragraphs. So the the ratio
00:15:11.279 of context did not always meet the code,
00:15:15.279 but writing a talk like this was really
00:15:18.240 only possible given past Sage's context
00:15:21.199 at the time they left behind as
00:15:22.959 breadcrumbs.
00:15:25.279 Finally, I asked Sage for their thoughts
00:15:27.040 about contributing to open source
00:15:28.639 projects, especially ones like
00:15:30.800 Rails, but they had contributed to other
00:15:32.480 open source projects such as crates.io.
00:15:36.959 For context, Sage contributed a ton to Rails
00:15:39.680 over the years that this refactor
00:15:41.279 took place. I took the screenshot
00:15:43.360 recently and even though they haven't
00:15:45.600 contributed in a while, they're still
00:15:47.279 number 15 of all time. So, I'm pretty
00:15:50.160 proud of that for them.
00:15:52.720 Um, so stability is essential,
00:15:57.120 especially for open source projects that
00:15:59.279 a lot of people use. You don't want
00:16:02.000 upgrading Rails versions to be painful.
00:16:04.240 I think we've heard quite a few people
00:16:06.160 at this talk looking back on Rails
00:16:08.480 mentioning how painful it was to upgrade
00:16:11.440 projects from two to three. And
00:16:14.800 around the time five came out, the core
00:16:17.120 team started taking stability much more
00:16:19.680 seriously than in previous versions. If
00:16:22.560 your public APIs aren't stable and
00:16:25.199 upgrading is painful, people tend to
00:16:27.680 stay on the old versions. Open source
00:16:30.160 maintainers want people to upgrade. If
00:16:32.399 there's an important security fix, you
00:16:34.399 don't want to have to go back to older
00:16:36.320 versions and fix it when like you've
00:16:39.279 already fixed it in the current version.
00:16:43.199 And then I asked Sage, did they have any
00:16:46.079 tips for, like, how to be a
00:16:47.839 helpful contributor? And they said,
00:16:51.040 "Please don't just put an open issue into
00:16:54.079 Claude and then just, like, open a PR with
00:16:56.320 the results. Like, it's gonna get closed.
00:16:59.279 I'm sorry."
00:17:01.279 Um, please read the contribution guide.
00:17:03.839 They've written it for a reason and
00:17:05.839 they're usually pretty great to help you
00:17:07.839 get started.
00:17:09.919 Contribute how the project would like
00:17:11.600 you to contribute because oftentimes
00:17:13.520 contributing to open source can be
00:17:15.760 things like triaging issues and
00:17:18.079 reproducing issues. Um, people working
00:17:20.959 on open source projects are just people.
00:17:23.600 So, it's really good to just talk to
00:17:25.280 them if you're interested.
00:17:28.000 Don't be like Sage. Don't come out of
00:17:30.160 nowhere with a giant pull request and
00:17:32.400 expect it to be merged because it
00:17:34.720 probably won't. Thank you, Ruby.
00:17:38.720 Write code that's meant to be read by
00:17:40.480 another human because it will be right
00:17:43.919 now. Um,
00:17:46.880 Shout out to our friend's project, uh,
00:17:49.600 CodeTriage. You can sign up for it, pick
00:17:52.240 some open source projects and it will
00:17:54.000 send you issues that are beginner
00:17:56.160 friendly to work on if you want to get
00:17:57.600 started contributing to open source.
00:18:00.240 So I asked Sage if they had any final
00:18:02.240 thoughts and they said that they think
00:18:04.799 the most impactful work was refactoring
00:18:07.200 active record to make it easier to use
00:18:09.600 and they spent years of their life
00:18:11.760 working on this and I think it's really
00:18:13.760 cool to be able to celebrate that and
00:18:16.480 making the code more maintainable at the
00:18:19.039 last RailsConf. So I'm currently at
00:18:22.559 thoughtbot. Um, you can find me, Tess
00:18:25.520 at thoughtbot. Um, I just want to thank
00:18:27.520 them for letting me come here today and
00:18:29.360 talk about this. It feels full circle
00:18:31.280 'cause when Sage, uh, actually worked on
00:18:34.080 this, uh, they actually worked at
00:18:35.360 thoughtbot and they used thoughtbot's open
00:18:38.720 source, um, their investment time to
00:18:40.880 actually do this rewrite and so it feels
00:18:43.120 really special. So, thank you everybody.