00:00:18
Hi everyone. Before I start my talk today, I'm curious: who here works with Rails at work? Okay, most of us. That makes sense.
00:00:29
Put your hand up if you work with Rails 8. Oh, cool. Okay. Put your hand up if you work on Rails Edge. Yeah. Okay. Put your hand up if you work on Rails 7 or below. Okay.
00:00:42
Well, for everyone here, I'm hoping to convince you with this talk today to upgrade your Rails applications to Rails 8.
00:00:54
So, with that, I'll get started. bin/rails server. Many of us have used this command before. It boots the application and attaches it to an application server, and that's what many of us associate boot time with: the application server starting up and eventually serving our app.
00:01:12
But it turns out a lot of commands do this: server, console, routes, test, generate, all of these will boot your application. We use many of these every day, some more than others, and when your application is slow to boot, all of a sudden this can become quite painful.
00:01:26
And you might be thinking, okay, well, I can just use Spring, right? And Spring is a great gem, but it doesn't make your application boot faster. It makes your application boot less often. When you need to restart your computer, change configuration, or update a gem, you'll need to boot the application again.
00:02:09
And it was slowing down the Shopify monolith as well. Boot time was steadily increasing. People were waiting way too long, and something needed to change.
00:02:21
Hi, I'm Gannon. I'll be talking about some development speed optimizations in Rails that I made with my team last year. I'll be talking about file watching optimizations, routing optimizations, and initialization optimizations.
00:02:32
But before I do, I want to quickly talk about the command bin/rails boot. It was added in Rails 8.0 and isolates the booting part of Rails into a single command. If you run it, it appears to do nothing: it just boots the application and exits. But it has one notable use.
00:02:46
You can use it to run a boot time profile. You'll need a profiler first, though. I'll be using stackprof, but you can use something like Vernier as well. You can add stackprof to your application with bundle add stackprof. And if we add this snippet to config/environment.rb of our application, we should be able to profile boot time.
00:03:16
Let's walk through this code. First, we check for a STACK_PROF_MODE environment variable. Then we take that mode and check for a stackprof interval, defaulting it to 1000. Then we require stackprof and start it up. Note that stackprof is a sampling profiler, so the interval tells stackprof how often to collect data. You don't strictly need it, but it's nice to tune how much data your profile collects. When the Ruby process is exiting, we stop stackprof, get the data, and write it to a file. Now we can run bin/rails boot with STACK_PROF_MODE=wall to request a wall-time profile. This outputs a file to the Rails application's tmp directory.
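The snippet described above looks something like this. This is a sketch based on the talk's description, not the speaker's exact code; the environment variable names and output path are assumptions.

```ruby
# Hypothetical boot-profiling hook for config/environment.rb. Profiling
# only activates when STACK_PROF_MODE is set, so normal boots are untouched.
if (mode = ENV["STACK_PROF_MODE"])
  # Sampling interval; tune to control how much data the profile collects.
  interval = Integer(ENV.fetch("STACK_PROF_INTERVAL", 1000))
  require "stackprof"
  StackProf.start(mode: mode.to_sym, interval: interval, raw: true)

  at_exit do
    StackProf.stop
    # Passing a filename to StackProf.results writes the samples to disk.
    StackProf.results("tmp/stackprof-boot-#{mode}.dump")
  end
end
```

Booting with STACK_PROF_MODE=wall bin/rails boot would then leave a profile dump in tmp/ for a visualizer to read.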
00:04:17
Now we need a visualizer for this file, so we can use speedscope. I'll run npm install speedscope to get it from npm. Then you feed it the profile file from earlier. This is what a new Rails application looks like when you do that. If you're not familiar, this is called a flame graph. The graph has an X and a Y axis, and speedscope inverts the Y axis, so the data reads top-down. The X axis is time and the Y axis is stack depth; we graph stack frames over time. We can segment the graph into two sections. The first is the application's gems being loaded: the more gems you have, the longer this section takes, and most applications spend the majority of their boot time here. The second is Rails starting up: this is where Rails sets itself up and the initialization process happens. There's plenty to talk about in both sections, but today I'll focus on the second part and the changes we needed to make in Rails to make it faster.
00:04:56
Going back to the config/environment.rb file, we see that by default it has a require line and an initialize line. This reflects the two areas we saw on our flame graph: the gem require section and the Rails setup section. Note that requiring this environment file boots the application as a result. On new Rails applications this is pretty quick, but on one of the largest Rails applications in the world, not so much. The first thing we noticed that was a little slow for us was file watching. We can visualize file watching with a diagram: we have our developer, who starts a Ruby process. The Ruby process watches a set of files, and the developer edits those files at some point. Once those edits happen, the Ruby process detects that files have changed and does something. This probably makes you think of autoloading. We watch Ruby files in development for autoloading purposes, but we watch other kinds of files too, one of which is translations. Translations live in the config/locales directory of your Rails application, and we use the I18n gem to interpolate them into views and other places. Multilingual applications can have thousands of translations, and gems can add to that number, so the time it takes to sort through them all can quickly add up.
00:06:23
This is the code that powers local file watching in Rails. Let's walk through it. First, we grab the configured reloadable directories in Rails. Then we initialize a reloader, passing it the I18n load path and the directories. The I18n load path is a large array of translation files. When a reload event happens, it executes this block: we trim non-existent files from the I18n load path, ensure all reloadable load paths within Rails that exist are present in the I18n path, then execute a reload. We made a few changes to this code, some obvious, some not, but if you look at this under a profiler, things start to make sense. In an application with about 5,000 translations, it takes about 500 milliseconds to load everything, which is fairly long. Looking at the bottom of the graph, most time is spent doing file stat checks to see whether files exist, which is costly. Looking further up the stack, we see the ActiveSupport::FileUpdateChecker and its execute method are the problem. It's not the checker itself, but the block it executes that is costly because talking to the file system is expensive, especially with many files.
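To see why the block is expensive, here's a deliberately simplified update checker in plain Ruby. It is not Rails' ActiveSupport::FileUpdateChecker, just a toy with the same shape, and it makes the cost visible: every check performs a stat per watched file.

```ruby
require "tempfile"

# Toy mtime/size-based update checker. Each call to execute_if_updated
# stats every watched file, which is why watching thousands of
# translation files adds up.
class TinyUpdateChecker
  def initialize(files, &block)
    @files = files
    @block = block
    @last_snapshot = snapshot
  end

  # Runs the block if any watched file changed since the last check.
  def execute_if_updated
    current = snapshot
    return false if current == @last_snapshot

    @last_snapshot = current
    @block.call
    true
  end

  private

  # One File.exist?/File.mtime/File.size (stat syscalls) per file,
  # on every single check.
  def snapshot
    @files.select { |f| File.exist?(f) }
          .to_h { |f| [f, [File.mtime(f), File.size(f)]] }
  end
end

reloaded = false
locale_file = Tempfile.new(["en", ".yml"])
checker = TinyUpdateChecker.new([locale_file.path]) { reloaded = true }

File.write(locale_file.path, "en:\n  hello: Hello")
checker.execute_if_updated # => true: the file changed, so the block ran
```

With one file this is instant; with thousands of translation paths, the per-check stat traffic is exactly the file system cost the profile showed.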
00:08:18
How did we fix the problem? First, we filtered the I18n load path down to files within the Rails root. Translations outside the Rails root, such as those from gems, are unlikely to change during development, so we ignore them, significantly reducing the number of files watched. Second, we only delete missing files from the I18n load path if they start with the Rails root, avoiding unnecessary file stat checks. Profiling again shows a significant reduction in stat checks. But some remained, so we asked: why are we even running the file watcher on boot? The I18n load path should be clean on boot; if it isn't, there's likely a bug. Removing that line stops the checks on startup. Profiling here shows a 5x improvement: under 100 milliseconds now. Some file IO still occurs elsewhere, but it's much less. You can read about these optimizations in Rails PRs 52271 and 53259. Better I18n file watching is available from Rails 8.0 onwards.
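The filtering idea can be sketched with plain arrays. The paths here are made up for illustration; the point is simply that only files under the application root are kept for watching.

```ruby
# Only translation files under the application root are watched; gem
# translations won't change during a development session, so they are
# skipped, and no stat checks are spent on them.
app_root = "/srv/app"
i18n_load_path = [
  "/srv/app/config/locales/en.yml",
  "/srv/app/config/locales/fr.yml",
  "/gems/devise-4.9/config/locales/en.yml", # gem translation: not watched
]

watched = i18n_load_path.select { |path| path.start_with?(app_root) }
# watched == ["/srv/app/config/locales/en.yml",
#             "/srv/app/config/locales/fr.yml"]
```

The same root check gates deletions of missing files, so existence checks only ever touch in-root paths.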
00:10:56
The next slow area was Rails routing, which we can also visualize: the developer makes a request, which goes up the middleware stack, through the routes, to a controller. The controller processes data and returns a response, which travels back down through the routes and middleware stack as an HTTP response. Routes are defined in config/routes.rb, and this file is evaluated on boot, along with any engine routes or manually drawn routes. An example file with a root route, a resource, a health check, and some user routes looks quick to evaluate, but in larger applications it can take much longer. This is what the route loading logic looks like: it's another file watcher. We create a route reloader object, tell it whether the app is eager loaded, then call execute to load the routes.
00:13:05
In an application with about 3,000 routes, it takes about 460 milliseconds to draw routes, which is not ideal. The execute line is the slow part. We need our application's routes, but when exactly do we need them? Usually when Rails' route set is accessed: on requests, or when route helpers are used. There are other uses, but these are the main ones. So we built a smarter route set that loads itself lazily, using a subclass called LazyRouteSet. When its public API is called, a method override conditionally loads the routes, drawing them as late as possible. We also changed the route reloader logic: the execute method now sets a loaded boolean to true, letting the reloader detect whether routes are loaded, and a new execute_unless_loaded method draws routes only if they haven't been drawn yet. On boot, routes are now drawn eagerly only when the route set isn't lazy. The lazy route set is only used in development and test; production still draws routes at boot to keep requests fast and forking efficient. Profiling shows routes are drawn later, not during boot, improving boot time. Read more about lazy route sets in Rails PR 52353. This feature is in Rails 8.0+.
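The lazy-loading shape described above can be sketched like this. The class and method names here are illustrative stand-ins, not Rails' internal API; the real LazyRouteSet overrides the route set's public entry points.

```ruby
# Hypothetical sketch of lazy route drawing: nothing is drawn at boot;
# the first public access (a request, a URL helper) triggers the draw.
class LazyRoutes
  def initialize(&draw_block)
    @draw_block = draw_block # evaluates the routes file, returns routes
    @loaded = false
    @routes = {}
  end

  def loaded?
    @loaded
  end

  # The reloader calls this: draw only if nothing has drawn yet.
  def execute_unless_loaded
    return false if @loaded
    draw!
    true
  end

  # Public entry points trigger drawing as late as possible.
  def url_for(name)
    draw! unless @loaded
    @routes.fetch(name)
  end

  private

  def draw!
    @routes = @draw_block.call
    @loaded = true
  end
end

drawn = 0
routes = LazyRoutes.new { drawn += 1; { root: "/" } }
routes.loaded? # => false: "boot" finished without drawing anything
```

Only when url_for (or a request, in the real thing) touches the set does the expensive draw happen, and it happens exactly once.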
00:15:06
Finally, we noticed initializers were slow. Initializers are pieces of code run at boot, like ActiveSupport inflections: for example, telling the ActiveSupport inflector about the acronym GraphQL so it's capitalized properly when constants are resolved from class files. Initializers can be file-based or block-based, and block-based initializers can be named and ordered using before or after options, or left unordered. Because of this, Rails needs to sort initializers using a topological sort in the Initializable::Collection class, which subclasses Array. A topological sort orders nodes (initializers) according to their children (dependencies); each node is an element in the array. This sorting can be confusing, so here's an example: four elements (0-3) with arrows indicating dependencies. Implicit ordering means an initializer runs after the previously defined element unless a before constraint says otherwise. After adding the implicit constraints, we treat the array as a directed graph and sort it accordingly.
00:18:18
With this graph, we iterate through it. Init 0 comes first. Init 1 is next in the array, but its dependencies have to resolve first, and because init 2 runs before init 3, the sort emits init 2, then init 3, and finally init 1. The resolved order is 0, 2, 3, 1. On an application with about 2,000 initializers, this topological sort takes about 750 milliseconds, and the profile gives few other clues: the sorting itself takes most of the time. Why? Because 2,000 nodes, each scanning across 2,000 elements for its children, is about 4 million iterations, full of redundant array scans. The select method is the problem, as it iterates the entire collection array for each node. There's probably a better approach.
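The four-initializer example can be reproduced with Ruby's standard-library TSort. The dependency edges below (node => nodes it must run after) are one assignment consistent with the resolved order in the example; Rails' real constraint bookkeeping differs in its details.

```ruby
require "tsort"

# Toy graph of four initializers. Edge meaning: a node's children must
# run before it. 2 is constrained to run before 3, and the remaining
# edges encode the implicit "after the previous element" ordering.
class InitializerGraph
  include TSort

  def initialize(runs_after)
    @runs_after = runs_after
  end

  def tsort_each_node(&block)
    @runs_after.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @runs_after[node].each(&block)
  end
end

runs_after = { 0 => [], 1 => [3], 2 => [0], 3 => [2] }
order = InitializerGraph.new(runs_after).tsort
# order == [0, 2, 3, 1]
```

In Rails, tsort_each_child is where the costly select lived: finding a node's children meant scanning the whole collection array for every node.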
00:21:11
Instead, we changed the collection implementation: we added order and resolve hashes alongside the collection array. The order hash tracks names, and the resolve hash tracks initializers (since names aren't unique). Adding initializers is a bit more involved: we add to the collection array, map constraints into the order hash, and add initializers to the resolve hash. This makes iteration easier: we iterate each node in the collection array to preserve insertion order, and look up each child in the hashes. The sorting work is done on insertion, eliminating the redundant scanning. Profiling shows the topological sort's each_child sections disappear and the process is much faster. It's a rare case of doing the same work in a smarter way. Read more about the faster initializer sort in PR 53615. This is not released yet but will be in Rails 8.1.
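The indexing idea boils down to this. Names and structure here are illustrative, not the actual Rails patch: the point is that indexing by name at insertion time turns child lookup from an O(n) array scan into an O(1) hash lookup.

```ruby
# Sketch of a hash-indexed initializer collection: the array preserves
# insertion order for iteration, while the resolve hash answers
# "which initializers have this name?" without scanning.
class IndexedCollection
  attr_reader :collection

  def initialize
    @collection = []                         # preserves insertion order
    @resolve = Hash.new { |h, k| h[k] = [] } # name => initializers
  end

  def add(name, initializer)
    @collection << initializer
    @resolve[name] << initializer # names need not be unique
  end

  # O(1) child lookup, replacing the O(n) select over the array.
  def lookup(name)
    @resolve[name]
  end
end

c = IndexedCollection.new
c.add(:set_clock, "init A")
c.add(:set_clock, "init B") # duplicate name, both kept
c.add(:finisher,  "init C")
c.lookup(:set_clock) # => ["init A", "init B"]
```

The work of indexing happens once per insertion instead of once per node per sort, which is why the each_child hotspots vanish from the profile.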
00:23:16
I've covered a lot, so let's recap why it matters. You can, and should, profile your application's boot time; you might find and fix performance regressions. Translations are watched during development, and this is a lot faster in Rails 8. Routes are lazily loaded locally in Rails 8, saving time during boot and development. Initializer sorting will be much faster in Rails 8.1. Through these and other optimizations, we reduced the Shopify monolith's boot time from about 12.5 seconds to about four seconds, and as of this week, closer to 3.5 seconds. Thank you.
00:24:56
Specifically, watcher setup and route drawing times are significantly reduced. Because so much of this work was done openly on Rails, your application should be faster too. All you need to do is upgrade. Thank you.
00:25:28
We have about five minutes left for questions if anyone has any.
00:25:43
Question: What's still slow, what's left to improve? Good question. Rails initialization is in a good place now. The other area is loading the application's gems, which is more complicated because applications have many dependencies. We're in a good position, but dependencies can cause performance regressions since the code isn't always ours, even at Shopify. Rails initialization is also tricky because of load hooks: people sometimes load constants too early, which triggers loading many other things. That's another area to watch.
00:26:41
Question: How will you maintain these improvements over time? Do you have a ratchet or monitoring? Great question. How do we maintain order in a large application with thousands of developers working daily? We have CI checks to ensure load hooks aren't fired too early. You can prepend a check to a load hook, asserting something like "ActiveRecord::Base is not loaded yet", and raise if it is, which gives early feedback in CI; you can also run these scripts locally. This has helped us a lot. For preventing problematic dependencies, you can add more elaborate scripts that detect constants loading at the wrong time, but it's an uphill battle. Regularly profiling your application, or automating that profiling with periodic reviews, is a good approach. Does that make sense? Thank you.
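A minimal version of that boot check, in the spirit described above, could look like this. This is a plain-Ruby stand-in, not Shopify's actual script: the helper name is made up, and "HeavyFramework" stands in for a real constant like ActiveRecord::Base.

```ruby
# After booting the app (but before anything handles a request), assert
# that heavy constants haven't been loaded yet. If one has, a load hook
# probably fired too early, and CI should fail loudly.
def assert_not_loaded!(constant_name)
  return unless Object.const_defined?(constant_name)

  raise "#{constant_name} was loaded during boot; " \
        "a load hook probably fired too early"
end

assert_not_loaded!("HeavyFramework") # passes: nothing has defined it yet
```

Running such a script in CI, and locally before pushing, catches early-loading regressions the moment they are introduced rather than months later in a profile.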
00:27:20
Question: For applications that use Rails engines heavily, does that multiply the problems? Engines are like tiny applications. It depends on where they come from: at Shopify we use many local engines, so the blast radius is like one gigantic application. If engines or their dependencies come from gems, or live outside the Rails root, then yes, performance can be a problem, much like upgrading gems. So depending on where your engines are sourced and who owns the code, it can be more complicated.
00:28:30
Question: Is there any reason you can't apply these changes to Rails 7 or 6.x? You'll need to talk to the Rails core team about the maintenance policy; I can't answer that. Thank you.
00:29:37
No more questions. Feel free to stop by the Shopify booth. I'll be there intermittently throughout the conference. Or if you catch me in the hallway, please say hi. Thank you so much.