I'm Alex Cumins, a software engineer on the infrastructure team at Persona. For the past several years, I've been part of the team responsible for scaling and evolving the architecture behind our identity platform. These days, I help design and maintain the globally distributed, multi-cluster Kubernetes setup we operate today. Although I wasn't around for the very first commits by our founders, Rick and Charles, I know the challenges they faced, like many of y'all, in taking this from an initial idea to a fully functional Rails application and product. But before we dive in, let me first give you a bit of context on who we are at Persona and the kind of problems our platform is built to solve. Persona is an all-in-one identity platform: think onboarding, compliance, fraud prevention. We power the behind-the-scenes workflows that ensure individuals and businesses are who they say they are. Our platform's flexibility is what makes us stand out and applicable to so many different use cases.
We work with customers across a wide range of industries: fintech, healthcare, marketplaces, crypto, travel, government, and more. And each one has their own unique set of compliance requirements, user flows, risk tolerances, and regional regulations. That means almost every customer uses our platform in a slightly different way, and as such, we've had to architect for adaptability, a choice that has influenced almost every part of our stack. And with that context, let's rewind the clock and walk through how our Rails architecture has had to evolve to support that level of flexibility, scale, and diversity of use.
Persona started the same way so many other companies have: with an idea and a command, one we've all used to bootstrap our grand visions. But that simple command comes with a plethora of options behind the scenes, some of which are innocuous early on but carry deep implications for how your system will scale, evolve, and operate years down the line. In our case, we launched on Google App Engine, which gave us just enough infrastructure to move fast without worrying too much about provisioning or deployments. If you're not familiar, it's a fully managed platform that abstracts away most of the operational complexity, similar to Heroku or a lightweight subset of what Kubernetes offers out of the box.
We started on Rails 5.2, well into the maturity of many Rails features. And given that was a little over seven years ago, here's a refresher on some of the major features that launched then. We'll come back to a few of these and how we manage scaling with them. But first, let's talk about one that hit early and often: the asset pipeline.
Let's time travel back to 2010, a time when jQuery was your best friend. These are screenshots of some actual code I wrote in a Rails 2.3 application. You'd manually include JavaScript files in your templates. And yes, your actual JavaScript logic would often live right in your view templates, tightly coupled to the markup it was enhancing, a pattern that's come back in style. In 2011, Rails 3.1 introduced the integrated asset pipeline, and it was a game-changer. It gave us a structured way to organize, bundle, and minify assets. In the lower example, we include a file named user_tabs.js to be executed on the page. But what happens when that file changes? We generally want browsers to cache script content for performance reasons, but if the file name stays the same, users might keep getting the old version even after we've deployed new code. The asset pipeline solved this with fingerprinting: appending a unique hash to the file name. With a new file name, the browser requests the file, giving us the best of both worlds: long-term caching when things don't change and instant updates when they do.
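For anyone who hasn't touched Sprockets in a while, here's a minimal sketch of what that looks like in practice; the user_tabs name comes from the example above, and the fingerprint hash shown is purely illustrative.

```erb
<%# app/views/users/show.html.erb: the helper resolves the logical asset name %>
<%= javascript_include_tag "user_tabs" %>

<%# Rendered output: the fingerprint changes whenever the file's contents change,
    so browsers can cache aggressively and still pick up new deploys immediately.
    <script src="/assets/user_tabs-2f5173db2b9d.js"></script> %>
```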
The integrated asset pipeline has evolved significantly, but it originally meant Sprockets, with defaults of CoffeeScript and Sass, languages that compile to JavaScript and CSS, respectively. And of course, as we already saw, a healthy dose of jQuery, which was the go-to solution for modern front-end interactivity at the time. Options for what we now consider a full-fledged front-end framework were limited. The first release of what would eventually become Ember came later in 2011, led by Yehuda Katz, a member of the Ruby on Rails core team. The front-end landscape quickly exploded in the following years: new frameworks, modern JavaScript features (many of which were influenced by CoffeeScript), richer UIs, and rising expectations from users and product teams alike. Ember, along with other fledgling frameworks like AngularJS and React, shifted more of the UI into the browser, and with that, Rails increasingly took on the role of an API provider rather than a full-page renderer.
With that explosion, the other major responsibility left on Rails was coordinating with a front-end build system in the asset pipeline. That resulted in the Webpacker gem, a wrapper around the npm webpack package, which was added as an option with Rails 5.1 in 2017 and became the default with Rails 6.0 in 2019. Persona started squarely in the middle of that evolution, starting with React and TypeScript through Webpacker. Webpacker set out to connect Rails with modern front-end tooling, and for a while it did. But as complexity grew, we started to see the cost: hard-to-debug configurations, slow feedback loops, and lagging support for emerging tools. The breakneck pace of front-end innovation made it nearly impossible for Webpacker to keep up, turning what was meant to be a bridge into a constant game of catch-up, all the while struggling to reconcile Rails's opinionated defaults and focus on quick implementation with the extensibility and configurability of modern build systems. In the end, we moved from Webpacker to Shakapacker, and now we've adopted Vite, a modern native JavaScript build tool that's fast, flexible, and designed for today's front-end workflows. And that's been a recurring theme for us: Rails gives you great defaults, but you're not locked in. When the built-in tools no longer fit your scale, your team, or your workflows, it's okay to step outside the box and bring in what works.
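To make the end state concrete, here's a minimal sketch of what the view layer looks like with the vite_rails gem; the helper names come from that gem, while the entrypoint name is just an illustration, not our actual configuration.

```erb
<%# app/views/layouts/application.html.erb %>
<%= vite_client_tag %>                    <%# enables Vite's dev server / HMR in development %>
<%= vite_javascript_tag "application" %>  <%# points at an entrypoint such as app/frontend/entrypoints/application.ts %>
```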
As we started to grow, we started to hit the natural limits of Google App Engine. It had served us well during our early days, giving us speed and simplicity when we needed it most. But eventually, the trade-offs became too hard to ignore. We needed more flexibility around how we structured services, or really, just to easily deploy multiple services at all. To set the stage, picture what scaling looked like in our early days: a service would spike, an alert would fire, and someone would jump in and manually scale things up or down. This is a slide from one of our all-hands meetings in April of 2020, showing five straight days of manual scaling operations, scaling services up and down, sometimes multiple times a day, just to keep things running smoothly. That moment, looking at this slide and realizing how much of our energy was going into just keeping the lights on, was a clear signal it was time to grow into something more sustainable.
So, we knew we had to move off of Google App Engine. But what options were actually viable? The most basic option would be to run raw virtual machines, maybe wrapped in Terraform and managed with Ansible or some other homegrown tooling. Technically, that probably would have worked, but it would have meant taking on a ton of operational complexity ourselves, solving problems that much more mature tools had already solved. Another option was to leave GCP entirely and move to AWS or Azure for a different platform as a service. But realistically, that wouldn't have guaranteed a solution to any of our core issues, and it would have added a massive migration on top of an already complex problem. After weighing the options, we decided on something that gave us the control we needed without starting from scratch: Kubernetes via GKE. GKE is Google Cloud's managed Kubernetes offering. It handles the heavy lifting of cluster provisioning, upgrades, and node management while still giving us substantial operational control.
Making the jump from Google App Engine to Kubernetes wasn't just a change in deployment systems. It was a fundamental shift in how we thought about infrastructure. App Engine handled most of the heavy lifting for us: provisioning, scaling, networking, even deployment, all abstracted behind a few CLI commands. But that simplicity came at the cost of control. Migrating to Kubernetes gave us flexibility, observability, and granular control, but required a maturity leap in tooling and practices, because it asked us to take ownership of every part of the stack. From networking and observability to deploy workflows and access control, we suddenly had a lot more flexibility and a lot more responsibility. Let's take a side-by-side look at how each platform handled the key components of our infrastructure and what changed when we made the switch. On App Engine, you simply push code and Google takes care of the compute; no servers to provision or orchestration tools to configure. In Kubernetes, you manage the full life cycle of containers and the nodes they run on. Depending on the cloud vendor, or on-prem, that can vary from relatively easy with managed services like EKS and GKE to fully hands-on if you're running your own control plane and node infrastructure.
App Engine also doesn't support GPUs or other specialized compute resources, which have become increasingly common as modern workloads have exploded in popularity and utility. Those now power critical parts of Persona's platform, like document analysis, biometric matching, and real-time image processing.
When it comes to controlling your application's scaling behavior, App Engine allows you to define targets for CPU utilization and concurrent requests, but that's about it. Kubernetes, on the other hand, gives you fine-grained control, with the ability to look at both system and custom metrics, in addition to being able to scale both horizontally and vertically. It even supports custom scaling logic through integrations with external metrics APIs and other controllers, making it highly extensible, whether you're scaling based on queue depth, request latency, webhooks, or any other signal relevant to your application. As we alluded to earlier, this flexibility was a key driver in our migration to Kubernetes. App Engine's scaling model is heavily geared towards request-response web traffic, and it didn't handle background job processing, like what we do with Sidekiq, very well. We needed more control over how and when workers scaled, especially under our very bursty workloads, and Kubernetes gave us the tools to do that. I'll be honest, though: our move to Kubernetes didn't immediately eliminate the manual scaling. It took time, experience, and a bit of patience to craft horizontal pod autoscalers that met our needs. But once we got there, it changed the game. The system finally started working with us, not waiting for us to catch up.
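As an illustration of the kind of signal that matters for Sidekiq-style workloads, here's a minimal Ruby sketch, not our actual setup, that reads queue depth and latency from Sidekiq's API so that something like a custom-metrics-fed autoscaler or KEDA could scale workers on job pressure instead of CPU; the queue names are made up.

```ruby
require "sidekiq/api"

# Queue names below are illustrative; swap in whatever your workers consume.
QUEUES = %w[default critical low].freeze

def sidekiq_scaling_metrics
  QUEUES.map do |name|
    queue = Sidekiq::Queue.new(name)
    {
      queue: name,
      depth: queue.size,              # jobs currently enqueued
      latency_seconds: queue.latency  # age of the oldest job waiting in the queue
    }
  end
end

# Emit in whatever format your metrics pipeline expects; plain text for brevity.
sidekiq_scaling_metrics.each do |metric|
  puts format("sidekiq_queue_depth{queue=%p} %d", metric[:queue], metric[:depth])
  puts format("sidekiq_queue_latency_seconds{queue=%p} %.1f", metric[:queue], metric[:latency_seconds])
end
```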
Networking, like compute, is fully managed by App Engine. There are no load balancers to configure unless you'd explicitly like to do so, and simply deploying your application gives you a publicly accessible endpoint out of the box. With Kubernetes, you're empowered with services, ingresses, gateways, and a host of related objects and configuration knobs. You can build complex load balancing strategies with traffic routed across multiple services, paths, or backends, all without leaving the Kubernetes ecosystem. But with that power comes responsibility. You now have to manage DNS, TLS, health checks, firewall rules, and more, all of which can add operational overhead if not carefully designed and properly configured. It's not uncommon to see deployments missing critical pieces like readiness probes or ingress annotations, leading to flaky traffic routing, failed rollouts, or subtle production issues.
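On the Rails side, the minimum those probes need is an endpoint that tells Kubernetes whether the app can take traffic. A rough sketch, assuming illustrative paths rather than anything Persona actually runs (newer Rails versions also ship a built-in `rails/health#show` route at `/up`):

```ruby
# config/routes.rb
Rails.application.routes.draw do
  # Liveness: the process is up and can serve a trivial response.
  get "/healthz", to: ->(_env) { [200, { "content-type" => "text/plain" }, ["ok"]] }

  # Readiness: only report ready once critical dependencies respond,
  # so Kubernetes won't route traffic to a pod that can't reach the database.
  get "/readyz", to: lambda { |_env|
    begin
      ActiveRecord::Base.connection.execute("SELECT 1")
      [200, { "content-type" => "text/plain" }, ["ready"]]
    rescue StandardError
      [503, { "content-type" => "text/plain" }, ["not ready"]]
    end
  }
end
```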
App Engine makes observability effortless. Logs and metrics are automatically captured and integrated into Google Cloud's monitoring tools with minimal setup. It's simple, consistent, and good enough for many use cases right out of the box. In contrast, Kubernetes gives you a blank slate. You have the freedom to plug in managed observability platforms like Datadog or build out your own stack with tools like Prometheus. That freedom is powerful, but it also means you're responsible for wiring it all together, deciding what to measure, and making sure nothing falls through the cracks.
So, while the move to Kubernetes gave us the control and flexibility we needed to scale our infrastructure, it also came with new complexity that we had to learn to manage carefully. Our deployment infrastructure wasn't the only thing that had to evolve. As our usage grew, one of the next places we felt real pressure was in our database layer. As our product and customer base evolved, so did our data, in volume, structure, and complexity.
In mid-2022, we began sharding our application to address the growing pressure on our primary MySQL cluster, starting by adding a second shard in the same compute location. And just to keep things interesting, we kicked off work at the same time to add a third shard, this time in Europe, driven by data residency requirements that called for isolating customer data within specific jurisdictions. In the span of just six months, we went from one database and one Kubernetes cluster in one region to three shards across two regions and an additional Kubernetes cluster to support it all.
Rather than relying upon a single database, we use a combination of MySQL and MongoDB as our primary data stores, along with Elasticsearch for search and indexing workloads and Redis for caching, Sidekiq queues, and other ephemeral data. Each of these systems brings its own strengths and its own operational challenges, especially in cloud-managed environments. Choosing the right one is only half the battle. Scaling, tuning, and managing them in production is where the real work begins. While MongoDB offers native support for sharding, making horizontal scaling more straightforward, MySQL posed a much harder challenge. Sharding our relational data meant untangling assumptions deeply buried in our application code and schema. And that's where we'll focus next.
Rails has only recently started offering official support for sharding, but applications have been hacking around that limitation for years. Rails 6.0 added support for configurable database connections by model, allowing applications to route specific models, or even reads versus writes, to different database instances using the connects_to and connected_to APIs. This is effectively what's known as vertical sharding, where you split entire tables or domains across databases. One database might handle user data, another might handle payments or audit logs. It's relatively straightforward because the location of each type of data is fixed; you always know which database to query based on the model. Importantly, though, this laid the groundwork for what we typically think of when we say sharding: horizontal sharding, splitting rows of the same table across multiple databases, each holding a different slice of the data but sharing the same schema.
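As a rough sketch of that vertical split (the database and model names here are illustrative, not Persona's actual schema), the Rails 6.0 API looks like this:

```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  connects_to database: { writing: :primary, reading: :primary_replica }
end

# Payment-related tables live on their own database instance.
class PaymentsRecord < ApplicationRecord
  self.abstract_class = true
  connects_to database: { writing: :payments, reading: :payments_replica }
end

class Payment < PaymentsRecord
end

# Reads can be routed to the replica role explicitly:
ActiveRecord::Base.connected_to(role: :reading) do
  Payment.where(status: "settled").count
end
```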
Long awaited, Rails 6.1 introduced native horizontal sharding. It was finally relatively easy to support multiple shards of the same model in your application. When we set out to shard our MySQL cluster, we weren't starting with a clean slate. We were adapting a growing, rapidly changing Rails application to a pattern the framework had just recently begun to support, and as you can imagine, that came with its own set of surprises.
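For reference, here's a minimal sketch of the Rails 6.1 horizontal sharding setup; the shard names and the Account model are made up, but the shape is the same: the same schema on every shard, selected per request or per job.

```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to shards: {
    default:   { writing: :primary,   reading: :primary_replica },
    shard_two: { writing: :shard_two, reading: :shard_two_replica },
    shard_eu:  { writing: :shard_eu,  reading: :shard_eu_replica }
  }
end

# Every query inside the block runs against the chosen shard.
ActiveRecord::Base.connected_to(shard: :shard_eu, role: :writing) do
  Account.find_by!(token: "act_123").touch(:last_seen_at)
end
```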
Rails's connected_to helper is an essential building block for sharding. It lets you swap the database connection on the fly based on context. Think of it like a railway system: each shard is a different destination, and connected_to is the track switch. Before the train (your request, job, or rake task) leaves the station, you need to flip the switch to send it down the right track. If you forget, or flip the wrong one, your data ends up at the wrong terminal, or worse, on a collision course with something else. And just like in a real railway system, you can't expect the train to figure it out mid-route. This context has to be set up front. In practice, this means your codebase needs to use that building block everywhere you're interacting with data. No small feat, even in a midsize Rails application.
Threading shard context through an app isn't necessarily hard in isolated cases, but it requires discipline and consistency. For job processing, it's relatively straightforward, since you've already looked up the shard by querying the record. In our case, we added a query parameter to the GlobalIDs of objects passed to jobs which indicated the correct shard, allowing the job to reconnect to the right database when it runs.
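A rough sketch of that pattern, assuming hypothetical job and model names rather than our actual code: GlobalID lets you attach extra params to the URI, and the job reads the shard back out before it loads anything.

```ruby
class VerifyDocumentJob < ApplicationJob
  def perform(gid_string)
    gid = GlobalID.parse(gid_string)
    shard = (gid.params["shard"] || "default").to_sym

    # Switch to the right database before touching the record.
    ActiveRecord::Base.connected_to(shard: shard, role: :writing) do
      document = GlobalID::Locator.locate(gid)
      document.verify! # illustrative domain method
    end
  end
end

# Enqueueing: the caller already knows which shard it loaded the record from.
gid = document.to_global_id(shard: ActiveRecord::Base.current_shard.to_s)
VerifyDocumentJob.perform_later(gid.to_s)
```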
For web requests, though, it's a bit trickier. You're now forced to rethink what information is required to route a request and make sure that shard context is both available and trustworthy by the time the controller code runs. Take a public API as an example. You're probably identifying requests with an API key. That becomes your routing key, a piece of context that tells you which shard the request should go to. But that's only half of the equation. You now need some kind of directory or lookup table, a centralized way to map that routing key (API key, object token) to the correct shard. Notice the find_shard call in this example from our application. Without that layer of indirection, you're left hard-coding assumptions into your app, and that just doesn't scale. And here's where things get even more interesting. That lookup table might need to scale far more than you'd initially expect. In many cases, you're supporting APIs with unchangeable contracts. Maybe they're embedded in physical hardware, IoT devices, or distributed SDKs in mobile apps that can't be easily updated. That means every request to your platform, even the very first one, needs to hit the right shard, with no opportunity for client-side logic to help.
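Stitched together, the request path looks roughly like this sketch; the find_shard name echoes the call on the slide, but the directory model and key handling here are assumptions, not our actual implementation.

```ruby
class Api::BaseController < ActionController::API
  around_action :route_to_shard

  private

  def route_to_shard(&block)
    shard = find_shard(request.headers["Authorization"])
    ActiveRecord::Base.connected_to(shard: shard, role: :writing, &block)
  end

  def find_shard(api_key)
    # Centralized directory mapping routing keys (API keys, object tokens) to shards.
    entry = ShardDirectory.lookup(api_key) # hypothetical lookup-table client
    raise ActionController::BadRequest if entry.nil?

    entry.shard_name.to_sym
  end
end
```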
As a result, what looks like a simple lookup turns into a high-throughput, low-latency critical code path, one that needs to be highly available, globally accessible, and fast enough to sit in front of any user-facing request. At Persona, we solved this by backing our lookup table with MongoDB, which makes it very easy to support globally distributed read replicas with minimal operational overhead. That allowed us to serve shard routing lookups close to the user no matter where the request is processed. Because the routing logic sits in the critical path of almost every request, especially in our public and SDK-facing APIs, having that data available fast and everywhere was non-negotiable. As of last week, we had just over a billion entries in that lookup table.
Long before we even needed to shard, we were already feeling the pressure of working with large MySQL tables. As usage grew, certain tables ballooned in size, and that brought a new class of problems: slow queries, painful migrations, unpredictable query plans, and operational risk from even simple schema changes. And while sharding helps you scale horizontally, it doesn't eliminate those problems. In fact, it can make them even harder to manage. Now you're not just maintaining one large table. You're maintaining that same large table across multiple shards. Every schema change, every index tweak, and every performance fix now has to be repeated across N databases.
Schema changes on large MySQL tables can be deceptively dangerous. By default, operations like ADD COLUMN, MODIFY, or DROP INDEX can lock the table, block reads or writes, and introduce unexpected performance regressions, especially if that table is in the critical path. For a long time, tools like Percona's pt-online-schema-change and LHM, the Large Hadron Migrator originally open-sourced by SoundCloud and now maintained by Shopify, have sought to bridge that gap. For extremely large tables, though, that can result in migrations taking weeks, potentially stalling the work of other engineers, or changing query planning results and slowing down unrelated parts of your application. Recent versions of MySQL, particularly 8.0, support more instant DDL operations, like adding and removing some columns or modifying default values, without requiring full table rebuilds.
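One way to take advantage of that, sketched here with an illustrative table and column rather than anything from our schema, is to request the algorithm explicitly so the migration fails fast instead of silently rebuilding (and locking) a multi-billion-row table:

```ruby
class AddReviewedAtToDocuments < ActiveRecord::Migration[7.0]
  def up
    # MySQL 8.0 applies ADD COLUMN as an instant metadata change when allowed;
    # if INSTANT isn't possible, this statement errors instead of falling back
    # to a long, locking table copy.
    execute <<~SQL
      ALTER TABLE documents
        ADD COLUMN reviewed_at DATETIME NULL,
        ALGORITHM=INSTANT
    SQL
  end

  def down
    remove_column :documents, :reviewed_at
  end
end
```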
That still leaves things like index creation requiring full rebuilds, so that's where access pattern design becomes essential. One of the biggest challenges with large MySQL tables is that small inefficiencies at scale really start to hurt. A missing index, a poorly chosen primary key, or an unexpected query plan might be invisible with 100,000 rows, but with 100 million, it becomes a problem you can't ignore. It's not enough to model your schema around the shape of your data. You have to model around how that data will be queried, filtered, and joined in real application usage. If you've worked with NoSQL systems like DynamoDB or MongoDB, this mindset may already be familiar: you have to design your schema around your queries, not your entities. In relational databases, that kind of upfront thinking is often overlooked, partly because you can get away with it, especially early on, and it lets you build faster. But as tables grow and usage scales, those early shortcuts start turning into real pain.
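As a small, hypothetical example of designing for the query rather than the entity: if the hot path is "recent records for an account", a composite index that covers both the filter and the sort keeps the plan stable as the table grows, where an index on the foreign key alone would not. The table and column names below are made up.

```ruby
class AddAccountRecencyIndexToVerifications < ActiveRecord::Migration[7.0]
  def change
    add_index :verifications, [:account_id, :created_at],
              name: "index_verifications_on_account_and_recency"
  end
end

# The access pattern the index is shaped around:
Verification.where(account_id: account.id)
            .order(created_at: :desc)
            .limit(50)
```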
Large tables also introduce challenges when teams need to run backfills. Not just because they take a long time, but because they can unintentionally impact performance. Depending on how the backfill is executed, it can evict hot pages from the MySQL buffer pool, alter index statistics, or disrupt caching behavior, all of which can degrade query performance in unpredictable ways.
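The usual mitigation, sketched here with an illustrative model and column, is to batch the writes and throttle between batches so the buffer pool and replicas get room to breathe:

```ruby
require "digest"

class BackfillDocumentChecksums
  BATCH_SIZE = 1_000
  PAUSE_BETWEEN_BATCHES = 0.5 # seconds; tune against replica lag and buffer pool churn

  def call
    Document.where(checksum: nil).in_batches(of: BATCH_SIZE) do |batch|
      batch.each { |doc| doc.update_columns(checksum: compute_checksum(doc)) }
      sleep PAUSE_BETWEEN_BATCHES
    end
  end

  private

  def compute_checksum(doc)
    Digest::SHA256.hexdigest(doc.storage_key.to_s) # placeholder computation
  end
end
```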
Of all the large tables at Persona, the top two come from a familiar Rails component.
Active Storage was introduced to simplify interactions with files stored in cloud object stores like S3 or GCS, and it provides a flexible attachment system that makes it very easy to associate files with any Active Record model. It's built around two main tables: blobs, which represent the actual files in the object store, and attachments, a polymorphic join table that connects those blobs to an application record. At Persona, these two tables are the top two by row count in our application. Since each file is attached to exactly one record, their row counts are nearly identical, at around 3.4 billion. That makes them the perfect storm: they're huge, they're hot, and they're hard to touch, which becomes especially painful when you need to backfill metadata, migrate attachments, or optimize access patterns. And modifying models outside of your application, whether they're Rails components or external gems, is particularly tricky. Given these challenges, we've started migrating to Shrine, which takes a more lightweight, modular approach, using fields directly on individual records to track file attachments instead of requiring a separate join model.
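The contrast, with illustrative model and uploader names (the two Document definitions are alternatives, not code you'd keep side by side), looks roughly like this: Active Storage routes every file through the shared blobs and attachments tables, while Shrine keeps the metadata in a column on the owning record, so there's no multi-billion-row join table to migrate.

```ruby
# Active Storage: rows accumulate in active_storage_blobs and active_storage_attachments.
class Document < ApplicationRecord
  has_one_attached :scan
end

# Shrine: a `scan_data` JSON/text column on documents holds the file metadata instead.
class ScanUploader < Shrine
  # plugins (validation, derivatives, etc.) would be configured here
end

class Document < ApplicationRecord
  include ScanUploader::Attachment(:scan) # expects a `scan_data` column on documents
end
```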
Sometimes, though, it's not the core of your system that causes the most pain. It's the abstractions you thought you didn't need to think about.
We've been talking a lot about MySQL, and that's intentional. It's the database powering many Rails applications. But many of these lessons aren't unique to MySQL; you'll encounter them with any relational database. At scale, every data store brings its own set of challenges, and we've faced them all. We've dealt with hot shards on MongoDB, fought through index tuning and cluster pressure on Elasticsearch, and even had to shard Redis to keep up with Sidekiq throughput. The reality is no database stays easy forever. Once you're operating at scale, even the managed parts of your stack demand careful planning, constant tuning, and a good dose of humility. While I'd love to unpack all those war stories, we simply don't have time today. You might be wondering, though: where's the simplicity in all this? That leads me into what we're working on now, an intentional return to simplicity.
You might be wondering why this slide says one Kubernetes cluster and one MySQL cluster. After everything we've talked about, that probably sounds backwards. We didn't shrink. We didn't suddenly stop needing to scale. What we did was simplify. We took everything we learned, the patterns, the guardrails, the wins, the pain points, and restructured it into a single consolidated deployment model designed for reuse, with strong tenancy boundaries and predictable growth curves. It's not a step backwards. It's the product of years of experience teaching us that for the way we scale, simplicity might be the only way to do it without losing your mind.
This architectural shift is a project we call Stacks. The idea was simple: instead of scattering complexity across multiple clusters, databases, and other systems, we define a single self-contained unit that could run our full platform. And then we replicate it, again and again and again.
In some ways, this architecture resembles what was often called single tenancy, where each customer gets their own isolated environment. In our case, though, it's a bit more nuanced. Each stack isn't necessarily tied to one customer. It's more like a self-contained runtime that can host many tenants, but with strong boundaries between the stacks themselves. So, while we borrow some of the benefits of single tenancy, like isolation, blast radius reduction, and operational flexibility, we don't take on the overhead of spinning up a new environment for every single customer; it's essentially a middle ground. While the main components of each stack are isolated, including their own databases, there are still a few database-backed services that we share across all customers. Chief among them is the lookup table we discussed earlier, which helps route requests to the correct stack. We call these "cores": centralized systems that power critical functionality across all our environments, while everything else remains stack-specific.
Routing a request to the right stack isn't all that different from what we had to do with sharding. It has to be correct from the very beginning. Just like with database sharding, there's no room for ambiguity. Once a request hits our edge, we need to know exactly which stack should handle it. If we get it wrong, the request simply fails. So, this isn't just a routing concern; it's a critical correctness boundary. Let's walk through a real-world example to see how all this comes together. Though, just to be clear, we're pretending this is the actual map of all of our edge and main compute locations. The real one is a bit too dense and not nearly as slide-friendly, but this gives you the general idea, hopefully. Consider the green triangles to be our edge locations and the coral cans as our main compute. Say you were to make a request from here in Philadelphia.
That request would get routed to the nearest edge location. That might be as close as down the street or a few hundred miles away; it kind of depends on how the internet's behaving that day. Let's say that happens to be in Virginia, a pretty short hop at light speed. To make stack routing work, we run code at the edge, as close to the user as possible. This layer inspects each incoming request and parses out the key routing metadata: things like object tokens, API keys, and sessions. We'll first attempt to look that key up in a local cache. For things like API keys, where callers are typically isolated to one or two locations, we see a really high cache hit rate, which means we can route that request to the correct stack almost instantly. In the roughly 7% of the time that we miss, like for routing keys that have callers spread across many locations and don't frequently repeat, we'll query the lookup table in MongoDB. We have read replicas distributed across the globe, so we're able to make decisions in under 150 milliseconds for 95% of those cache-miss requests. Now that we've determined where your request should go, we'll actually route it there.
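In pseudocode terms (sketched in Ruby for readability; our edge layer isn't necessarily Ruby, and the collaborators here are hypothetical), the decision is just cache, then directory, then forward:

```ruby
class EdgeRouter
  CACHE_TTL = 300 # seconds; illustrative

  def initialize(cache:, directory:)
    @cache = cache          # local, per-edge-location cache
    @directory = directory  # client for the globally replicated lookup table
  end

  # Returns the stack that should process this request.
  def stack_for(request)
    routing_key = extract_routing_key(request)
    @cache.fetch(routing_key, expires_in: CACHE_TTL) do
      @directory.find_stack(routing_key) # the ~7% cache-miss path
    end
  end

  private

  def extract_routing_key(request)
    # API key, object token, or session, depending on the endpoint.
    request.headers["Authorization"] || request.params["token"]
  end
end
```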
This time, that's to a stack in Europe, where the request will be processed. All in all, this approach allows us to introduce this architecture change with zero modifications to customer implementations. No SDK updates, no endpoint changes, no new headers, just a cleaner, more scalable back-end infrastructure that works exactly the same from the outside. That kind of seamless evolution is hard to pull off, but when it works, it's one of the clearest signs that your abstractions are holding up.
While these stories have been about how we've managed the last seven years, scaling Rails, taming complexity, and evolving our architecture, they're really about something bigger. And we've learned a few lessons along the way. Complexity is inevitable, but if you're intentional, you can choose where that complexity lives. Rails gives us great defaults, and we've embraced them. But we've also learned not to treat those defaults as constraints. Simplify where you can, scale where you must, and don't be afraid to step outside the box when necessary. I appreciate you all joining me today to hear some of our war stories, and I hope that some of these lessons help you on your own journey scaling Rails, whether you're just starting out or deep in the trenches.
On a final note, this is the Persona team we have at RailsConf this week. You'll be able to find us at the Persona booth on the floor above, in the Liberty foyer, or around at sessions. We'd love to chat with you. We are hiring. Thank you very much.