Beyond the Hype: Practical Lessons in Long-Term Rails

00:00:30.000 Okay, let's do this. First of all, thanks to the organizers for having me. If you've seen my previous talks, you know that I am committed to this style; I do a lot of animation. So, if you have any sickness problems or epileptic issues, please be careful. No, I'm just kidding. Also, I'm going to go beyond 30 minutes, so heads up. I'm Spanish, so I speak a lot.

00:00:49.360 My name is Julia López or Julia, whatever you prefer. I'm from Barcelona, and I've been doing Rails since 2011. I love doing the so-called "invisible work," the kind of work that makes other developers' lives easier and improves the experience in the background without affecting customer functionality. It's also called unrecognized work, as Jeremy said. Developers usually notice and say, "Oh, thank you for doing that!" That's good, but sometimes bosses don't see it because they are not on GitHub, and that is okay too.

00:01:20.240 I like talks where people share their experience working in real-world scenarios, experiences that I can relate to or that validate my thoughts. I also appreciate talks where I can immediately apply what I learned at my job. So, I hope you get something out of this. My biggest takeaway after prepping this talk for weeks and thinking about it is that I will misspell the word "maintenance" every single time. So, thank you to autocorrect for helping me out with that; I still don't know how to write it correctly!

00:01:59.600 I work for Harvest, and I've been there for nine years. If you don't know, Harvest is a time-tracking tool that has been built in Rails since its very beginning in 2006. It's not actually 20 years old, but saying so has a better impact. We've always been a small engineering team. Currently, we have 33 people in engineering, with 19 of those being developers. For me, that's small, but you might think of this as a midsize company. In total, we are fewer than 100 people.

00:02:31.520 First, I want to thank the people who came before me for setting up everything that I'm going to show today. I didn't write the code that I will show, but I had to do a lot of digging to understand how we got to this point, which is a skill we all need in order to maintain code properly. You usually maintain code written by other people or, worse yet, your own code from six months ago, and that code is always horrible! Depending on the size of your engineering team, this may or may not apply to you, but maybe you found a way to fix something and could share your experience with me.

00:03:10.080 So, this talk is about maintenance in the long term. We have to remind ourselves how it all started back in 2006. This picture is the earliest I could find on the web archive. It started as an Excel file, as most SaaS companies did back in the day. When Excel doesn't do everything you need, you try to create an application out of that.

00:03:29.440 The first commit I pulled was from 19 years ago, when the source code of dependencies lived within the repository itself. This guy, D, I never met him; he left the company three years before I joined. I asked for information to have his name here, but his name is all over the codebase. If you run git blame, you will find his name somewhere, right? We are still maintaining his code 19 years later.

00:03:52.400 I know there is a previous version in SVN that I could not find, but rumor has it that Harvest started in Rails before version 1.0, so it's probably one of the earliest Rails applications in production. What is this talk about? I will show you how we operate at Harvest to keep up with the maintenance of a 20-year-old application. Our processes change all the time as the team grows, as leadership changes, and as our business needs evolve.

00:04:39.920 One of my colleagues said on a PR, "There isn't actually a carved-in-stone policy or anything for a lot of things." Even though we have more developers now, for some things you just do whatever you think is best. We didn't even have linting with RuboCop until a couple of years ago. So, I imagine we were surviving with inconsistent Ruby code for 16 or 17 years. I'm going to try to breeze through the basic concepts and then dive deep into what I believe is an interesting gem that we use for deploying with confidence.

00:05:17.000 So why did I call this talk "Beyond the Hype"? Apart from the fact that I asked ChatGPT, "What do you think?" and it gave me "Beyond the Hype." If you, like me, attend Rails conferences and Ruby conferences (not this one, actually; this one was pretty tame about it), it's all about Hotwire, Turbo, Solid, all the things. I sit there thinking, "Great! I love it; I'm hyped too! The Rails ecosystem is great, I love it!" But how many of you get to work with new Rails features every day? Right? I don't, because I work eight hours (or more) every day, and afterwards, I want to do something else.

00:06:59.200 To put this in context, if you're familiar with Rails features like Active Storage, you know it was introduced in 2018, but we still use Paperclip. Other people are maintaining a fork of it, and we're very grateful. Active Job was introduced in Rails 4.2, but we're currently using Sidekiq workers, which work for us. We have had no reason to switch to Active Job, and I've actually asked Russa if she could provide a reason for us to make the switch. Thus far, we are doing great with our current setup! On a different note, there are gems that we used in the past to connect to multiple databases, but now we use Active Record's native support, and you can see my talk about that; it's a free advertisement.

00:07:52.640 Additionally, we were using another gem for our WebSockets implementation, but we have used Action Cable since Rails 5. These are just a few examples showing our belief in removing dependencies and using out-of-the-box native features when we can. But when you work in a company, you need to juggle a lot of competing priorities, which makes maintenance important. So what is maintenance, and what does maintenance look like for the long term? My question here is: who is working on an app that is 1 year old, 5 years old, or 10 years old?

00:08:41.079 A lot of people here are doing maintenance for the long term, right? Because you don't code the same way for the long term as you would for a one-off situation. If you saw the scripts that I run in the Rails console for specific tasks, those are horrible! I never share those. Or I tell people, "Look, I did this script; it's very easy!"—simple things like that which you can quickly prototype. So what is the definition of maintenance?

00:09:28.040 The definition is the work needed to keep something in good condition. However, when it comes to software, it goes beyond just keeping the app running at any cost. It also involves implementing new features and fixing user-facing bugs, because if your customers complain and churn, you risk losing revenue. You also need to patch security issues, address known vulnerabilities, and keep up with security exploits. Ensuring that your dependencies are updated is crucial. Furthermore, you must deal with last-minute deprecations that might cause integrations to stop working abruptly. For example, QuickBooks might announce that they are deprecating an API that you integrate with, often with very little notice.

00:10:36.400 It's also about refactoring—prepping your code before you add new features, making it more readable, using modern patterns, and ensuring all dependencies are updated. You want to avoid ugly code. In my personal opinion, though it may not be popular, I believe in refactoring for the sake of it; I genuinely enjoy doing it. There have been times when I broke things while refactoring, but I think it was always worth it to get the codebase cleaner.

00:11:21.760 I see refactoring as an incredible opportunity to dive deep into an old codebase. Whenever I find something ugly, I think to myself, "I'm going to use my spare time to fix that," because you always get to the darkest corners of the application and go down the rabbit holes. Those are not the best places to be, but they are always interesting. It's also about deleting dead code, which is the best thing in the world: removing code that is no longer used. You don't want to leave it there.

00:12:27.560 Proactive management of dependencies is essential for avoiding deprecated libraries. Stripe recently released a new API version, and I wanted to be proactive by updating to the latest feature set. I say "management" instead of just "upgrading" because it can also mean removing dependencies. If a newer version of something you already use lets you remove a dependency, then you should do it. Rails frequently adds features that work out of the box.

00:13:32.080 Another important aspect is enhancing your test suite. This means not only adding coverage but also removing unnecessary tests. It's a controversial topic since many people don't want to remove tests, but you must take care of your CI system. Constantly monitor it to ensure that it doesn't have flaky tests and that it isn't becoming overly slow due to added tests. You need to keep a constant eye on the CI; otherwise, developers might feel the pressure because time is money.
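
One way to keep that eye on CI is to look for tests that both passed and failed on the same commit, which is a common sign of flakiness. The following is only a minimal sketch in plain Ruby, not Harvest's tooling: the `TestRun` struct, the `flaky_tests` helper, and the sample data are all hypothetical, and a real setup would read results from the CI provider.

```ruby
# Hypothetical record of one test result from one CI run.
TestRun = Struct.new(:test_name, :commit_sha, :passed, keyword_init: true)

# Returns the sorted names of tests that both passed and failed on the same
# commit. The code did not change between those runs, so differing outcomes
# suggest the test itself is flaky.
def flaky_tests(runs)
  runs
    .group_by { |run| [run.test_name, run.commit_sha] }
    .select { |_key, group| group.map(&:passed).uniq.size > 1 }
    .keys
    .map(&:first)
    .uniq
    .sort
end

# Made-up sample data for illustration only.
sample_runs = [
  TestRun.new(test_name: "InvoiceTest#test_total", commit_sha: "abc123", passed: true),
  TestRun.new(test_name: "InvoiceTest#test_total", commit_sha: "abc123", passed: false),
  TestRun.new(test_name: "TimerTest#test_start", commit_sha: "abc123", passed: true),
  TestRun.new(test_name: "TimerTest#test_start", commit_sha: "def456", passed: false)
]

puts flaky_tests(sample_runs).inspect
```

In this sample, only the invoice test is reported: the timer test failed on a different commit, so a real code change could explain that failure.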

00:14:09.680 Doing research and discovery, as Jeremy said, is crucial. You need to have a sense of what needs to be taken care of before you start implementing; then you can advocate for time to be allotted to those maintenance tasks. It's the engineering team's responsibility to encourage doing things the right way to maintain the codebase. Engineering managers may demand tighter deadlines for product launches, but they need to understand the complexity behind refactoring and maintenance. Sometimes, it's exhausting to explain that maintenance matters.

00:15:49.960 The monitoring part of maintenance goes beyond engineering. Apart from checking error-tracking tools and keeping them manageable, you must also keep an eye out for user requests and complaints. Understanding users' pain points helps you build better, future-proof code. It's vital to establish strong communication with product owners and understand where issues are coming from, so you can anticipate future demands and improve the system ahead of time.

00:17:04.880 It's also important to improve performance before it becomes a problem. If you can detect performance issues before users enter exponentially large amounts of data, that’s ideal. It involves making a proactive effort rather than dealing with massive issues when they arise. Furthermore, sharing knowledge through good PR descriptions improves onboarding for junior developers. Comments in the code must be clear even if code is expected to be self-documenting, and documentation for other stakeholders is essential.

00:18:14.440 Knowing when to let things go is equally vital. Whether you're deep in rabbit holes or side quests, managing time effectively can be a challenge; I often find myself wishing I had unlimited time to fix everything. When working on older applications, you often identify numerous areas for improvement that you would like to pursue.

00:19:43.760 So why is maintenance important? All those tasks that I mentioned, and probably more, are crucial. It is easy to think of the usual reasons, such as users frustrated by flaky apps, stressed developers trying to deploy, and management frustrated because we are not iterating fast enough. Much of this friction is due to code complexity, and we're frequently channeling our energy into getting features to users. However, we must remember the long-term implications of decisions made today.

00:20:33.760 Technical debt is inevitable, especially after 20 years. The decisions you make today that seem solid may not apply 20 years from now when the application evolves. You will need to maintain new code structures alongside old code that still performs its function but may not align with recent best practices. We need to emphasize maintainability and create workflows that accommodate growth and changes.

00:21:41.480 I mention workflows because you don't want to be woken up by PagerDuty alerts. You want to feel free to deploy on Fridays and enjoy your weekend. Ideally, we want to ship new features at a reasonable speed and apply security patches as soon as they are released, keeping our dependencies up-to-date while taking advantage of the newest trends in technology to boost productivity.

00:22:50.680 At Harvest, we usually have a pretty good working environment. We have a skilled team that is competent in managing incidents, which helps keep things running smoothly. Now let's talk about how we approach maintenance throughout development phases. Who manages maintenance at Harvest? Is there a dedicated team responsible for general maintenance tasks, or is it the responsibility of engineering managers to prioritize these tasks? Or could there be rogue developers like me taking care of side quests?

00:23:59.960 It is a shared responsibility, clearly. We’re a small team. Maintenance cannot depend solely on one team or one person. We do not shut down engineering work or feature development to strictly address maintenance tasks. However, you may know of this rule called the "Boy Scout Rule," which suggests leaving the code better than you found it with every commit. Even a minimal improvement can help bring instant gratification.

00:24:54.760 This practice can help improve the code's quality over time while maintaining the value we deliver to our users. It’s also important to be intentional about maintenance; during project management, account for the time needed to address these issues.

00:25:55.720 For instance, you need to factor in that if you're going into a pretty old codebase, you may require extra time. We aren't particularly good at estimating. It's well-known. I learned the hard way. You might think you could implement a feature in a day, but as you delve deeper into that codebase, the complexity will likely delay your timeline.

00:26:46.640 As engineers, we should help product managers define features to ensure their feasibility. When a PM asks me whether something is possible, I usually respond, "Everything is possible!" The catch is that, while everything is possible, complexity and limited resources can sometimes make the task unmanageable.

00:27:32.240 Once project estimates are set, how do we manage maintenance during the implementation phase? It starts with proactively updating dependencies. For example, if I'm working on a Stripe integration, I will update the Stripe gem as soon as I start working on a new feature, unless there's an overwhelming API change that would take a week to implement.

00:28:26.840 It's important to remember to use the latest patterns when implementing new features. I often remind new hires that, if a piece of code is three years old, it might be outdated, and they should check whether there might be a better way to accomplish the functionality. Also, always investigate the methods surrounding the code they’re working on; there could be opportunities to clean up redundant or outdated code as well.

00:29:12.680 After implementing the new features, the process continues with pull requests and code reviews, where maintenance also plays a key role. Try to keep your cleanups and refactors separated. Ensure your PR descriptions are complete with all necessary information about your changes so that when someone revisits that PR months or years later, everything they need is clearly laid out in the description.

00:30:16.520 Even if you believe you have everything fresh in your mind, it's easy to forget details months later. While preparing for this talk, I came across many complicated changes whose vital links to project management tools were no longer of any use. A solid practice is to be verbose in your descriptions, allowing both you and any future developers to understand the changes made.

00:31:28.360 At Harvest, we have a minimum number of required reviewers, ensuring that no changes can go to the main branch without approval. We advocate for maintainability during reviews. This means going beyond just what is visible in the diff; for example, if a team worked on Slack integration, did they update the Slack gem? If not, it’s worth mentioning that.

00:32:17.960 It’s also imperative to ensure that tests are kept up-to-date. I sometimes rush through tests, and various improvement areas in the test suite get overlooked. We should try to be thorough and empathetic during pull requests.

00:33:21.360 Every developer is welcome to participate in pull requests, providing feedback on past implementations or areas where improvements have been made. Keeping an eye on all pull requests helps maintain a holistic understanding of the codebase. Development squads at Harvest typically consist of two or three developers, a PM, a designer, a QA, and an engineering manager; the composition often changes, which makes it challenging for developers to stay familiar with every aspect over time.

00:34:07.920 After a pull request is approved, regular quality assurance testing is done in-house. This isn’t truly a mandatory step since engineers can opt to ship directly if they feel confident, but quality assurance representatives are also part of the squad. They're responsible for refining features, ensuring usability, and checking all edge cases, which can often be overlooked because engineers may not think of how users may interact with the application. They help check that features work well in terms of UX, performance, and proper documentation, covering a lot of ground.

00:35:23.680 Once QA approves, it's our job to deploy, ensuring that no regressions were introduced and following up on any new changes made based on observations in production. Now, what happens to features that we are not currently focused on? They still require active maintenance; everything needs upkeep, especially since users rely on various features. We have a team called Delta Force that is particularly proficient in handling this.

00:36:36.560 This team has been present for most of my journey at Harvest, and while it was absent for a time, it returned, and I was thrilled. The engineers rotate every sprint, and they're responsible for addressing bugs and one-off requests, and even bulk updates for users, which is inevitable in a 20-year-old application. Generally, the documentation is the code. Since the code itself is often dense, engineers need to know how to read and interpret it.

00:37:37.520 Requests may come through Delta Force when someone asks how a feature works or wants to understand a combination of things we cannot test in the production environment for various reasons. Aside from bug fixes, engineers in Delta Force often like to search for and implement low-hanging fruit improvements. The time spent improving existing code while handling other requests provides a significant opportunity for knowledge growth. It’s an incredibly valuable onboarding experience for anyone new to the company, tracing their way through everything we have.

00:39:09.600 However, maintenance goes beyond just Delta Force. In the last three years, we have formed a new team called Platform Development. Not only have our business needs grown, but our workforce has grown as well. We now operate a larger infrastructure that requires management, and we cannot depend on individuals to accomplish necessary upgrades solely in their spare time. Relying on just one engineer to introduce new technologies may lead to some challenges.

00:40:00.320 This team focuses on engineering productivity, setting up linting with RuboCop, and enhancing automations through CI/CD practices. They oversee observability, ensuring that our workflows align with our larger goals. Major upgrades, such as Rails upgrades or security patches, require more intervention and strategic planning. I am now part of this team because I enjoy diving into opportunities to enhance our systems.

00:41:12.560 We also hold exploratory days periodically, during which engineers can pursue projects they're interested in or work on personal learning goals. This ensures that development remains fulfilling and valuable. Initially, as a company, we were not strong at having engineering discussions, but we have improved. This includes frequent requests for comments (RFCs) on GitHub, interactive discussions during monthly meetings among senior and lead engineers, and more opportunities to brainstorm together as we introduce new ideas.

00:42:05.920 Recently, we've started integrating AI into our workflows, wanting to remain competitive. I even tried to bring AI into this talk, but it did not go well. I tried various tools, but none met my expectations; perhaps I need to revise how I write my prompts or which tools I use. But let's move on.

00:42:55.840 I want to discuss a couple of key topics now: refactoring, rewriting, and how we decide on enhancements amid accumulated technical debt. Some time ago, we were just jumping into the coding phase without any exploratory work; now we run agile spikes to explore unknowns and address uncertainties related to building features.

00:43:38.480 Before implementing new features, we carry out time-boxed explorations to avoid going off track. During these discovery phases, we deliver proofs of concept and findings that allow us to plan and estimate better for the future. While the process might seem longer, it can save valuable time by preventing pointless side quests down the line.

00:44:34.200 Ultimately, decisions to rewrite full features are made only after thorough consideration. For instance, we spent time evaluating our billing system and decided to pursue a full rewrite based on six weeks' worth of research to enhance our functionality. We also conduct maintenance based on our dependency management. I personally enjoy the satisfaction of performing a bundle update—there’s something gratifying about it!

00:45:23.440 In 2022, we recognized that we needed to take managing dependency upgrades more seriously. So, we began using a tool called Renovate, which adapts to our process rather than the other way around. It assists us in systematically organizing upgrades, and we’ve seen substantial stability improvements thanks to it.

00:46:28.760 However, we still face challenges when it comes to security. For security-related updates, we still utilize Dependabot because of its specificity to security needs. The process of managing dependencies can be complex and evolving, but utilizing tools like Renovate has helped streamline those efforts. Sharing knowledge through caring for dependencies allows developers to catch what others may overlook.

00:47:29.560 Moving forward to production involves deploying everything—new updates, dependency changes, etc.—but before anything, we must have trustworthy tests. Without reliable tests, everything falls apart; a thorough test suite needs to be established to maintain quality. We trust our testing suite to hold its ground.

00:48:24.760 This ensures that manual quality assurance work complements our automated tests as we deploy and follow up in production. Monitoring performance is essential; we use tools like Datadog to assess application health, because you must verify that your applications are actually working correctly.

00:49:10.360 We adopt a CI/CD approach—merging small pull requests to maintain smooth deployments. Automating these systems over time has allowed us to be confident in our production changes. We use feature flags custom-built for our applications, ensuring we don’t run long-lived branches before deployments. Feature flags enable us to easily toggle functionalities without major upheavals.

00:50:09.920 Establishing a rollback strategy for any major deployment is critical. Feature flags allow you to turn off problematic features. Reverting non-flagged changes might require a full deployment, but we try to contain any potential damage quickly. Staging environments that resemble production help us understand the impact of deployments before going live, and through thorough planning, we ensure everything goes smoothly during releases.
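
To make the flag-based rollback concrete, here is a minimal sketch of a per-account feature flag with a kill switch. It is not Harvest's custom implementation, which the talk does not show; the `FeatureFlags` class, the `:new_billing` flag, and the `invoice_summary` helper are hypothetical names used only for illustration, and the store is in-memory where a real application would persist flags.

```ruby
require "set"

# Hypothetical in-memory flag store. A real application would persist flags
# (for example in a database or cache) so that they survive restarts.
class FeatureFlags
  def initialize
    @accounts_by_flag = {}
  end

  # Turns a flag on for one account, allowing a gradual rollout.
  def enable(flag, account_id)
    (@accounts_by_flag[flag] ||= Set.new) << account_id
  end

  # Kill switch: turns the flag off for every account without a deploy.
  def disable(flag)
    @accounts_by_flag.delete(flag)
  end

  def enabled?(flag, account_id)
    @accounts_by_flag.fetch(flag, Set.new).include?(account_id)
  end
end

# Both code paths ship to production; the flag decides which one runs.
def invoice_summary(flags, account_id)
  if flags.enabled?(:new_billing, account_id)
    "new billing flow"
  else
    "legacy billing flow"
  end
end

flags = FeatureFlags.new
flags.enable(:new_billing, 42)
puts invoice_summary(flags, 42) # flagged account takes the new path
puts invoice_summary(flags, 7)  # everyone else stays on the old path

flags.disable(:new_billing)     # something went wrong: roll back instantly
puts invoice_summary(flags, 42)
```

Because the old path stays in the codebase until the flag is removed, turning the flag off acts as the rollback, and the branch that introduced the feature can be merged early instead of living for weeks.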