Summarized using AI

Zero Downtime Deploys Made Easy

Matt Duncan • November 01, 2012 • Denver, Colorado • Talk

In the talk titled "Zero Downtime Deploys Made Easy," Matt Duncan discusses the challenges and strategies associated with deploying updates in web applications while avoiding downtime. Presenting at RubyConf 2012, Duncan focuses on several key areas relevant to Rails development, specifically emphasizing database migrations, background worker updates, and interactions with external services. Here’s a concise breakdown of the main themes and insights:

  • Introduction to Zero Downtime Deploys:

    • Matt Duncan introduces his affiliation with Yammer and acknowledges the complexity of deploying without introducing downtime. He reveals that while he hopes to provide solutions, there are no simple answers to this challenge.
  • Database Migrations:

    • Database changes can often lock tables, which can lead to extended downtime. Duncan illustrates this problem with an example of adding an admin column with a NOT NULL constraint and a default value, which locks the users table while every row of a large table is updated and leaves the site hung.
    • To mitigate downtime during migrations, he suggests several strategies:
    • Allow null values initially to bypass lengthy updates, delay large migrations to off-peak times, and consider database capabilities.
    • Implement gradual updates by executing separate migrations first and running long processes (like data updates) in the background without user impact.
  • Handling Background Workers:

    • Queues should be assumed to contain jobs from the previous code whenever a deploy happens. When the message format changes, workers should handle both the old and the new format until the old jobs have drained, ensuring backward compatibility.
  • External Services Management:

    • When deploying new services, Duncan emphasizes the importance of versioning APIs to handle simultaneous requests from different application versions effectively. He encourages gradual roll-outs to maintain stability and minimize issues during deployments.
  • Best Practices and Conclusions:

    • The importance of having distinct processes for application and database migrations is discussed. He advocates for a push-button deployment approach, utilizing an automated and strategic rollout process to minimize user disruptions.
    • Duncan concludes his talk by inviting discussions on how to improve migration strategies and enhance developers’ experiences while deploying applications.

Throughout the session, Duncan highlights the need for comprehensive planning and foresight when deploying changes, illustrating the complexities of database schema changes, background jobs, and working with external services. The emphasis is on maintaining operational integrity without sacrificing user experience, which involves strategic planning and careful execution.

In summary, this talk serves as a valuable resource for developers looking to enhance their knowledge of deployment strategies, particularly in Ruby on Rails environments.


Date: November 01, 2012
Published: March 19, 2013
Announced: unknown

Every deploy introduces the risk of downtime because of changes which are not backwards compatible. At Yammer, we deploy our Rails codebase to hundreds of servers many times a week. In this talk, I'll discuss many of the strategies we use to mitigate downtime during deploys. This includes how we handle database changes as well as background workers and external services.

RubyConf 2012

00:00:15.280 perfect uh everybody um I'm gonna be talking about thanks
00:00:20.600 whoever said that thank you uh I'm gonna be talking about zero downtime deploys and how to make them easy uh so real
00:00:26.960 quick I'm Matt Duncan I work on the rails team at Yammer uh I also have a huge confession to make
00:00:32.599 which is that this is not really a simple problem at all and there's really not a super easy way to do this uh but
00:00:39.280 it turns out if you say there is uh people will come to your talk um so really I'm just going to like depress
00:00:45.120 the hell out of everyone and uh show you the horrible workarounds and then
00:00:50.239 hopefully we can all kind of figure out a better solution so there is no Silver Bullet sorry um also this is like a
00:00:58.320 really broad topic so I'm going to kind of focus on anything from your framework
00:01:03.760 down uh so I'm going to ignore web servers I'm going to ignore your network stack anything beyond that uh I'll
00:01:09.840 assume that you know how to deploy those uh without downtime you may or may not
00:01:16.680 um also this isn't just a problem with rails uh I'm going to be kind of talking specifically about rails uh just because
00:01:22.119 it's probably most familiar to everyone um and in addition active record but
00:01:27.759 it's really a problem with uh pretty much any framework um so yeah uh three
00:01:34.920 topics uh in general which is how to do database migrations uh how to make
00:01:40.799 changes to your database while not taking your site down um how to make changes to your async workers um stuff
00:01:48.520 running in the background how to make changes to the way you queue things up um
00:01:53.640 and also how to handle external services so basically anything other than your database that is
00:02:00.520 uh used by your application so first let's start with the database uh it's kind of probably
00:02:06.600 the most familiar to everyone um and we're going to walk through a simple example um let's say we have this
00:02:14.160 awesome site it has tons of users and we've decided that we want to be able to make users administrators uh we want
00:02:20.560 some of our users to see different features than others um so that's easy right we'll just make a new migration
00:02:27.120 we'll add an admin column to the table uh it'll default to false obviously because
00:02:33.080 we don't want everyone to be an administrator and we will make it not null because being null it doesn't really
00:02:39.280 make sense in this case so uh we'll push that code out we will run our
00:02:45.560 migration and we're going to notice a couple things immediately one is that that migration is taking a really long
00:02:52.159 time to run uh that's bad also uh if we have really good monitoring we're going
00:02:57.840 to notice something else which is that our site is actually down now um so
00:03:03.040 what's what's going on what's Happening um well all of our web processes are
00:03:08.319 just hung right now uh they are stuck so let's hop over to the database uh this
00:03:13.760 is how you do it in Postgres there's similar ways to do it in basically any database you're using um but this
00:03:20.840 basically will check to see what has granted locks on tables uh granted
00:03:26.400 exclusive locks so any lock which will not allow reads or writes now obviously
00:03:32.959 uh our migration is actually still running because it's just taking forever and we're going to see that the uh the
00:03:40.439 uh table change that we were making is the command that has the users table
00:03:46.080 locked so why did that happen well migrations are transactional
00:03:51.319 um in this case we're actually doing two things even though it looked like we were only doing one um we're doing one
00:03:57.319 thing which is really fast and really easy and we're doing one thing which is not as easy necessarily um all in that
00:04:04.439 one command so what the database actually needs to do is it needs to add the new column that's easy that's fast
00:04:12.040 uh if it's not you should find a new database probably um the other thing is it needs to go through every single Row
00:04:18.639 in the database and update it uh because we said it can't be null here's the
00:04:23.919 default so it needs to go through and write that default value in every single row um so the larger your table the
00:04:30.440 longer it's going to take um now how do we how do we get around this
00:04:37.120 well you could probably just do it at off peak times right uh let's wait until we have less users on the site uh so
00:04:43.960 we'll do it like Friday night late um your traffic graph if you have a
00:04:49.120 reasonably popular site though probably looks a little bit like this uh that big peak in the center is probably weekday
00:04:55.320 traffic those dips are weekend traffic uh if you have a different type of site it may be reversed where you get more
00:05:01.360 traffic on the weekends in any case notice how that Valley doesn't quite touch zero there uh it gets lower for
00:05:08.560 sure but it never actually touches zero uh so anytime you do this you're going
00:05:13.720 to affect real users um so there's a trade-off involved here right like we
00:05:20.080 can do it uh while people are while not as many people are using our site but
00:05:26.120 we're still going to affect them that may be okay because it was really easy to write that simple migration and run it
00:05:32.520 um the other thing we can do is just get a faster database right throw some money at the problem that's that's always a
00:05:38.880 simple solution uh that'll help right um that means we can actually do more of
00:05:44.160 those updates uh during that transaction before users start to actually notice eventually though you're going to
00:05:51.600 be working with tables that are hundreds of millions or billions of records and just throwing money at the problem is
00:05:59.000 not really feasible uh eventually you have to start throwing money at people who need to come in and
00:06:05.880 do weird things and it's yeah weird things happen um so let's walk through
00:06:11.360 how we could actually do this uh without throwing money at the problem we'll just throw a little bit of time at the problem
00:06:17.120 instead uh so in this case we'll do almost the exact same thing in our migration uh we're going to make one
00:06:23.720 simple change which is we're going to allow null values so this lets us bypass that entire hard part of the uh table
00:06:31.639 change and just do the fast part which is adding that new column so right now
00:06:37.880 every single record in that uh table is going to be null that's fine we're not using it yet uh the next thing we're
00:06:44.000 going to do is write a quick task that will do the hard part for us so the
00:06:49.319 reason we're doing this is so that we can uh basically use some very small locks while we're uh doing the updates
00:06:57.400 so in this case we're just going to update single record uh which has a null value for the admin and just change it
00:07:04.319 to false um and this can run behind the scenes you can let it run for as long as it takes it doesn't matter it's not
00:07:10.680 going to cause any significant load on your database uh you're not going to even notice it's running probably uh
00:07:16.960 unless you have a really crappy database in which case throw a little money at the problem um so we'll push that code out
00:07:24.120 we'll run our migration uh this time we'll notice something awesome which is that it ran really really fast that's
00:07:30.960 good uh then we'll go ahead and kick off our task can run that behind the scenes
00:07:37.240 um you can start it up in a screen session on one of your servers and just let it run um you can also get more
00:07:44.159 creative with it uh for example at Yammer we actually have some tools which let us uh parallelize these types of
00:07:50.240 things so that we can run multiple at the same time really easily um and
00:07:55.319 really simply uh so once that's done we can actually go back and change our table back to uh having that non-null
00:08:02.360 constraint so this time all we need to do uh or all the database needs to do is just verify it needs to do a basically a
00:08:08.960 quick table scan to make sure that there aren't any null values uh if there are it'll update them to the default if
00:08:15.520 there aren't it's done basically um so it's super quick uh it's basically as fast as your database can do a
00:08:21.479 sequential scan so again we'll push that code up we'll run our migration and awesome site
00:08:29.720 still up also wow that was a lot of work right uh turns out this is actually just
00:08:35.719 the tip of the iceberg um but like I said uh this talk is all about trade-offs um in a lot of cases it's
00:08:43.680 going to be worth the effort to do all of that work because it will keep your site up in a lot of other cases it's
00:08:49.519 not going to be because as you can see it was a huge pain in the ass um so yeah
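The backfill step from the walkthrough above (add the column as nullable, fill in the default in small pieces, then add the non-null constraint) can be sketched roughly as follows. This is a minimal simulation, not the task used at Yammer: the `users` array is a hypothetical in-memory stand-in for the table, `nil` stands for SQL NULL, and `backfill_admin` and `batch_size` are illustrative names. In a real application each pass would be one small `UPDATE` against the database.

```ruby
# Simulated users table: each row is a Hash; admin: nil stands for SQL NULL.
# (Hypothetical in-memory stand-in so the sketch runs without a database.)
users = Array.new(10) { |i| { id: i + 1, admin: nil } }
users[2][:admin] = true # a row that already has a value must be left alone

# Update at most `batch_size` NULL rows per pass, so that each pass would
# hold only short-lived row locks instead of one long table-wide lock.
def backfill_admin(rows, batch_size:)
  passes = 0
  loop do
    batch = rows.select { |row| row[:admin].nil? }.first(batch_size)
    break if batch.empty?

    batch.each { |row| row[:admin] = false }
    passes += 1
    # A real task might sleep briefly here to keep database load low.
  end
  passes
end

passes = backfill_admin(users, batch_size: 4)
puts "backfilled in #{passes} passes"
```

Only once no NULL rows remain does it make sense to run the final migration that adds the non-null constraint, since at that point the database only has to verify the column rather than rewrite it.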
00:08:55.839 uh migrations are also not the only place that can cause these issues uh this is actually apparently the iceberg
00:09:02.880 that is thought to have sunk the Titanic uh very innocent yet uh does big damage
00:09:10.920 um so yeah be careful um so so let's kind of walk through the rest of the
00:09:16.279 stuff that can happen inside the database um as we were just talking about long database locks are a big
00:09:23.240 problem um they can happen in these are the two biggest cases where they'll
00:09:28.360 happen uh when you are adding non-null constraints with default values to a table um also when you're adding indexes
00:09:34.920 uh indexes need to lock the table in most cases uh if you're using Postgres create the index concurrently
00:09:41.839 uh it'll run behind the scenes and then it'll just switch into being used if you're using MySQL just switch to
00:09:49.200 Postgres uh or there are tools that you can use but they're probably harder to use than
00:09:55.680 switching to Postgres so uh yeah anyway um the other case is out of sync
00:10:01.600 schema so this is actually an interesting one because uh what happens
00:10:08.680 is your application thinks that you have one schema and your database knows that
00:10:13.920 you have another schema um so how many of you ever seen an error like this before anyone no one thank you a lot of
00:10:21.600 you actually awesome um yeah so so what happens here well we have removed a
00:10:27.399 column from our database but our application is still doing this it's still sending it along uh so why is that
00:10:34.120 happening well when active record loads a model uh it asks the database for your
00:10:39.200 schema right um that's why you don't have to specify the schema inside of every model that you write uh you know
00:10:46.880 don't repeat yourself right um the problem with that is uh if the schema
00:10:52.440 changes active record doesn't actually go through and update them it would have to pull or something and that's just
00:10:58.560 kind of painful um so yeah uh that's that's one
00:11:05.399 problem um here's here's kind of the most common cases uh renaming renaming columns renaming tables uh try to avoid
00:11:13.399 renaming tables it's probably not worth the effort um and removing columns removing columns is uh obviously the
00:11:20.200 most common one um you have a column that you don't really need anymore you don't want to leave it around because
00:11:25.800 then it just stays forever basically um so let's walk through the process of getting rid of that column
00:11:33.000 without again taking the site down uh so three steps uh we're obviously going to
00:11:38.959 start off initially writing to that column uh so next step is to stop
00:11:44.680 writing to the column uh how do we do that well first thing we need to do is tell the database that it's cool if we
00:11:50.720 don't write to the column so tell the database that it can have null values in the
00:11:56.560 column uh the next thing we need to do is actually tell active record to ignore
00:12:02.200 that column uh so it's relatively simple um which is we just override the or
00:12:08.920 Define the uh columns method here and just ignore the column that we're getting rid of so when active record
00:12:16.399 loads the schema for that table uh it's going to just skip that
00:12:21.639 column and it's as if it never actually existed um and this is all before we run our migration so now we
00:12:29.639 want to remove the admin column we will remove the admin column um oh yeah and
00:12:35.440 then we also need to go back and clean up all of the stuff that we just added to our users model um so all
00:12:44.199 of all of this code here uh we can actually just get rid of because we don't need it
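The "define the columns method and ignore the column" step described above might look something like the sketch below. `FakeRecord` and `Column` are hypothetical stand-ins written so the example runs on its own; they are not ActiveRecord's real classes, and the slide code from the talk is not reproduced in the transcript. Later Rails releases added an `ignored_columns` setting on models for this purpose, which is likely the better choice where it is available.

```ruby
# Hypothetical stand-ins for a schema-loading base class; these are NOT
# ActiveRecord's real classes, just enough to make the sketch runnable.
Column = Struct.new(:name)

class FakeRecord
  # Pretend this list was read from the database when the model loaded.
  def self.columns
    [Column.new("id"), Column.new("email"), Column.new("admin")]
  end

  def self.column_names
    columns.map(&:name)
  end
end

class User < FakeRecord
  # Hide the column that is about to be dropped, so the application never
  # reads or writes it. Delete this override once the column is gone.
  def self.columns
    super.reject { |column| column.name == "admin" }
  end
end

puts User.column_names.inspect
```

The ordering matters: the code that hides the column ships first, the migration that drops the column runs second, and the override is cleaned up in a later deploy.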
00:12:50.320 anymore uh oh yeah uh your and couch and Lotus Notes won't solve these
00:12:56.680 problems either um this they will like let you think about data
00:13:02.079 and schemas in different ways um and that may potentially be useful actually um but they're definitely not going to
00:13:07.480 solve these problems uh I didn't have enough time to go into um the same
00:13:12.600 problems with those uh but hopefully uh hopefully the lessons
00:13:18.600 transfer all right so moving right along uh background workers uh stuff that runs
00:13:24.279 behind the scenes stuff that you dump into a queue and then have jobs then you have workers pick up and work um
00:13:30.920 these are actually fairly simple uh if you just keep one thing in mind queues are not going to be empty um whenever you
00:13:37.920 deploy code whenever you deploy your workers uh the queues are going to have stuff in them uh just make that
00:13:44.240 assumption there are cases where they probably won't uh and you could purge them if you really wanted to but that's
00:13:50.519 probably a bad idea so just make the assumption that they won't be empty and that there will be jobs in there from
00:13:55.720 the previous code so uh if you're changing the format of your messages uh
00:14:01.079 so you're adding a new parameter make sure you handle the case where that parameter doesn't exist um and the param
00:14:07.759 and the case where the parameter does exist once all of those messages have flowed through and you know it's clear
00:14:14.199 uh you can then stop handling uh that previous case um again if you're getting
00:14:19.279 rid of uh workers so you have something that you don't really need anymore you probably just need to go purge the queue
00:14:25.040 or leave one of the workers around to kind of finish off running those processes
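A worker that tolerates both message formats, as described above, could be sketched like this. The worker class, the `locale` parameter, and the JSON message shape are all assumptions made for illustration; the talk does not name a queueing library or a specific job.

```ruby
require "json"

# Hypothetical worker: the new code enqueues a "locale" parameter that
# jobs enqueued by the previous deploy do not have.
class WelcomeEmailWorker
  DEFAULT_LOCALE = "en"

  def perform(raw_message)
    message = JSON.parse(raw_message)
    user_id = message.fetch("user_id")
    # Old-format jobs have no "locale" key; fall back instead of raising.
    locale = message.fetch("locale", DEFAULT_LOCALE)
    "welcome user #{user_id} in #{locale}"
  end
end

worker = WelcomeEmailWorker.new
old_job = JSON.generate({ "user_id" => 42 })                   # enqueued before the deploy
new_job = JSON.generate({ "user_id" => 42, "locale" => "fr" }) # enqueued after the deploy

puts worker.perform(old_job)
puts worker.perform(new_job)
```

Once every old-format message has flowed through the queue, the fallback can be removed in a later deploy.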
00:14:31.000 all right so so those were the those were the easy cases uh let's move on to the more interesting one which is
00:14:37.320 external Services um and I'm going to talk about Services inside of your
00:14:43.480 company but first I'm going to get some
00:14:48.680 water um so I'm going to talk about Services inside your company uh primarily because it's easier to
00:14:56.040 rationalize about them um you control their entire life cycle so uh first thing version them um
00:15:05.120 doesn't really matter how you do it uh you can use URLs you can use headers you can use whatever weird format you want
00:15:12.360 uh just make sure they're versioned um so let's let's walk through like a Ideal
00:15:18.279 World scenario and then let's shoot holes in it uh so the ideal World scenario is you have an application it's
00:15:25.600 using the first version of your API uh you deploy a new version and you start
00:15:30.839 writing to it you don't read from it yet but you start writing to it uh so the reason you would do that is so that you can actually just both you can do a lot
00:15:38.560 of things actually um you can start doing validations on the data uh to make sure that your new version is actually
00:15:45.440 doing what you expect it to be doing and that the writes match what the first version was doing uh you can also make
00:15:52.000 sure it can handle your production load um and you can do any backfilling of
00:15:57.399 data into that version if you need to uh let's say they may be potentially running on different uh data
00:16:04.160 sources so then eventually you can switch your uh write or your reads
00:16:09.480 excuse me um to your new version and you can continue writing to your first
00:16:15.000 version if you want to or not uh one of the nice things nice things about continuing to write to that version is
00:16:21.360 that you could always fall back to it if something goes catastrophically wrong um so it's kind of a little safety net uh
00:16:28.040 but eventually obviously you'll move off of it so uh one thing that you should have
00:16:35.639 uh in mind uh one thing that is super useful is the ability to uh switch these
00:16:42.440 things around uh to switch them on and off uh the reason for that is when you're deploying um things are actually
00:16:48.240 going to look a little more like this uh you're going to have some servers that are uh have old code still running um
00:16:55.800 this is mid deploy you're going to have old Ser or servers still have old code running uh which are reading from your
00:17:01.160 first version and some servers which have the new code deployed which are reading from your second version um so
00:17:09.640 having a switch in place that lets you more atomically uh make that transition
00:17:15.439 is really important um oh yeah and don't forget to deploy your services in the exact same
00:17:21.520 way that we've been deploying everything else because these same issues apply um
00:17:27.600 I've I've been talking a lot about uh your kind of main core application uh and how it interacts with uh the
00:17:34.840 database and other services but obviously this like flows all the way down into your services and their data
00:17:40.520 sources and then their services and their data sources and on and on um so yeah
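The write-to-both, read-from-one pattern with a runtime switch, as described above, might be sketched as follows. `FakeServiceClient`, `ProfileStore`, and `read_from` are hypothetical names, and the clients are in-memory stand-ins for what would really be network calls to two versions of an internal service.

```ruby
# Hypothetical in-memory clients standing in for versions 1 and 2 of an
# internal service; a real client would make network calls.
class FakeServiceClient
  def initialize
    @store = {}
  end

  def write(key, value)
    @store[key] = value
  end

  def read(key)
    @store[key]
  end
end

class ProfileStore
  # :v1 or :v2. In a real system this would come from shared configuration
  # so that every server flips at the same moment, without a deploy.
  attr_accessor :read_from

  def initialize(v1:, v2:)
    @clients = { v1: v1, v2: v2 }
    @read_from = :v1
  end

  # Write to both versions so v2 sees production load and v1 stays current
  # as a fallback.
  def write(key, value)
    @clients.each_value { |client| client.write(key, value) }
  end

  def read(key)
    @clients.fetch(@read_from).read(key)
  end
end

store = ProfileStore.new(v1: FakeServiceClient.new, v2: FakeServiceClient.new)
store.write("matt", "rails team")
store.read_from = :v2 # flip reads to the new version
puts store.read("matt")
store.read_from = :v1 # fall back if something goes wrong
puts store.read("matt")
```

Because writes keep going to both versions, flipping `read_from` back to `:v1` is safe at any point until the old version is retired.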
00:17:50.160 uh what what happens if you can't um well you know uh give yourself a way to
00:17:56.720 turn services off um the ability to just flip the switch on a service uh so let's
00:18:03.679 say for example your search um the ability to just remove that search bar um for you know 10 or 15 minutes while
00:18:11.400 you're deploying that new version you can actually just take that thing out of production uh do things nicely uh do
00:18:18.039 things quickly not have to do that whole migration dance that we just saw um and then just flip it back on again and
00:18:24.039 users may notice they may not um but your site isn't going to be down and your users aren't seeing errors they're just
00:18:29.640 not going to see the full features that they might have before so uh let's let's talk about uh
00:18:37.480 what we can learn from this um hopefully not everyone is super depressed yet um anything can go wrong uh there
00:18:46.520 are like so many ways that this can go wrong um I skipped a lot of them um so
00:18:52.520 for for example uh you know you roll out a new validation for users um
00:18:59.760 I have loaded the signup form enter in some data you deploy the site and then I
00:19:04.880 hit submit oh all of the valid data that I just submitted is now invalid because
00:19:10.559 the logic on that is totally changed uh so I see errors and I get really annoyed
00:19:16.080 and I never sign up um not everything is worth the effort though uh cases like that are rarely
00:19:24.240 worth the effort in handling uh sometimes they are uh maybe sign up as a
00:19:29.520 case where it is worth the effort uh I would argue that it's probably not worth
00:19:34.840 the effort to handle um the forms case uh in most
00:19:45.640 cases um also make your deploy simple and fast uh if you don't have a push button deploy you should do that uh this
00:19:54.640 is kind of what it looks like to deploy stuff at Yammer uh you literally pick what you want to deploy where you want
00:19:59.960 to deploy it and just hit deploy um and to fit into the last talk we have
00:20:06.400 metrics uh so we can actually see how many times things get deployed um and the cool thing is like the easier
00:20:13.280 something is to deploy the more likely you are to deploy it right um the other thing you should do
00:20:20.720 is separate your migrations from your deploys uh you should think of them as totally separate things uh deploying
00:20:27.440 migrations are basically database deploys uh deploying
00:20:32.559 your application is an application deploy uh you probably won't be deploying those at the same time you
00:20:38.080 probably shouldn't be deploying those at the same time um so I'm gonna get some more
00:20:45.360 water it's a very dry city
00:20:50.679 um so so the way we used to deploy Yammer uh when I first started uh and you can
00:20:56.720 tell the Yammer employees in the room because they will start laughing as I tell the story uh was that we all like
00:21:02.240 crammed into this room and it was really hot and really sweaty and we would drink
00:21:07.600 because we were really afraid that we were going to take the site down uh and we would run the migrations and then we'd
00:21:13.240 play a lot of really loud music and it looked a little like this and it was
00:21:18.320 really terrible um and we would basically frantically run the migrations and then deploy the site as quickly as
00:21:24.440 we could because we knew the site was probably down uh because of all the things that we just talked about um and
00:21:31.240 we probably all lost a few years off of our life due to stress um yeah also roll
00:21:36.919 out your services gradually um one of the things we do whenever we uh roll out a new service at Yammer is that we roll
00:21:44.360 out Services really slowly um we'll put maybe 5% of our traffic onto them and
00:21:49.840 then kind of bump that up to 10% and then if that looks good maybe 20% maybe 50% um this gives us the advantage of
00:21:58.679 kind of forcing us to think of the graceful uh scenario when we need to
00:22:04.880 degrade um and it gives us the ability to turn things off if we need to um so
00:22:11.600 this this is an example that I found uh we we have our own internal tool but
00:22:16.720 this one looked pretty awesome uh it has exactly what you need uh in a tool like this uh which is that you can roll it
00:22:23.679 out to a certain percentage of users or uh of requests or whatever um you can
00:22:30.919 pick groups so for example roll out services to uh inside your site first um
00:22:38.039 or excuse me inside of your company uh so that you can dog food those Services before they hit production uh especially
00:22:45.400 make sure your CEO has access to these because he will be the most likely to complain about them if things go wrong
00:22:52.400 and things will get fixed much quicker uh if he is the one seeing them um but
00:22:58.520 you can also add in specific users uh if you want to give access to a few users uh say they're beta users or whatever
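A rollout check with the three features mentioned above (a percentage, groups or specific users forced in, and the ability to turn things off) can be sketched in a few lines. This is not Yammer's internal tool or the tool shown on the slide; the `Rollout` class and its parameter names are hypothetical, and hashing the user id with CRC32 is just one assumed way to get stable buckets.

```ruby
require "zlib"

# Hypothetical feature-flag helper; the names are illustrative only.
class Rollout
  def initialize(percentage:, allowed_ids: [], blocked_ids: [])
    @percentage = percentage   # 0 turns the feature off, 100 turns it fully on
    @allowed_ids = allowed_ids # e.g. employees or beta users, always in
    @blocked_ids = blocked_ids # users forced out of the rollout
  end

  def enabled?(user_id)
    return false if @blocked_ids.include?(user_id)
    return true if @allowed_ids.include?(user_id)

    # CRC32 gives each user a stable bucket from 0 to 99, so the same user
    # stays in the rollout as the percentage grows from 5 to 10 to 20.
    Zlib.crc32(user_id.to_s) % 100 < @percentage
  end
end

search_v2 = Rollout.new(percentage: 5, allowed_ids: [1])
puts search_v2.enabled?(1) # a user who was explicitly added
```

Setting the percentage to 0 doubles as the off switch for a misbehaving service, while the allowed list keeps internal users on the feature for dogfooding.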
00:23:07.400 um yeah uh so I'm gonna I'm gonna go to the question slide slightly differently
00:23:14.159 than most people uh which is I'm gonna ask you guys some questions uh which is like how do we make this easier uh this is
00:23:21.360 really annoying to have to do all that kind of weird stuff um how how could we build like databases
00:23:28.520 that make this easier how can we build Frameworks that make this easier uh yeah
00:23:33.919 um I can answer some questions maybe yes here a
00:23:42.600 mic have you looked at uh Chanko which is written by Cookpad to do rolling out
00:23:48.320 features to parts of users to uh no uh explain a little bit more about it um
00:23:54.880 Chanko is a allows you to roll out a a feature to a certain set of users or a
00:24:00.400 certain percentage of users and it's all baked into rails and provides a a framework to do part of it interesting
00:24:06.679 yeah yeah yeah that we we have our own internal Tool Set uh but yeah that that is like exactly the type of tool that uh
00:24:13.799 I would recommend using um which is the ability to roll out to both certain percentage users and also pick users
00:24:21.840 that can get into that rollout group um it's always nice to be able to force users into a group um both seeing
00:24:28.640 something and not seeing something uh yeah
00:24:43.240 absolutely um I recently heard of a technique I thought was rather interesting which was if you're going to
00:24:50.320 be modifying a table and upgrading your code to the
00:24:55.360 table um for almost I think zero down time you
00:25:00.760 just uh cause a copy of the table to go with the new schema element in
00:25:06.440 it and Meanwhile your old code is still working on the old table and then you
00:25:13.320 when that's done you then deploy code that talks to the new table
00:25:18.799 and then uh a final little task to bring up anything that's in the Delta between the
00:25:25.120 old and the new table and then you're you're kind of running and then when you're confident you can drop the old
00:25:30.679 table yeah just yeah that's that's an interesting strategy one of the I guess one of the downsides would be that you
00:25:36.159 have that like Delta of data between those um but if that's something that you can live with yeah that's absolutely
00:25:41.720 a great way to handle that um and it totally avoids the uh annoyingness of
00:25:47.279 like all of those steps um
00:25:53.840 yeah anyone else I saw
00:26:05.799 I'm interested how you manage the complexity of like sort of the the multi-step migration deploy that you
00:26:12.320 talked about at the beginning while you're striving for like simple push button deploys like that first part has
00:26:18.360 to be a manual process right yes it is a manual process uh poorly is the answer to that
00:26:24.320 um the so so when we when we look at uh the push button process that's all
00:26:29.399 application uh level push button process the database stuff is not quite as
00:26:35.000 awesome um we're looking to make it better uh
00:26:40.279 our our site stuff used to be not push button as well uh it's become push button um yeah uh
00:26:48.200 poorly uh yeah I I I I don't have a great solution for that uh the way the way that we normally do it now is we'll
00:26:54.559 kind of run one migration at a time uh manually usually there aren't a ton that that becomes
00:27:00.399 unreasonable um and it also gives us a way to kind of vet things that are
00:27:08.279 shipping uh to answer the question of one way we can actually do it easier is
00:27:13.559 um don't remove your tests until after your code is out um make sure that all
00:27:19.399 of your old tests are passing yeah so that you don't run into that that weird migration window yeah yeah absolutely
00:27:31.039 hi I'm just curious how you enforce this migration policy among all the engineers
00:27:37.080 right see it would seem like a complicated thing because yeah if there are multiple
00:27:42.200 steps you know a new guy is coming in he doesn't remember that step everything uh
00:27:48.039 poorly uh no uh yeah I mean it it's it's
00:27:53.159 a it's kind of a human management problem right like it's a it's something
00:27:58.679 that uh all of the engineers basically just kind of need to be on the same page about
00:28:09.480 um uh I would say trust your engineers uh you should be able to trust
00:28:16.440 your engineers
00:28:24.080 uh uh yeah I I I mean I I would still separate that into the like make your
00:28:31.399 database deploys separate make your database schema changes separate from your application deploys um for example
00:28:38.760 like creating a table is perfectly safe like you could do that at any time um it's not going to affect your
00:28:44.600 application at all unless you're doing something really weird uh but yeah uh I
00:28:52.039 mean it's it's hard uh it's it's basically a human
00:28:58.240 problem uh a kind of like Collective understanding problem um just kind of getting everyone on the same page uh
00:29:05.240 having having people kind of like ultimately responsible for uh the deployment of those migrations uh does
00:29:10.919 help um because there are usually other things that we need to look at uh when we're for example adding or removing
00:29:17.799 stuff uh for example our analytics team may be using uh some of that data and they need to know to upgrade their
00:29:24.039 scripts uh to get rid of those columns as well
00:29:36.080 um Etsy's done some talk on using code to do defaults versus using the database
00:29:41.320 to do defaults like Auto incrementing and what have you guys experimented with that to find so you're not locking your
00:29:47.760 tables when you're doing null value uh we have not really um I I would be
00:29:52.799 interested to try it though uh the the database constraints are nice
00:29:58.880 um but yeah I mean we we arguably don't get a lot of benefit out of them uh just
00:30:04.679 because we have a single application writing and reading from the database uh
00:30:11.840 it's kind of less useful to have those uh defaults in each application uh or I should say it's more useful to have them
00:30:17.519 in a centralized place um but yeah I it's it's definitely an interesting idea
00:30:24.000 that we haven't really explored very much um yeah uh let's do one
00:30:31.919 more basically what we're describing here are uh transitions between legal states of the production environment um
00:30:38.960 yeah and uh so are you familiar with anyone who is exploring um uh describing
00:30:45.000 those transitions at a higher level where we can say uh that I'm I'm starting at a state that has these known
00:30:52.240 constraints on it and I want to apply this set of forward transforms it kind of like we do with migration but at a
00:30:58.720 higher level yeah at a at a more like operational level you're saying um
00:31:04.760 vaguely yes uh I I don't know of any I'm not aware of any uh good way to do that
00:31:10.639 or kind of simple way to do that uh arguably this wasn't very simple either um but it is kind of more familiar to
00:31:18.440 people uh yeah I I would definitely be interested in learning about them though yeah uh all right cool thank you