Summarized using AI

Rails services in the walled garden

Niranjan Paranjape and Sidu Ponnappa • September 29, 2011 • New Orleans, Louisiana • Talk

The talk "Rails Services in the Walled Garden" presented by Niranjan Paranjape and Sidu Ponnappa at RubyConf 2011 explores the implementation of service-oriented architecture (SOA) using Ruby on Rails. It addresses the intricacies and challenges faced when building multiple services within a business environment, referred to as a 'walled garden'. This approach focuses on creating independent, evolving services rather than relying on monolithic applications. The speakers draw on their experience working with a large-scale project consisting of more than nine Rails-based services for a major hosting provider, detailing various problems encountered and solutions implemented. Key points discussed include:

  • Advantages of SOA: Services allow for independent development and deployment, better scalability, and facilitate integration across business verticals while maintaining localized control over API design. These benefits streamline workflows and reduce complexity across teams.
  • Challenges of SOA: The talk emphasizes some challenges, such as increased chatter between services leading to performance bottlenecks, difficulties in implementing authentication and authorization across services, and complications with database transactions in distributed environments.
  • Authentication and Authorization: The importance of a centralized authentication system to enable users to access multiple services without re-authenticating is discussed, alongside the implementation of OAuth2 as a robust solution. Centralized roles versus federated rules for service authorization are examined.
  • Performance Optimization: The speakers encourage implementing performance builds to monitor HTTP request times and improve service communication efficiency. Caching strategies, like server-side, client-side, and intermediary caching, are proposed as methods to improve response times.
  • Pagination in APIs: Efficient pagination is discussed, highlighting the need for presenting manageable data chunks in API responses. The ‘box paginate’ solution was introduced to make pagination easier when using Active Resource.
  • Event-Driven Architecture: The benefits of using event systems and message queues (e.g., RabbitMQ) for decoupling services and enabling asynchronous processing of events are explored, showcasing how these can replace problematic direct database integrations.
  • API Versioning and Transactions: Suggestions for handling API versioning and maintaining backward compatibility are provided alongside discussions on the complexities of managing distributed transactions.

The presenters conclude with considerations for standardizing coding practices across services, advocating for the use of tooling such as Chef or Puppet for configuration management. The overall takeaway from the presentation stresses the importance of designing robust, maintainable, and efficient API-driven services while navigating the unique challenges posed by service-oriented architectures in a corporate setting.

Rails services in the walled garden
Niranjan Paranjape and Sidu Ponnappa • New Orleans, Louisiana • Talk

Date: September 29, 2011
Published: December 12, 2011

In typical service oriented architectures, monolithic applications are sliced along domain verticals to create several independently evolving 'services' that can be used in combination to achieve various outcomes. Rails applications lend themselves to this architecture beautifully and are slowly making inroads in big organisations for this reason. One of the big problems with this approach is that analyzing and managing large quantities of data from multiple services to produce a result becomes very hard. What was originally a relatively simple task when all data sat in the same database as part of a monolithic application becomes, when split into multiple services, a whole different beast. This talk will focus on our experiences building a system involving about a dozen Rails-based services integrated over HTTP and XML, and the issues we had to deal with when working with large data sets. We will talk about:

  • Various problems we faced while building RESTful APIs which demanded asynchronous communication between two services
  • Places where this asynchronous communication needed to be initiated by the server, essentially a push mechanism instead of pull
  • Different approaches to this solution
      • Sharing a database using read-only 'remote models'
      • Creating read-only local caches at each consumer
          • Propagating updates by having the producer explicitly update each consumer
          • Propagating updates using a message queue, and the pros and cons of integrating with them in Ruby

RubyConf 2011

00:00:16.960 hi everyone, thank you for coming here this evening. i'm actually surprised to see so many
00:00:23.359 people at five in the evening, the last session, but yeah, thank you for coming. our talk is going to be
00:00:29.359 about rails services in the walled garden. there are two of us presenting here. this is niranjan, my
00:00:35.840 colleague; he's achamian on github and niranjan_p on twitter. and this is me, sidu; i'm kaiwren on
00:00:43.200 github and ponnappa on twitter. we both work at c42 engineering,
00:00:49.280 a small ruby consultancy based out of bangalore in india.
00:00:54.640 let's get started with a little bit of background about this talk, just to set some context.
00:01:01.039 so this talk is based primarily off our work building a suite of nine plus
00:01:06.320 services at a fairly large business. these services were a bunch of rails
00:01:11.600 applications that today automate entire data centers for one of the largest hosting companies in the world,
00:01:17.040 so it's probably one of the biggest deployments of rails services that
00:01:22.560 i've had a chance to work on. we've built a bunch of other stuff, but this was the
00:01:28.479 most interesting, because we were replacing a monolithic legacy application
00:01:34.479 that was causing a lot of pain to the business, was pretty old, and was mission critical. since then we've done other stuff,
00:01:40.960 a lot of api work, but i don't think any of it has come close to the scale and complexity of that
00:01:47.119 particular project. but we've done a bunch of work with two or three apis at that scale, so
00:01:53.439 almost without intending to, we became something of a focus group for apis back home.
00:02:01.360 so this talk is about rails and service oriented architectures in the walled garden. let me talk about
00:02:08.160 these things in bits and pieces, break them down, and talk about why they're relevant.
00:02:14.800 i think the single most important thing here is why we say walled garden. by
00:02:20.879 walled garden i mean services that run within a business and are not being consumed by people
00:02:26.720 outside the business. this allows us a certain amount of leeway in how we structure and write these apis, so
00:02:34.080 that we save on multiple things, primarily effort and cost. we don't have to go the full
00:02:40.640 rest hateoas, full-blown consumer api way, because we have far more control.
00:02:45.840 so these are the assumptions we make, and there is a certain structure to
00:02:52.400 this talk as well: it is going to be structured around the problems and the pain points we ran into when
00:02:58.239 building apis inside the walled garden. so i think the first question is
00:03:04.400 why. okay, so let's quickly summarize the advantages of using soa versus building
00:03:12.319 a monolithic application. first of all, it allows us to break our application along the multiple business verticals we have
00:03:18.800 in our organization, where each individual service provides an api to be consumed by that vertical,
00:03:25.040 while also providing integration points for other services to build organization-wide workflows.
00:03:31.519 every service is self-contained, which means that whatever development i'm doing in that
00:03:38.000 service doesn't cause ripple effects across all the other services, as long as my api remains constant. the way
00:03:45.040 it helps is that whenever we are building a complex workflow for a particular vertical, we don't have to worry about
00:03:51.440 all the other business domains which we don't understand or don't care about. essentially it means we can
00:03:57.360 independently evolve a particular vertical while allowing other verticals to integrate with it.
00:04:02.799 it also allows us to independently deploy a particular service. say, for example, i'm working on one
00:04:08.640 particular business vertical and i'm done in two weeks, but there is this other service which is being
00:04:14.959 developed and takes four weeks. i need not wait four weeks before i can deploy my service; i just have to make
00:04:20.239 sure that i'm not breaking the api, and i can push things to production as soon as i'm done with them.
00:04:26.800 it also allows us to monitor which services are being hit more frequently
00:04:32.720 and scale those services instead of scaling the entire suite of services. they are easier to maintain:
00:04:40.639 every codebase is small, and of course each codebase is independent, so the code quality is generally higher.
00:04:46.080 the bigger your code base gets, the messier it becomes and the more broken windows it will have, so it definitely helps.
00:04:52.400 we all know that smaller teams working on individual code bases are much better than one massive team of, say, 8, 10,
00:05:00.240 15, or 20 people working on a shared code base. so these are all advantages, and the list
00:05:06.880 continues, but there are some disadvantages to building a
00:05:12.320 service oriented architecture. so what are these disadvantages? first and foremost, services like to talk
00:05:17.680 a lot. they talk to each other, and this can lead to performance bottlenecks, because every service call
00:05:24.080 you make comes with its own http and framework overheads, which
00:05:30.800 we cannot avoid. we have to provide some sort of transparent authorization and authentication across all services; we
00:05:37.360 cannot just do authorization per service, because we want to implement workflows which span multiple services.
00:05:44.880 it is difficult to implement database transactions, like acid transactions, across distributed databases, and when it
00:05:50.960 comes to distributed services it's an even bigger problem, because we have to manage it over http.
00:05:58.319 then, any time we are talking about building services, we always talk about how we are going to version the
00:06:03.440 api and how to make sure that our apis are backward compatible: if there
00:06:08.479 are old clients, they should still be able to talk to new services.
00:06:14.000 and last but not least, continuous integration: making sure that there is a continuous integration
00:06:20.160 build running across multiple services, which does actual integration testing by spinning up all the services
00:06:25.919 and making calls which span all those workflows, is difficult.
00:06:34.160 okay, so we've gone through some sort of justification for why service
00:06:39.919 oriented architecture is important, but why is rails important in this context? well, frankly, it's the most
00:06:45.759 awesome thing i've seen for creating apis. it's ridiculously easy. it gives you powerful routing; i'm sorry, i was about
00:06:51.759 to say rooting, i'm indian, but powerful routing, which means that it is probably the most
00:06:58.240 convenient thing i've seen to map http requests to a block of code which then creates a resource for you. it gives you,
00:07:05.039 again, what's probably among the best mime type negotiation frameworks out of the box; if you try to do this with
00:07:11.360 something else, it's not that trivial. so dealing with different kinds of representations and content types is really
00:07:17.120 easy with rails. you have less boilerplate code; i mean, there is still some, but if you've ever seen a spring code base, believe me, rails is so much
00:07:23.680 better. and then there's the bottom line, which is that ruby is nice. given a choice between writing code in ruby
00:07:29.440 and writing code in java, there's no reason why i wouldn't pick ruby. but this begs the question: what about the other
00:07:34.960 popular frameworks? what about sinatra, what about padrino, what about other similar frameworks?
00:07:40.560 this talk doesn't discount those; much of what we're going to cover is relevant to those frameworks as well.
00:07:46.000 it's just that rails is extremely popular and gives you a lot of the things you need out of the box. so we're going to talk about rails, but
00:07:51.919 most of what we're covering is just as relevant for sinatra and friends as it is for rails.
00:07:57.599 and walled gardens: let's go a little further into what i mean by
00:08:03.039 walled gardens, beyond my original explanation. the first premise i'm
00:08:10.000 making here is that being inside a walled garden is easier. you don't have thousands of consumers, you don't have
00:08:16.400 your apis being consumed in unpredictable ways, and you have full visibility into the evolution of the clients that consume your apis,
00:08:23.039 because this is all within your organization, so it's easier to form a roadmap. what this means is that you can
00:08:28.319 avoid certain costs that you otherwise have no choice but to incur with public general consumers.
00:08:35.039 i'm also going to be using certain rest related terminology here which, if it's unfamiliar for whatever
00:08:42.159 reason, we can talk about in the q&a, but i'm just going to assume that people are familiar with it in
00:08:48.399 passing. so one thing is that if you're going to try to build a full hateoas architecture, it's extremely expensive:
00:08:53.920 it's expensive in terms of implementation effort, it's expensive in terms of performance, and quite frankly i
00:08:59.120 haven't seen a full richardson maturity model level three api in the
00:09:04.640 wild yet. i'd love to see one; if anybody's run into one and can point me at one, that would be awesome, but i haven't seen one. i've tried to build one;
00:09:11.279 it's pretty hard. so there is no reason to build a full hateoas architecture within the walled garden. hateoas
00:09:17.440 exists to solve a lot of issues that crop up when consumers are using your apis. rails does richardson maturity
00:09:23.519 model level 2 to 2.5, somewhere between two and three, pretty much out of the box. you can get two just like that; with a little
00:09:29.519 bit of effort you get a little beyond that, and this is free. now the richardson maturity model, for those of you that are not familiar, is a model which tells you
00:09:36.880 the level of restfulness of apis, and the highest level is three. so rails gets pretty close for free out of the
00:09:43.279 box. and while we're talking about this, let me quickly call out one glaringly obvious thing, which is that
00:09:49.360 rails is not rest. restful routes is marketing; having
00:09:55.120 restful routes doesn't mean your api is restful in the least, and using rails doesn't mean that your api is restful
00:10:00.560 either. it does take a certain amount of work. now, as i mentioned earlier, this talk is broken down into a series
00:10:06.480 of areas of interest that we found were bottlenecks while developing services, and that's what we're going to structure
00:10:12.640 this talk around going forward. this was the background, a little bit of context, and now we're going to start digging
00:10:17.839 into problems and solutions. so, first off, first and foremost,
00:10:22.959 let's talk about authentication. whenever we're building any services, we want to
00:10:28.000 guard those services and make sure that only a certain set of people are able to consume those apis.
00:10:34.320 now, the interesting thing is that when we are building authentication across services,
00:10:40.720 we need to make sure that we implement it in such a way that once i log into one service,
00:10:46.480 i can actually go through a workflow which spans multiple services without re-authenticating
00:10:52.640 against the different services.
00:10:57.760 now, when we are building restful services, obviously whatever authentication mechanism we
00:11:03.040 come up with has to be stateless, which means that i cannot use cookies, which rails generally uses for
00:11:09.839 managing authentication. now, the quickest, simplest, and cheapest way to get authentication done is to
00:11:17.279 just put a firewall in front of all your apis, get a range of ips which can access those apis, and you're kind of done: make
00:11:23.360 sure that every consumer is sitting within those ip ranges, and no one else will be able to access those apis
00:11:28.640 anymore. it will work in a small scenario, but as your services
00:11:34.560 grow more and more complex, you want a better solution. what you want is essentially some sort of centralized
00:11:40.079 authentication system, so that the user can authenticate against that central authentication system and then
00:11:45.440 start using apis which are behind the authentication firewall. now, there are multiple services we already
00:11:52.399 consume which do this: for example, facebook does it, github does it, twitter does it. they all allow us to
00:11:59.440 authenticate against them as a service, and we can simply use them, but
00:12:05.680 as we are inside a walled garden, we might not have access to those services, or we might not need those services for
00:12:11.920 authentication at all. so what do we do? we can simply create our own authentication
00:12:17.839 provider, like an oauth2 provider. there are multiple oauth2 providers out there, and we can
00:12:22.880 just use one of them and tweak it as per our necessity. now, why oauth2?
00:12:29.440 oauth2 is pretty much becoming the industry standard nowadays. it gives you very definite guidelines in terms of what you
00:12:35.440 need to do to actually achieve authentication across services. it's very simple to implement: if
00:12:41.440 you have a good net http client, you can implement an oauth2 client in a matter of, say, 30 minutes. there is a
00:12:48.399 sample client available in one of our open source projects; you can take a look at that if you want to.
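As a rough illustration of how small such a client can be, here is a sketch of an OAuth2 client-credentials token fetch built on nothing but Net::HTTP. The endpoint URL and credentials are hypothetical, and this is not the sample client mentioned in the talk:

```ruby
require "net/http"
require "uri"
require "json"

# Build the OAuth2 client_credentials token request (a form-encoded POST).
def build_token_request(token_url, client_id, client_secret)
  uri = URI(token_url)
  req = Net::HTTP::Post.new(uri)
  req.set_form_data(
    "grant_type"    => "client_credentials",
    "client_id"     => client_id,
    "client_secret" => client_secret
  )
  [uri, req]
end

# Perform the request and extract the access token from the JSON response.
def fetch_token(token_url, client_id, client_secret)
  uri, req = build_token_request(token_url, client_id, client_secret)
  res = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(req)
  end
  JSON.parse(res.body).fetch("access_token")
end

# Subsequent calls to protected services then carry the token in a header:
#   request["Authorization"] = "Bearer #{token}"
```

The token request itself is just a form POST, which is why the speakers can reasonably claim a client is a half-hour job on top of any decent HTTP library.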
00:12:53.839 but what happens if you are using active resource as your client? that's an interesting question, because
00:13:00.079 active resource by default doesn't give you any access to response headers or request headers, and oauth2 primarily
00:13:06.480 works on information passed through headers. so essentially, if we are not willing to monkey patch
00:13:12.160 active resource, active resource is out of the door. we have to consider two things: either we take active resource,
00:13:18.000 open it up, and make sure that we have access to response headers and can set headers for every outbound request, or we
00:13:24.639 simply reconsider active resource itself. now,
00:13:30.560 while we are building all these services inside a walled garden, we might have some
00:13:36.000 kind of external client. the client is not externally developed, that is, it's developed by us, but it's accessed by
00:13:41.920 external users. now, if we are already an oauth provider, we can essentially use any of the
00:13:47.920 external oauth2 providers and just get authentication done: we can use google oauth, we can use facebook
00:13:54.639 oauth, and let our consumers authenticate against those.
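To make the "reconsider ActiveResource" option mentioned a moment ago concrete, here is a minimal sketch of a replacement client that exposes both request and response headers, which is all that OAuth2 (and, later, ETags) needs. The class and method names are illustrative, not from the talk's codebase:

```ruby
require "net/http"
require "uri"

# A minimal stand-in for ActiveResource that exposes request and response
# headers. Default headers (e.g. the Bearer token) ride on every request.
class BareClient
  def initialize(site, default_headers = {})
    @site = URI(site)
    @default_headers = default_headers
  end

  # Build a GET with the default headers applied; extra headers win.
  def build_get(path, extra_headers = {})
    req = Net::HTTP::Get.new(URI.join(@site.to_s, path))
    @default_headers.merge(extra_headers).each { |k, v| req[k] = v }
    req
  end

  # Returns [status, headers, body] so callers can read ETags and the like.
  def get(path, extra_headers = {})
    req = build_get(path, extra_headers)
    res = Net::HTTP.start(@site.host, @site.port, use_ssl: @site.scheme == "https") do |http|
      http.request(req)
    end
    [res.code.to_i, res.each_header.to_h, res.body]
  end
end
```

Layering ActiveModel on top of something like this recovers most of the model-ish conveniences without ActiveResource's closed-off connection handling.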
00:14:00.560 okay, so next up, right after authentication, is authorization. this is pretty much a done deal in
00:14:07.279 rails if you're looking at standalone apps: user roles. we have dozens of gems that do this, and do it pretty well. now the hard thing
00:14:13.920 is when roles have to work over http, that is, authorization at the service level. now there are several ways to look at
00:14:19.519 this problem. you could potentially look at having centralized rules: you have an authorization service which
00:14:26.160 contains all the roles, the users, and the rules that apply to them. now the problem with this is that it gets messy and
00:14:32.320 fragmented. the whole point of having services is so that services can evolve independently; if you impose rules on
00:14:38.240 them from the outside, you're already constraining their ability to evolve independently. so what's the solution to
00:14:44.160 this? what we recommend, and what we've seen work for us, is to have centralized roles but federated rules. just have a
00:14:50.959 very clear set of responsibilities for your authorization service: it simply maps users to roles and nothing
00:14:56.480 more. beyond that, what those roles mean for particular resources in particular services is left to those
00:15:03.440 services. then we have the next thing. so, while talking about problems
00:15:10.560 with service oriented architecture, i mentioned that there is chattiness between services. now
00:15:16.160 let's just take an example. assume we have one client, two services which we are developing,
00:15:22.000 and one central auth server. the client makes a request saying, hey, i want this
00:15:28.639 information. the first thing the service has to do is validate with the auth server whether the requesting user is valid or not. once it
00:15:36.000 has gotten confirmation from the auth server that yes, go ahead and process this request, then to complete the workflow, service
00:15:41.920 one internally calls service two, passing the user information. now, service two cannot depend on
00:15:48.000 the fact that service one has authenticated the user, so it has to authenticate again. once it gets the go-ahead from the auth
00:15:54.079 server, it will actually generate the response and send it back, and then service one will process it further and send it
00:15:59.600 back to the client. and this is going to grow: the more services
00:16:04.880 you add to this workflow, your http graph is going to shoot up like crazy, and there is really no silver
00:16:11.759 bullet to solve this problem, because if you are building independent services talking over http, they are going to talk. but there are certain
00:16:18.399 things we can do to minimize the problems we will face with the chattiness of the application.
00:16:24.720 first and foremost, you want to make sure that your requests are fast. so early in a project you want to set up some kind
00:16:31.279 of performance build which makes sure that all the requests satisfy
00:16:36.399 certain criteria. say, for example, every get request should take less than or
00:16:42.959 about 40 milliseconds, and shouldn't grow beyond that. once this performance build is
00:16:48.480 in place, you should monitor the trends: while your services are growing and your applications are becoming more and more
00:16:54.720 complex, how are these response times changing, and what kind of tweaking do you need to make sure that those response
00:16:59.920 times are brought down again. now, most of the requests you are making
00:17:05.120 across your services are going to return smaller payloads, and you need to make sure that you are
00:17:11.760 optimizing your app servers and your web servers to handle these smaller payloads. say, for example, apache versus nginx:
00:17:17.039 which one do you want to choose as your web server? you need to consider that. and obviously
00:17:23.839 you want to keep the payload small, because if you try to return, say, 50 mb of xml, it is going to slow your entire
00:17:30.720 network down and potentially drag down the performance of the entire service suite.
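A performance build of the kind described earlier can start out as nothing more than a test that times a request against a budget. The 40 ms figure comes from the talk's example; the method names and the commented-out request are made up for illustration:

```ruby
require "benchmark"

# Fail the build if a request takes longer than its budget (in seconds).
BUDGET_SECONDS = 0.040  # the talk's example figure for a GET

# Times the given block and reports whether it stayed within budget.
def within_budget?(budget = BUDGET_SECONDS, &request)
  Benchmark.realtime(&request) <= budget
end

# In a real performance build the block would hit a running service, e.g.:
#   raise "too slow" unless within_budget? { Net::HTTP.get(URI("http://users.internal/users.xml")) }
```

Running such checks on every build is what makes the response-time trend visible as services grow.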
00:17:36.400 now, another thing you can
00:17:41.919 start introducing is caching. http 304 not modified is the best thing
00:17:48.480 you can do. you can implement server-side caching, and you can implement client-side caching.
00:17:54.000 rails comes with fragment caching and action caching. if you have resources which need authorization and authentication, you can
00:18:00.799 just action cache them, and it will take care of the before filters before actually returning a response.
00:18:06.720 if you have resources which don't need authentication, you can simply page cache them, and your web server will take
00:18:13.679 care of serving them directly out of your public directory. another thing you can do is
00:18:20.160 you can start using etags. but the catch there is that to use etags, you
00:18:25.440 want a client which can support caching. so essentially the client sends an etag in the request, and the server responds
00:18:32.559 saying, hey, this was not modified, just load it from your cache. and the catch here is, again, if you're using
00:18:38.559 active resource as your client, it neither supports caching at the client side nor does it allow you to
00:18:45.520 open up the request headers to put an etag in every request. so what can you do?
00:18:50.960 there is an open source library you can use which actually supports client-side caching, and if you slap
00:18:56.960 active model on top of it, you can build your own client and use that instead of
00:19:02.960 active resource. the only thing is, obviously, active resource comes with this whole plethora of functionality; you might not
00:19:09.280 have all of it, but you can introduce it as and when you need it.
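The conditional-GET behaviour being described (client sends the ETag, server replies 304 Not Modified) can be sketched independently of any particular client library. The transport here is injected so the cache logic is visible on its own; all names are illustrative:

```ruby
# Sketch of client-side conditional GET caching with ETags.
class EtagCache
  Entry = Struct.new(:etag, :body)

  # transport responds to call(url, headers) => [status, etag, body]
  def initialize(transport)
    @transport = transport
    @store = {}
  end

  def get(url)
    cached = @store[url]
    headers = cached ? { "If-None-Match" => cached.etag } : {}
    status, etag, body = @transport.call(url, headers)
    if status == 304 && cached
      cached.body                       # 304 Not Modified: serve from cache
    else
      @store[url] = Entry.new(etag, body)
      body
    end
  end
end
```

The second request for the same URL carries `If-None-Match`, and on a 304 the cached body is returned without re-downloading the payload.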
00:19:14.400 now we have covered client-side caching and server-side caching, but there is this whole layer of caching which
00:19:19.600 happens in the middle, over the internet. you might not need that if you are developing
00:19:25.039 a set of services which are local to one installation, but if your services actually span the globe, you
00:19:31.760 might want to consider some kind of caching in the middle. say, for example, you might want to look at varnish or squid to cache your resources
00:19:39.039 in between. now, whenever we talk about caching, you also have to take care of when and how
00:19:45.280 you expire the cache. for server-side and client-side caching there are known solutions to expire those
00:19:51.760 caches, but what you want to look at is, if you are actually putting some cache in the middle,
00:19:57.360 how to expire those squid caches or varnish caches, because otherwise that's going to come back and hit you.
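One concrete way to expire an intermediary cache entry: Varnish (given a suitable VCL rule) and Squid can both be configured to honour an HTTP PURGE request for a URL. Net::HTTP does not ship a PURGE verb, but adding one is tiny; whether the proxy accepts it depends entirely on its configuration, and the host below is hypothetical:

```ruby
require "net/http"
require "uri"

# A custom HTTP verb for Net::HTTP; proxies must be configured to honour it.
class Purge < Net::HTTPRequest
  METHOD = "PURGE"
  REQUEST_HAS_BODY = false
  RESPONSE_HAS_BODY = true
end

def build_purge(url)
  Purge.new(URI(url))
end

# To send it:
#   uri = URI("http://cache.internal.example/projects/42.xml")
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(build_purge(uri.to_s)) }
```

Issuing the purge from the service that mutated the resource keeps the responsibility for expiry next to the write.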
00:20:02.400 so, yeah, that's about caching. okay, so next up is something that doesn't usually get a lot of attention when
00:20:07.919 we're talking about services, which is pagination. let me just jump in with an example, and i hope that'll
00:20:14.400 illustrate it. the default index action in rails, which i'm sure you're all familiar with, is basically a Model.all;
00:20:20.720 this is what the template generates. this means that when you're doing a search and looking something up via the api, it's
00:20:26.159 going to return everything the database has. now, if you have 50k records, that's pretty much what you get.
00:20:32.480 so pagination is very important, and we do have a fairly nice solution for it
00:20:38.960 already, which is will_paginate. now, will_paginate runs off three pieces of pagination metadata: which page
00:20:45.120 are we on, how many records do we have in that page, and how many records do we have in total. now, these three pieces of
00:20:50.240 metadata, we need to start mapping them to http requests so that we can start paginating over
00:20:56.559 apis. so there are a few possible places, or ways, in which we can tackle this problem.
00:21:02.640 the first one, right off, is http headers: we can put these three pieces of metadata in the headers. it's a
00:21:08.000 little aesthetically displeasing, because this is not really what headers are meant for, and
00:21:13.440 it isn't really modeling the resource, because the resource you're modeling is really a collection, so these are intrinsic parts of the resource
00:21:19.919 itself; they're not metadata sitting in the headers. you could put them into the xml tag attributes, so the root tag carries these
00:21:27.200 three attributes: page, per page, and total count. that's another way to look at it. now, again, this is somewhat
00:21:33.039 displeasing because it isn't modeling the resource, same as before, and second, it depends on the fact
00:21:38.080 that you can have attributes in the first place: what if you're using json? and then, of course, there is what i was
00:21:43.360 talking about, which is to actually model this as it should be modeled, which is that you have a resource that is a collection. if you're going full hateoas,
00:21:49.440 you basically have these three attributes and then a bunch of uris. but again we come back to active
00:21:55.039 resource, which does not do this. so, stepping back: if you're using some
00:22:00.720 other client, that's cool, i would probably go with collection resources. but if you're using active resource, what
00:22:07.520 you want is something that's a combination of active resource and will_paginate, just like in a normal rails app
00:22:12.640 we have active record and will_paginate. so what we did is pick option number two, which is xml tag attributes, because
00:22:19.440 we found that this was pretty easy to implement across the board, and put that together to create
00:22:24.480 something called box_paginate, which you can find on github. this is something that we wrote to allow active resource
00:22:29.840 to transparently paginate, so it's something you could potentially look at as a solution. so, on to the next area.
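For illustration, option two above (pagination metadata as attributes on the collection's root tag) might produce XML like the following. This is built with stdlib REXML, and the tag and attribute names are chosen to mirror will_paginate's vocabulary rather than box_paginate's actual wire format:

```ruby
require "rexml/document"

# Serialize a page of records with will_paginate-style metadata as
# attributes on the root tag (page, per_page, total_entries).
def paginated_xml(records, page:, per_page:, total:)
  doc  = REXML::Document.new
  root = doc.add_element("projects",
    "page" => page.to_s, "per_page" => per_page.to_s, "total_entries" => total.to_s)
  records.each { |name| root.add_element("project").add_element("name").text = name }
  out = +""
  doc.write(out)
  out
end
```

A client can read the three attributes off the root element and hand them straight to a will_paginate-compatible collection, which is essentially what the attribute-based approach buys you.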
00:22:37.039 yeah so let's talk about a slightly more complex problems which we might face while building services
00:22:43.280 and that is now we have caching at the server side caching at the client side but sometimes what we want is not the
00:22:49.280 cache responses but the cached objects themselves now let's take a look at this scenario for example we have this user
00:22:55.840 management service which actually manages multiple users under different companies and we have this project
00:23:02.000 management service which actually manages projects for a particular user now we get a new requirement saying that
00:23:09.120 as an admin user i want all projects which are for a given company
00:23:14.240 now there's a problem because in normal monolithic application wherein we have everything available in one
00:23:20.159 place we can just fire a database query and get the result out but if you see this model
00:23:26.400 we don't have we have company information here and we have project information in some other service so we have to make sure
00:23:31.919 that we actually somehow get that information from other services and create the join ourselves so what we are
00:23:37.919 essentially doing is we are actually making an http call to an external resource to get all the user ids
00:23:43.840 belonging to a particular company and then fire a database query to get all the projects
00:23:50.799 where user id is in the list of user ids now this is a very trivial way to do it the
00:23:58.559 problem with that is if your user management service itself is
00:24:03.600 returning a paginated view of users in that case you have to fire multiple calls before you can get the
00:24:09.520 whole list of users which belong to a particular company now if that list of user ids
00:24:15.679 is really large in that case when you're firing the database query you'll probably hit the sql query
00:24:22.240 size limit like it will go beyond 4 kb and you won't be able to fire that query at all so
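A minimal sketch of that naive flow, with the user-management call stubbed out (the endpoint shape and field names are assumptions): the client has to walk every page of the users API, collect the ids, and only then fire the local query.

```ruby
require 'json'

# Walk a paginated users API to collect every user id for a company.
# `client` stands in for the real HTTP call (e.g. Net::HTTP against
# something like GET /companies/:id/users?page=N — names assumed).
def fetch_user_ids(company_id, client)
  ids  = []
  page = 1
  loop do
    data = JSON.parse(client.call(company_id, page))
    ids.concat(data['user_ids'])
    break if page >= data['total_pages']  # one HTTP round trip per page
    page += 1
  end
  ids
end

# Stubbed service returning two pages of results
stub = lambda do |_company_id, page|
  { 'user_ids' => page == 1 ? [1, 2] : [3], 'total_pages' => 2 }.to_json
end

ids = fetch_user_ids(42, stub)
# the local "join" would then be something like: Project.where(user_id: ids)
```

With many users this both multiplies HTTP round trips and produces the huge `IN` clause the talk warns about.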
00:24:27.440 how to solve this problem one way to solve this problem is essentially to have services share their data
00:24:33.840 with other services directly through the database but i seriously wouldn't recommend that what
00:24:40.320 happens if service one starts writing into the database of service two
00:24:45.600 essentially it defeats the whole purpose of going service oriented in the first place because now
00:24:51.760 every service knows about every other service's internals the other way to do it to prevent the actual
00:24:58.320 data writing is we can probably give read-only access on the connection like create a separate
00:25:03.600 database user which only has read privileges on the schema and share it or maybe create a master slave database
00:25:09.840 configuration and share a read only slave copy now going by this approach definitely gives
00:25:17.039 you some benefits first of all everything is immediately consistent so the moment i update a user in my user
00:25:22.400 management system and update my report which gives me all the projects in the company that report will automatically
00:25:28.960 get refreshed immediately without doing anything it is fairly easy to implement i just
00:25:34.559 have to drop another configuration in the database.yml configure my active record to talk to yet another database
00:25:40.799 connection and i'm kind of done but there are problems with this kind of approach
00:25:46.480 first of all we are integrating different services at the database level
00:25:51.600 so essentially we are breaking the encapsulation which the service otherwise provides over its internal data and we
00:25:57.679 don't want to break that encapsulation what if there are computed fields which are present in the external
00:26:03.440 representation of a particular resource provided by the service those computed fields won't be
00:26:09.360 available when you are directly hitting the database we are assuming that there is a one-to-one relationship between the
00:26:16.159 resources exposed by the api and the tables in the database and that might not be
00:26:22.000 true essentially you are exposing internals of one service to another service
00:26:27.600 which means that i cannot now independently evolve and modify my second service even though i'm
00:26:33.200 maintaining my apis properly so
00:26:38.400 case in point my user management system is currently using mysql as a data store and i find out that it's really becoming
00:26:45.840 expensive and i want to shift to some kind of nosql database now i can no longer do it because there's some other
00:26:51.440 service which directly talks to my database and hence i'm kind of tied into that solution
00:27:03.120 possible so what is the other solution to implement the same thing before we talk about that
00:27:09.039 let's talk about another problem we have which has a similar solution
00:27:14.559 so something that comes up pretty often in most systems is an implementation of the observer pattern and when
00:27:20.640 an event occurs somewhere somebody else wants to know about it and you want to decouple these two so a trivial example
00:27:26.559 of this in the case of services is a post logout event right i have a user service my user logs out there are a
00:27:33.039 bunch of other services that want to know about this either to simply expire their sessions if they have one or maybe
00:27:39.120 they have some other things that trigger off that event now there are a couple of ways we can do this the most simple thing we can do is have a bunch of
00:27:45.279 callback uris right so my client services register with the parent service saying call
00:27:52.159 me back at this uri when a user logs out or when any other event happens and that's it when the event occurs the
00:27:59.039 main service will do the callback and as you can tell by my description you have callback hell it's really hard
00:28:04.960 to explain therefore it's probably hard to understand your configurations also get complex and then with all of these
00:28:10.799 callbacks your response times if you're doing this within the request response cycle start to get really really long you could of
00:28:17.279 course solve this by having async callbacks right just throw backgroundrb or something similar at it and have these callbacks done in the
00:28:22.960 background another way to look at this is to introduce an mq system like rabbitmq or
00:28:28.799 some similar tool this gives us a centralized bus which is an interesting difference from the
00:28:34.960 previous one in the previous case you had to know every single service that you were interested in and explicitly go
00:28:40.480 register yourself with that now here you just have a centralized bus and then you register listeners to that and then you
00:28:46.320 get all your events coming to you right there so you just have one authoritative source
00:28:51.360 it's async out of the box which is an interesting difference from the previous approach where you consciously have to make the decision to
00:28:57.440 make it async and you probably want to do some sort of convention over configuration with
00:29:02.559 regards to your topics or you know whatever names you use for your events
00:29:08.720 but otherwise these two approaches are broadly similar and you can pick and choose between them
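Either way, the mechanics reduce to an observer registry: consumers register interest in an event, and the publisher notifies each of them when it fires. A minimal in-process sketch (the class and the injected poster are hypothetical; a real system would POST to the callback uri, or publish to the bus):

```ruby
# In-process sketch of the callback registry both approaches boil down to.
# `poster` stands in for the real HTTP POST or bus publish, injected so
# it can be stubbed here.
class EventCallbacks
  def initialize(poster)
    @poster    = poster
    @callbacks = Hash.new { |h, k| h[k] = [] }
  end

  # e.g. the project service registers for the user service's logout event
  def register(event, uri)
    @callbacks[event] << uri
  end

  # notify every registered consumer when the event fires
  def fire(event, payload)
    @callbacks[event].each { |uri| @poster.call(uri, payload) }
  end
end

notified = []
bus = EventCallbacks.new(->(uri, payload) { notified << [uri, payload] })
bus.register(:logout, 'http://projects.internal/session_expired')
bus.fire(:logout, user_id: 7)
```

The MQ variant simply moves this registry out of the publishing service and into the broker, which is what makes it async and centralized by default.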
00:29:14.559 but they're interesting because they lead up to a solution to what niranjan was talking
00:29:19.600 about earlier so i'll let him continue with that yeah so
00:29:25.200 again going back to the problem we had and now going to the solution of that problem which is neater than actually
00:29:30.799 sharing the database across multiple services now assume that we actually set up some kind of event system which triggers an
00:29:37.919 event when resources are created updated or deleted and my second service which
00:29:44.000 actually wants a local cache of the first service which is triggering those events can listen to
00:29:50.559 those events and create resources in its own database in this way i'm not depending
00:29:55.840 on the database integration but what i'm depending on is whatever
00:30:03.520 events are triggered whatever payload is in that event i take that payload modify it the way i want it for
00:30:09.279 my local cache and store it so that i can actually fire the joins which we could have fired in a
00:30:15.120 monolithic application in these distributed services as well but
00:30:20.640 you have to keep one thing in mind that whatever you're storing in our local database is still a cache it
00:30:26.399 doesn't have any actual connection with the resource which is sitting behind some other service so we should treat it like a
00:30:32.720 cache we shouldn't modify it locally because those changes are not going to get reflected anywhere else in the system and essentially
00:30:42.159 the reports you are generating are going to be bad if we are going with an event system
00:30:47.919 which is asynchronous another thing you need to keep in mind is that whatever local cache we have
00:30:53.360 might not reflect in real time the actual data which is present behind some other service
00:30:58.399 and as it is not real time you need to pick and choose those resources which you want to cache locally
00:31:04.480 keeping in mind that whatever i'm getting there might be some lag
00:31:09.519 in the reflection now whenever we are caching anything we have to talk about how we are going to expire the
00:31:15.679 cache so obviously if i'm listening to creation of users in the user management system i also have to listen
00:31:22.559 to things like the update event and delete event and accordingly modify my local
00:31:28.320 cache so that it actually reflects what is there on the server and obviously the server has to trigger those events with a
00:31:35.279 payload which allows consumers to modify their local caches
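As a sketch of that consumer side (the event shape and names are assumptions, not from the talk): the service replays create/update/delete events into a local store it treats as read-only, and runs its "joins" against that.

```ruby
# Local read-only cache of another service's users, kept fresh by
# replaying its create/update/delete events. Event shape is assumed;
# in a real system apply() would be driven by the callbacks or the bus.
class UserCache
  def initialize
    @users = {}
  end

  # Apply one event from the user-management service
  def apply(event)
    case event[:action]
    when :created, :updated then @users[event[:id]] = event[:payload]
    when :deleted           then @users.delete(event[:id])
    end
  end

  # The local "join": which cached user ids belong to a company
  def user_ids_for_company(company_id)
    @users.select { |_, u| u[:company_id] == company_id }.keys
  end
end

cache = UserCache.new
cache.apply(action: :created, id: 1, payload: { company_id: 9 })
cache.apply(action: :created, id: 2, payload: { company_id: 9 })
cache.apply(action: :deleted, id: 2)
ids = cache.user_ids_for_company(9)
```

Handling the delete and update events is the cache-expiry story from the talk: without them the local copy silently drifts from the owning service.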
00:31:43.679 so this is something that we again mentioned earlier api versioning unfortunately there doesn't seem to be
00:31:50.240 any holy grail here the pure rest guys talk about apis that
00:31:55.919 practically never need versioning but in practice on the internet we see uris with v1 v2 v3 and so on and so forth now
00:32:03.360 there are a few ways to deal with this problem
00:32:08.559 inside of a walled garden and a few guidelines one thing is to just ensure that your resource representations
00:32:14.799 remain as far as possible backward compatible and that when you're changing them or especially when you're removing
00:32:20.000 fields from your resources you're notifying your clients which you can do because you know they're all in the same
00:32:25.840 organization as you are you may also want to consider having a
00:32:31.200 front-end server which routes to different versions of the application entirely so
00:32:36.399 you actually let version one run for a while when version 1.1 is out there so that clients even though
00:32:42.799 they're within your organization have time to migrate and change and yeah that's pretty much it just a little
00:32:48.480 bit of common sense and a little bit of communication within the organization can work wonders and remove api versioning as a pain point
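That front-end routing can be sketched as a tiny dispatcher that picks a per-version app from the uri prefix (a simplified, Rack-inspired shape; all names here are hypothetical):

```ruby
# Route requests to a per-version app so v1 can keep running
# while clients inside the organization migrate to v2.
class VersionRouter
  def initialize(apps)
    @apps = apps  # e.g. { 'v1' => v1_app, 'v2' => v2_app }
  end

  def call(path)
    version = path.split('/').reject(&:empty?).first
    app = @apps[version]
    app ? app.call(path) : [404, 'unknown version']
  end
end

router = VersionRouter.new(
  'v1' => ->(path) { [200, "served by v1: #{path}"] },
  'v2' => ->(path) { [200, "served by v2: #{path}"] }
)
status, body = router.call('/v1/projects')
```

In practice this role is usually played by the front-end web server or load balancer rather than application code.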
00:32:56.559 now the transactionality of the various calls we are making specifically for those
00:33:02.240 workflows which span multiple services even in distributed databases it is hard
00:33:07.840 to achieve there is something called two-phase commit and databases actually implement two-phase
00:33:13.440 commit depending on which database you are using to achieve transactionality across distributed
00:33:19.760 databases it is even harder when we are talking over http and honestly speaking
00:33:25.360 though we have seen one such implementation working we haven't worked on it and that would be a
00:33:31.360 40 45 minute talk in itself so i'm just going to take it offline and if anyone is interested we can talk about it but
00:33:37.519 yeah i'm going to keep that aside for the time being okay and then we have one of the
00:33:43.440 hardest things to deal with which is just the core engineering the software engineering aspects of the
00:33:49.519 whole development process this is something that i picked up on one of my projects which is
00:33:56.559 standardize like crazy right standardize everything if you have any tooling any libraries that are part of
00:34:03.519 one service make sure that all services use them versions of libraries should be the same as far as possible ensure that
00:34:09.919 every service every code base uses the same tools same versions across the board if you're doing an upgrade calculate the
00:34:16.079 cost and upgrade across the board because engineers working on different code bases shouldn't have any surprising
00:34:21.359 things to deal with different ways of implementing the exact same things in different services within an
00:34:26.399 organization if you have to deal with that it can be quite a pain treat shared code between services like
00:34:32.320 any library you may want to go as far as to spin up your own gem server create gems that are purely internal not
00:34:37.760 published out there and have bundler use those gems like go that far
00:34:43.200 because this is an important aspect of standardizing shared code between all the different services
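For instance, a Gemfile can pull those internal gems from a private gem server alongside public ones. This is a config sketch only: the host and gem names are made up, and the block form of `source` needs a reasonably recent Bundler.

```ruby
# Gemfile -- shared internal code pulled from a private gem server
source 'https://rubygems.org'

# gems published only to the organization's own gem server
source 'https://gems.internal.example' do
  gem 'shared_auth_client'   # e.g. a common OAuth 2 client wrapper
  gem 'shared_api_helpers'
end
```

The point is that shared code gets a version number and a release process, instead of being copied or symlinked between code bases.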
00:34:49.919 another smell to keep an eye out for is conflation of human consumption of apis
00:34:55.119 with machine consumption that is you know are you really producing an api or are you producing html so this kind of
00:35:01.200 code i'm not sure how clear this is is a smell if you see an action that
00:35:06.560 has both format xml or format json or something similar along with format html in the same action this is a smell
00:35:12.640 watch out for this as far as possible you want to try to segment your actions so that they either deal purely
00:35:19.280 with machine consumable apis or with people the two don't fit well together there is a mismatch there
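The shape being described looks something like this Rails controller fragment (illustrative only, not code from the talk):

```ruby
# Smell: one action serving both people and machines
def show
  @user = User.find(params[:id])
  respond_to do |format|
    format.html                        # rendered for a person in a browser
    format.xml { render :xml => @user }  # consumed by another service
  end
end
```

The two audiences evolve at different rates, so coupling them in one action means every UI change risks the API and vice versa.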
00:35:24.880 there is a whole spectrum of separation that you can create starting from simply saying all right i'm going to make sure
00:35:30.800 that some controllers deal purely with apis and some controllers deal purely with people to all the way to the other
00:35:36.800 extreme which is to say i will build entire applications that have nothing but apis and no html at all and i will
00:35:43.040 build a separate application that is for people and that will act as a front end to this one the new
00:35:48.320 twitter ui is probably a good example of something like this even though it's for public consumption
00:35:53.440 configuration management can become something of a nightmare across multiple services so don't commit configuration
00:35:58.880 into version control use something like chef or puppet to manage it so if you're looking at multiple servers
00:36:05.119 like if you have 10 services you're probably looking at a minimum of 30 or 40 servers deal with it using chef and
00:36:10.960 puppet standardize your configurations across the board have the same
00:36:16.560 web servers same caches same everything so yeah that's pretty much our talk that's it
00:36:22.320 and uh we're open for questions and answers
00:36:41.760 awesome i'll check that out anybody else any questions
00:36:47.200 sure
00:37:00.560 uh we we actually started off with a system that was monolithic and already handled a certain volume of transactions
00:37:06.640 so we were looking at i don't remember was it 30 000 requests per hour something in that range and that was only us we were
00:37:13.440 a team that was working on one service so our service got 30 000 requests there were other services that probably
00:37:19.440 got a lot more sure
00:37:36.640 yes
00:37:51.520 right so that's exactly what i was saying earlier that uh you know most of what we're saying applies to everything
00:37:56.800 in the ruby ecosystem so my example really was in contrast with not rails versus sinatra but it was more
00:38:03.520 rails versus spring or cakephp or django or something like that
00:38:13.440 what about
00:38:18.480 sure so we didn't really address that but one of the things that you want to look at is unless you want your
00:38:25.040 system to actually go down you want to have that performance build monitoring your
00:38:30.560 entire call graph if that's the right phrase to use so that at any
00:38:35.599 given point in time you don't have timeouts now i was on one story that looked at exactly this where
00:38:42.720 one request fanned out into about 200 requests and took about i don't know three minutes or something crazy like that and you just
00:38:49.200 don't want to get there right the simple solution to that is don't get there like track these things
00:38:54.720 up front because there is no good answer the other thing to do is to go async and async is a whole different bargain if
00:39:00.800 you're looking at an async system where your request response cycle has a bunch of stuff happening in the background that's a whole different ballgame from
00:39:06.400 this so this this pretty much says that okay if you're going synchronous from request to response just manage your
00:39:06.400 this so this pretty much says that okay if you're going synchronous from request to response just manage your
00:39:23.040 has 135 services behind it or something similar so we know it's doable i haven't done it
00:39:28.960 for 135 i've done it for a dozen so yeah no good answer just track those trends
00:39:34.079 and make sure that it isn't going wild sure
00:39:53.839 sure i'm not sure people at the back got that the question was what's the reasoning behind going with
00:39:59.359 oauth as opposed to something like ssh tunneling or any other solution that we could come up with you want to take this
00:40:05.200 one yeah so essentially as i said oauth is pretty much standard and it's easy to
00:40:10.319 implement and it gives you a standard set of guidelines that hey put this in your
00:40:15.520 http headers and you're dealing with the http stack you're not going for your authentication and authorization
00:40:21.920 beyond the http stack now imagine that you have
00:40:29.359 some servers deployed on ec2 some are in rackspace cloud everything is under your control
00:40:35.520 but you want to maintain them at separate places just for maintaining the redundancy and making sure nothing goes down and everything is behind the
00:40:42.240 firewall how are we going to open ssh tunnels across different services
00:40:47.440 so there can be a bunch of such problems which will crop up again do you want to have ssh access to your
00:40:53.200 production box or not is the first question so the
00:40:58.560 other response i would have is that again it comes back to standardization oauth 2 is understood respected and
00:41:05.599 it isn't surprising and whatever performance losses you may have
00:41:10.640 using oauth 2 clearly there are a bunch of businesses that are working despite
00:41:15.680 those so it is potentially doable it's just a question of implementing it in the right way and that again is
00:41:21.920 very implementation specific it depends on your setup so on that side do i have a good formula to say can you
00:41:28.800 do this and make sure your oauth 2 is going to be really fast no i don't you're just going to have to profile
00:41:34.720 it and make sure that you're doing certain things right anybody else
00:41:42.000 yeah i see one hand there
00:41:47.119 approach that we um sure sure we just added a representation to
00:41:54.160 all of our resources
00:42:04.160 right that's pretty interesting so you basically have a subset of information plus a pointer to whatever
00:42:11.599 else somebody might need
00:42:16.960 right so somebody wants it they go ask for it that makes a lot of sense okay great anybody any other questions
00:42:26.640 no great awesome so thank you everyone thank you very much