Summarized using AI

Advanced EventMachine

Jonathan Weiss • September 29, 2011 • New Orleans, Louisiana • Talk

Summary of Advanced EventMachine

In this talk presented at RubyConf 2011, Jonathan Weiss explores advanced topics in EventMachine, a powerful event processing library for Ruby. He shares insights from real-world applications where EventMachine is extensively used, particularly in the context of deploying and scaling applications in a production environment.

Key Points Discussed:

  • Introduction to EventMachine:

    Weiss compares EventMachine with Node.js, highlighting its non-blocking I/O capabilities that allow multiple I/O streams to be handled simultaneously through callbacks, making it a popular choice for applications requiring high concurrency.

  • EventMachine's Use Cases:

    One significant use case presented is handling a large volume of log files and uploading them to Amazon S3. Weiss illustrates the inefficiency of sequentially uploading files and contrasts it with the efficiency achieved using EventMachine, which allows multiple files to be uploaded concurrently.

  • Event Loop Mechanics:

    Understanding the event loop is crucial when using EventMachine. Weiss emphasizes that blocking the event loop, for instance by including synchronous calls, leads to performance degradation. He stresses that all I/O operations should be non-blocking to maintain responsiveness in the application.

  • Handling Long-Running Operations:

    The talk covers strategies for managing lengthy operations without blocking the event loop, advocating for the use of constructs like EM.next_tick to break work into smaller segments, or EM.defer for offloading long computations to background threads.

  • Deferrables and Structuring Code:

    Weiss discusses the use of deferrables in EventMachine, which provide a way to manage complex callback scenarios. This mechanism allows callbacks to be registered in a more organized and cleaner manner, thereby enhancing maintainability.

  • Error Handling and Testing:

    The speaker touches on the challenges of testing applications built with EventMachine, suggesting strategies for isolating domain logic and integration tests that simulate event behaviors to verify expected outcomes. He provides insights into handling exceptions effectively to prevent crashes in the event loop.

  • Optimization Techniques:

    The use of queues, channels, and iterators for managing tasks and inter-thread communication is highlighted as a way to keep the code organized and effective. These structures help streamline code execution across different threads.

Conclusion:

Weiss concludes that while EventMachine offers powerful capabilities, developers must remain vigilant to avoid blocking the event loop and ensure that all operations are efficiently structured. The potential of EventMachine can lead to significant performance improvements when applied thoughtfully in scalable architectures.


Advanced EventMachine
Jonathan Weiss • New Orleans, Louisiana • Talk

Date: September 29, 2011
Published: December 12, 2011

We use EventMachine heavily in production. It handles uploads to S3, manages thousands of messages a second, and distributes agent workload. This taught us a lot about EventMachine and some weird corner cases. I want to talk about such advanced EventMachine topics and share some use-cases and experiences from the trenches.

RubyConf 2011

00:00:17.199 so welcome to my talk about advanced event machine um i'm from from berlin germany we do a lot
00:00:24.320 of work around deployment and scaling because we work on scalarium which is um
00:00:30.720 an amazon ec2 cluster management solution where we handle lots of lots of connections and manage
00:00:37.120 uh thousands of machines and this is like the way we discovered event machine and and
00:00:42.480 started to use it and the idea of the talk is to to to describe a couple of patterns we
00:00:47.680 came across in a couple of situations that you should aware of um if you use event machine
00:00:53.199 um before we dig into event machine um who already uses event machine
00:00:58.399 oh so quite a few good so i have to i can keep the introduction
00:01:04.320 uh pretty short um yeah so event machine as it seems a lot of you
00:01:09.600 guys already know is a very simple and and event processing library for ruby um
00:01:16.720 kind of uh yeah the node.js for for ruby if you want to so the idea is that i can
00:01:21.920 have a lot of multiple io streams without blocking on them so that i can execute them what looks like in parallel
00:01:29.759 and and whole programming of your machine is done via callbacks so very similar to node.js and the cool
00:01:36.079 thing about event machine is that it already brings a lot of a lot of protocol implementations for
00:01:42.640 the typical stuff like udp tcp http smtp and so on so it's very simple
00:01:48.159 to use and because it's ruby it's very easy to extend and to reuse your existing libraries which is also
00:01:54.479 kind of the culprit of it because it's very easy to uh to break it if you use existing libraries it's very
00:02:00.479 tempting as we'll see in a couple of minutes to to just require a gem and without knowing it every everything
00:02:07.680 gets slow and a lot of unexpected things happen yeah um so this is what you should definitely
00:02:14.480 have a look at um you have to think before you use or require a new library um so the the basic idea is turning like
00:02:23.040 synchronous step-by-step sequential code like this one into um callback a
00:02:29.680 callback oriented code where i um i execute i call
00:02:35.040 command and register callback that will be called once the the command is finished
00:02:40.480 and my code can move on so it doesn't block on the load data call and i get notified once
00:02:46.640 the the process is finished and this this idea is pretty pretty simple i mean every
00:02:52.160 people especially in the the rails community know it like if you program javascript uh in the browser with with uh like dom
00:02:58.800 um notifications and stuff like that so it's it's nothing too unusual um but it's pretty pretty uh
00:03:05.599 it's it's pretty easy to to get confused once you have a very deep nesting of callbacks um but the the the simple ideas is very
00:03:12.879 simple so let's maybe like step back and explain to you how we came to across event
00:03:19.040 machine and why we started to using it our simple use case was we
00:03:24.400 process a lot of events and all of those events we generate log files and we store them on amazon s3 so the basic idea is somehow
00:03:31.680 we generate a log file and we have millions of those and we then want to upload those to s3 so
00:03:36.799 the typical the typical step would be just a sequential call you have a log file you post it to amazon s3
00:03:43.680 you wait while you're uploading you wait while amazon is processing the the data and then you get a response back that
00:03:49.760 says i've got the file so nothing nothing's fancy here but if you if you have a lot of log files and
00:03:56.239 you and you do it with a lot of files it looks like this right so i upload a file i wait i wait for the
00:04:02.319 upload i wait for the processing it's done okay let's take the next one yeah upload the next file waiting uh i get
00:04:09.840 the response so i can upload the next one and so on so the time it takes is is the sum of the individual times yeah so it takes me a
00:04:16.720 long time and for most of the time my process is just sitting there waiting for networking waiting for amazon to
00:04:22.240 to respond if you do the stuff like this with event machine um the the picture is a very different
00:04:28.960 one because i can now upload multiple ones at at once yeah i can maybe then get the first response back i can
00:04:35.040 upload the next one and then i get they get the responses for all the other ones back so the the time that that i'm spending is
00:04:41.919 is like roughly um like the the maximum number for the individual like the long i have to wait for the longest request
00:04:48.160 yeah it's it's only roughly because it depends like on how how many parallel processors you're doing it
00:04:53.280 and on other things but you can you can optimize this a lot with stuff like that so for
00:04:58.560 for example in our case we had like i think a six times uh
00:05:04.880 increasing throughput of of managing all of those log files just by doing it with event machine
00:05:11.520 so this is uh the theory behind it um we also wrote a small uh library to do it so um this is like the code if you would do
00:05:18.720 it uh first with right_aws and then with happening which is a small
00:05:24.720 library that we wrote um and as you can see you have a little bit more boilerplate code in order to
00:05:30.000 get event machine running so we have em run and then at the end we have em stop but the actual code is not so much
00:05:35.680 different apart from the the callback part where i wait for the response to to come back to
00:05:40.960 me if you're interested in happening it's as i said a small library um
00:05:47.600 if so apparently the important thing about about event machine is the event loop right uh
00:05:53.199 it's in the name um and this is was for me the most important thing in order to
00:05:59.120 to really understand how event machine works and where the culprits are is to really understand how the event loop
00:06:04.560 and the the corresponding threads are working so the setting up
00:06:09.919 the event loop is pretty easy you just call em.run and at some point you stop it again
00:06:15.520 and em run is is an endless loop it will start the the event machine loop
00:06:21.520 and it will um not return unless you stop it um so so if you have like small parts small
00:06:27.919 agents small demons that are running event machines usually this is like the first and the last call that you're
00:06:33.919 doing yeah you're calling event machine and inside the loop you're setting up your your callbacks and then you do nothing
00:06:39.360 and wait for event machine to process all events and come back to you um what you don't see is what what
00:06:45.919 happens in the background is so your code is actually a very small part of what is what is happening
00:06:51.120 um but your code can break anything else and and this is uh unfortunately the the thing that you
00:06:56.560 always have to have in mind if you if you program with event machine is you have to be very careful what your code does and where it does it because
00:07:02.479 it's very easy to break um to break the loop so what you do is your code your register
00:07:09.120 your callbacks and during such a loop run what event machine is doing is it's running all timers that
00:07:15.759 you registered it's uh checking uh file descriptors it's uh checking sockets networking stuff like that
00:07:21.120 and then it calls your code if necessary and then the next iteration starts so the whole idea is this loop is
00:07:27.919 endlessly running and always um and always in this line so checking file descriptors checking networking
00:07:33.680 your code if any callbacks that you registered and next round and um what what you have to keep in
00:07:40.720 mind is that you you you should not um disturb this this uh cycle and unfortunately it's
00:07:48.000 very easy to do so but if you disturb it if you somehow break it by by your code being too slow
00:07:53.759 everything falls apart so um how can you break it and what you
00:07:59.280 shouldn't do is stuff like that yeah putting in the in the main loop like a sleep yeah of course you nobody
00:08:04.800 puts a sleep in there but uh doing any synchronous calls is pretty much the same thing
00:08:10.960 um we're now blocking the main event loop because we're waiting for an http call and if this if this server that we're
00:08:17.759 getting the the response from is very slow everything is waiting on on this um on the server so so you shouldn't do
00:08:26.479 any synchronous calls inside the event loop
00:08:31.919 and this is the part where i said in the beginning it's very easy to break it even by not by not knowing because if
00:08:38.399 you if you like you have a class that is running somewhere inside the event loop and you just need a small new
00:08:43.599 functionality and you just require a gem somewhere that does a couple of http calls that
00:08:48.880 uh does some networking whatever it's very easy to to like for this code to sneak into the event
00:08:54.480 loop and then you're wondering why is everything processing so slow why are the messages piling up um because you just yeah you blocked the
00:09:01.360 loop so um an example case how we did it for example is
00:09:06.800 we use nanite which is an agent framework on top of amqp event machine and rabbitmq
00:09:12.880 and what it does is it distributes messages across agents and we pretty much
00:09:20.080 broke like the whole setup by just having the the default um behavior of the amqp gem which is if there
00:09:26.880 is data on the socket it reads it and if you have a lot of messages a lot of data it reads a lot which means
00:09:32.480 um if you look back to the to the loop if you're if you're if the reading part becomes too big
00:09:37.600 because too much work you're slowing down the whole loop so so our code was essentially like reading
00:09:43.839 thousands of messages then we had one iteration which means like callbacks were notified once and then we're again reading thousands
00:09:49.920 of messages which means we slow down the whole cycle which means effectively everything is is
00:09:56.480 running to a halt um and we did it by just like using the
00:10:01.519 amqp gem which is already evented and stuff like that so it's it's very easy to do it so what what we need to do
00:10:06.560 is to somehow split the work and make sure that our code is not running for to uh for too long
00:10:14.240 um and you can you can um how you can do that we'll show in a minute but the main
00:10:19.839 idea is that inside the main reactor loop which is the the main thread you shouldn't do
00:10:25.680 anything that that is slow you shouldn't do anything that could potentially block it you should also if you're not calling
00:10:31.839 any synchronous code only asynchronous code you still shouldn't do anything that takes too long because
00:10:37.279 as i just saw in our case we didn't do anything synchronously but because it was so much work we still
00:10:42.640 blocked the loop um so the main idea is to try to keep everything as short as possible also to do all i o handling
00:10:48.480 inside uh the main loop and uh avoid yeah avoid non-evented libraries
00:10:54.880 um i can i cannot stretch this point too much because it's so easy to just
00:11:00.160 require a library this is also like why i think node.js is like more successful in
00:11:05.600 inventors applications it's not because they have like a better programming model better libraries whatever it's just because
00:11:11.519 there fortunately everything is evented so there is no like synchronous library that you can load
00:11:17.760 while on ruby if you just like do uh like a file.open um operation yeah you just use the the basic libraries
00:11:24.399 that are part of the of the standard library um you're already using some
00:11:29.519 something synchronous that is blocking you um so yeah take care and not do not use anything
00:11:35.279 that is uh not evented and the other thing is as i said the um
00:11:41.440 your callbacks are running as long as they take so make sure your callbacks are not taking too long
00:11:46.880 um because even even if you use evented callbacks or event applications and libraries inside your
00:11:52.320 your callbacks if they take too long you're still blocking everything and one nice way to do it how you can
00:11:59.440 break up those in smaller steps of work is to use em next tick
00:12:04.880 so em next tick um you give it a block and what it does is it schedules it to run in the next iteration of the
00:12:11.600 of the loop inside the main the main reactor thread and what you can do with it is for once
00:12:17.200 you can like come back from a background thread or the other thing is you can if you have like a very big piece of work that you need to be doing
00:12:23.680 like reading thousands of messages on a socket you can just like read a couple of messages and then
00:12:29.200 reschedule yourself to run in the next iteration this is an example how you can do it
00:12:34.639 with with a simple proc and a method so let's assume do something is potentially slow or like potentially
00:12:41.200 processes a lot of a lot of work so we can just re just call it um every couple of um only process a couple of messages and
00:12:48.560 then reschedule ourselves to call to be run inside the next iteration um
00:12:54.000 yeah so this is what em next tick does the other thing um is it brings you back from a background thread
00:13:00.880 and what it means um we will have we'll see in a second but um this is like one
00:13:06.639 of the of the important concept is every time you use i o like wrap it in next tick
00:13:12.560 this will ensure that that you run inside the main loop inside the main the reactor thread
00:13:19.040 there are cases where you don't want to you do this or there are cases where you want to to have to do something slowly or you know
00:13:24.880 i have to compute like the fibonacci number of 2 billion or something and it will be slow so
00:13:31.760 what you should do is wrap it in an em defer block em defer does the opposite of em next tick
00:13:37.519 so it brings you out of the main reactor thread into a background thread by default there are 20 threads
00:13:43.199 in a thread pool that you can use for it and em defer will make sure that this code that you give it
00:13:48.320 runs in such a background thread and so in our example case here we will
00:13:54.399 not block the the main loop if i wouldn't wrap wrap it in em defer we will never
00:14:00.000 see like the the the continuous printing of the of the timer because we will be blocking
00:14:05.920 the main loop because the sleep will block the loop and in this case we've wrapped it in em defer so now
00:14:11.120 um we're blocking a background thread which means the main loop can still run
00:14:16.720 so what em defer is good for is if you have long running computations if you have a long running process
00:14:22.320 if you do something that is potentially blocking the main loop wrap it in the em defer block the only
00:14:28.480 thing that you have to remember is if you're doing i o in it the io has again to be rescheduled in the main in the main thread
00:14:35.199 so uh sometimes your code will will like ping-pong between uh deferred blocks and next ticks because you you're doing computation in
00:14:41.839 a deferred block then you you then have it have having some i o so
00:14:46.880 you do it in the next tick block and then going back again with a result to a background thread
00:14:58.000 and if i would stop here um those are pretty much the basics of event machine
00:15:03.120 next tick deferring and then the typical callback stuff so this is like the the simplest
00:15:08.959 thing that you can do with event machine but um also like for me what i found myself is
00:15:14.000 like the hardest part is finding out where your code is running and
00:15:19.199 where it should be running yeah sometimes if you have um as i said a library a class that you're requiring somewhere and calling
00:15:25.760 out of event machine loop it's hard to find out now am i now in the main loop am i in a background thread where should it be
00:15:31.040 running and it took me a while to like really get the feeling where i should be going and so the main thing for you guys to remember
00:15:37.440 is i o in the main in the main uh reactor thread anything that is potentially slow in a deferred
00:15:43.920 uh thread and if you are unsure wrap it in a defer or a next tick call
00:15:51.920 but there are a couple of um syntactic sugar a couple of helper
00:15:57.040 classes that makes it a lot of easier to to not get tangled in such a like a callback code so one of a couple of those
00:16:03.759 are uh deferrables queues um channels and iterators that i want to talk about and they help you to structure code so
00:16:11.600 that you don't get tangled up in all of those code and it's very easy to make sure that you're calling the right code from the right place
00:16:18.880 so the first one is deferrable deferrable um at first it looks a little bit
00:16:24.880 complicated and it's it's hard to make out why you need it but it allows you to write your your
00:16:30.000 libraries in a way that that they feel like an event machine library and that makes them very very easy to
00:16:36.560 use what they are basically is like a mixture between a state machine
00:16:42.320 that that and uh like a simple concurrency mechanism that allows you your own code to register callbacks and execute them when
00:16:48.959 it's ready so the idea is that i can register callbacks again
00:16:54.639 just like on the built-in event machine classes so for example i can register callbacks
00:17:01.120 on the success case and i can register callbacks on an error case and
00:17:06.319 um the cool thing is that callbacks can be edited at any time yes i can register i currently create an
00:17:12.480 object where i can register callbacks for successful error cases it doesn't sound too interesting so let's yeah look at an
00:17:18.559 example this is a very not too useful example so i have a class i include the deferrable
00:17:24.000 module and then i create an instance for of it inside the run inside the m loop
00:17:29.520 and i i just adds a couple of callbacks yeah a callback for success a callback for error case
00:17:35.200 and then after five seconds i uh make this deferrable object succeed
00:17:41.360 so that my successful callback will be called so those are like the build the basic primitives that the deferrable
00:17:47.039 allows you to do so the interesting question is why do i need stuff like that yeah and the answer is to to um to build
00:17:55.200 classes where you that use again um internally callbacks in order for your user to make
00:18:01.120 it very easy to use them themselves so um let's assume i have
00:18:06.880 i have using the google spellchecker api and because i of course use the
00:18:13.039 asynchronous http library to use it if i would use it
00:18:18.080 myself i would like to to um to do a call to the to to google and then register my
00:18:24.080 callback to be running and then i pause the result and stuff like that so this class is abstracting all this logic away from us and using
00:18:31.039 deferrables i can use it like a built-in event machine library so i can i can instantiate it i can then call the
00:18:38.000 method that is doing the work as currency in our case check and then register callbacks that should be running once i get the result back
00:18:44.559 so my callback is getting all the suggestions and then printing them out so very very boring but the code feels right the
00:18:50.960 code looks like it should be looking in event machine case and it's implemented using deferrables
00:18:56.400 so the the implementation is also very simple i have my class that
00:19:02.880 includes deferrable then i have the the check method that does the actual asynchronous call to
00:19:08.080 to the http um library of the http service of google and um the interesting part is
00:19:15.919 that this that on a successful http call i called succeed on myself yeah i called succeed on the on the google spellchecker
00:19:22.720 class with the the the result that i got from the from the google servers and by doing so
00:19:29.840 all callbacks that um that the user registered on this google spellchecker instance will be
00:19:35.520 called with the with the result
00:19:40.799 and the same would be true for the error case and so on but this allows me to um so if my class uses like
00:19:47.679 very complex nested um structure of event machine calls of like a callback of callback of callback
00:19:54.320 i can i can still make it look like very nice and just like notify the user when i'm ready
00:19:59.840 without the user having to keep track of um all the all my internal state and to know where where where i'm in
00:20:06.080 like did i parse the xml already or not uh and i'm still asynchronously of
00:20:11.120 course so um i hope that you understood what like defer deferable is all about
00:20:17.840 so deferrables are just like syntactic sugar that makes it very easy for you for for you to use a class
00:20:24.240 that that uses asynchronous methods like in this case an http request
00:20:30.880 again so this is the the code that uses it so for um so as a user of this class i can
00:20:36.080 just register a callback myself i can just say if the once you get back the suggest the
00:20:41.600 suggestions from the google spellchecker api this is what i would do with them and i don't have to care how it's implemented
00:20:46.840 internally so deferrables are also heavily used by by
00:20:52.480 event machine itself and by other libraries so for example the em http request library that
00:20:58.240 we're using right here is returning you a deferrable so the http variable is a deferrable um actually
00:21:06.240 so yeah most of the event machine libraries and protocol implementations actually return deferrables and this is what's
00:21:12.240 it's nice about them is that you can just like put register callbacks on them
00:21:17.600 and be called when they're ready uh yeah another another couple of things on
00:21:23.039 deferables is they have timeouts so you can say you can give a deferable timeout if it's
00:21:29.440 not succeeding within a specified amount of seconds it will automatically fail and you can also reset them which um is
00:21:36.080 a nice a nice um way to to to like schedule work so for example
00:21:41.760 um the the em-jack library which is the beanstalkd um client implementation for event machine
00:21:49.039 they use deferrables in order for you to be able to all to like instantly use the class without even
00:21:55.600 having a connection established because you're just registering callbacks on the class and
00:22:00.640 once the connection is registered all callbacks will fire so the commands will be executed you get the result back
00:22:06.000 and if the connection should like break or fail or time out they reset the deferrable which means
00:22:12.400 all callbacks are cleared again and then you can again register ones of them and once the connection comes back uh
00:22:18.080 all commands will be flushed to the to the server so for you as a user um the
00:22:23.200 api is very simple very nice but the the um the behavior is very complex
00:22:28.240 because you're stacking up commands you you're um firing them once you get the connection back you're clearing them
00:22:33.520 when the connection is lost and stuff like that and it's very easy to implement those with deferrables so
00:22:39.360 so if you're having a more a complex class that is doing a lot of um callbacks behavior with with event
00:22:45.440 machine you should definitely um think about um using deferables in order to make the implementation a lot
00:22:50.720 nicer another thing that helps you structure your code if you if you're coding with
00:22:56.080 event machine are queues so um queues are exactly that what the name says a very simple
00:23:01.440 queue which is like in memory so inside your process um so and but the the the um the nice
00:23:08.559 thing about them is that they help you to to make sure that the code runs in the correct place
00:23:14.159 so for example um it helps you to um to be sure that you're running in the in the reactor thread so everybody can
00:23:20.960 push the queue and you can pop up it and you know the the popping part will always be in the main reactor thread and
00:23:26.240 can respond to to the messages um the queue is very simple
00:23:31.520 so you just instantiate it you can register a callback that will be run on pop and then you can push messages onto
00:23:37.919 it the important part to know is that the pop part will only be run once so if you if you want to have some kind of a
00:23:44.559 worker pattern where you continuously like execute something on items on the queue
00:23:50.640 you need um like to reschedule yourself just unlike in the example we we saw before where you're just having a proc and at
00:23:57.440 the end you're just um executing the proc again so in this case we will be now
00:24:02.840 continuously working off messages of the queue and the the um the pop part will be
00:24:08.880 inside the main loop uh inside the reactor thread so you don't have again to worry as i said
00:24:14.320 before because you usually have to worry am i running in the reactor thread or in a deferred thread where should
00:24:19.600 i be running and stuff like that so with it with a queue it's very simple so that you can schedule work from from
00:24:26.400 multiple threads don't have to care where those threads are and then work them inside the main reactor loop
00:24:38.960 another thing that helps you to structure your code are channels so channels are yeah a simple pub sub
00:24:45.440 implementation again in memory inside event machine so they are not a replacement um um for for like a real messaging um
00:24:53.520 uh application like uh mqp and rabbit and q and whatever yeah they're just in process but they help you again to
00:24:59.440 structure your code they they make sure that that um subscribers and publishers
00:25:05.919 can can like safely communicate um cross thread with with each other so the the idea is again
00:25:12.720 um i have a channel and i can just subscribe to it so i can subscribe for
00:25:18.080 multiple threads so i can subscribe out of the main thread i can subscribe out of a deferred
00:25:23.279 thread and then i can of course can push messages onto it
00:25:28.799 and as you would expect i can once i subscribe to a channel i get the messages and
00:25:36.559 and can unsubscribe again the the um the way you usually use um channels is
00:25:42.240 if you want to um have a long running deferred thread that is doing um like work that takes a
00:25:48.480 long time and you want to to be able to like communicate your your intermediate results
00:25:53.840 to the main reactor thread and do something with with them so usually um you would then just just
00:26:00.159 publish out of the of the deferred long-running thread you would publish to the queue to the channel and then you would have
00:26:07.200 inside your reactor loop you would you would respond to the messages and process them
00:26:14.240 Another very useful thing is the iterator. EM::Iterator allows you
00:26:22.559 to iterate over a set in parallel.
00:26:28.000 If you use the normal built-in iterator in Ruby: let's assume we have an array
00:26:33.279 of a couple of URLs and want to find out which is the biggest site hiding
00:26:39.279 behind those URLs. For every URL I load the page and count
00:26:44.880 the byte size of the response body.
00:26:50.320 In a sequential implementation I would have to
00:26:55.440 call every URL sequentially, wait for each, and only at the end could I find out which is the
00:27:01.440 biggest. We already learned that with EventMachine
00:27:06.640 it's a lot easier to do this in parallel, but you still have a lot of coordination work to do: you have to keep track
00:27:12.480 of the number of parallel requests, somehow store the intermediate results, and then at the end
00:27:19.279 iterate over those. This is exactly the shortcut that EM::Iterator gives you.
00:27:24.559 EM::Iterator is built-in syntactic sugar, a
00:27:29.679 wrapper to do this: you give it a collection, the number of concurrent workers that you want to use, and a proc
00:27:38.080 that will work on each individual item in the collection, and then
00:27:43.520 optionally you can give it a second proc that will work on the results once
00:27:48.640 they are all there. The same example with the iterator looks like this: I give
00:27:55.840 it the array with the URLs, I say that I want it processed by 10 parallel workers, and
00:28:01.760 then I give it a proc that specifies what to do with each individual item in the set; in our case we do the async
00:28:08.320 HTTP call. The important part is that we have to manually signal EventMachine that
00:28:13.600 we're done processing this item; that is the iter.return(result) call,
00:28:18.960 because EventMachine doesn't know when we are finished, so our callback has to signal that we finished processing this
00:28:25.600 item. The second proc gets all the responses once they are ready, so I don't block waiting for
00:28:32.159 all responses; I just register a callback again and don't have to do all the coordination work
00:28:37.360 myself: coordinating workers, instantiating the deferred threads, storing the intermediate
00:28:43.440 results somewhere, and then rechecking whether all of my URLs are done or not.
00:28:50.080 So EM::Iterator helps you to work over a range or
00:28:56.159 a set of items in parallel.
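The coordination that EM::Iterator hides can be modeled in plain Ruby with threads. `ToyIterator` and its `Slot` handle below are invented names for this sketch; the real EM::Iterator does the same bookkeeping on the reactor, without threads, but with the same shape: a `map(each_proc, done_proc)` call where each worker must signal completion via `iter.return`.

```ruby
require "thread"

# Toy, thread-based model of EM::Iterator's map(each_proc, done_proc) shape.
class ToyIterator
  # Handle passed to the worker proc; `return` mirrors EM::Iterator's
  # iter.return(result) completion signal.
  Slot = Struct.new(:results, :index) do
    def return(value)
      results[index] = value
    end
  end

  def initialize(items, concurrency)
    @items = items
    @concurrency = concurrency
  end

  # each_proc handles one item and must call iter.return;
  # done_proc fires once all results are in.
  def map(each_proc, done_proc)
    results = Array.new(@items.size)
    queue = Queue.new
    @items.each_with_index { |item, i| queue << [item, i] }
    workers = Array.new(@concurrency) do
      Thread.new do
        loop do
          begin
            item, i = queue.pop(true)  # non-blocking pop
          rescue ThreadError
            break                      # queue drained, worker done
          end
          each_proc.call(item, Slot.new(results, i))
        end
      end
    end
    workers.each(&:join)
    done_proc.call(results)
  end
end

urls = ["http://a.example/", "http://bb.example/", "http://ccc.example/"]
sizes = nil
ToyIterator.new(urls, 2).map(
  proc { |url, iter| iter.return(url.length) },  # stand-in for the async HTTP call
  proc { |results| sizes = results }             # runs once everything has returned
)
sizes  # => [17, 18, 19]
```

The `url.length` worker is a stand-in for the HTTP request plus byte count from the talk's example; the point is the two-proc shape and the explicit completion signal.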
00:29:02.240 Once you have worked with EventMachine for some time, you should
00:29:08.159 definitely start to look into fibers and em-synchrony.
00:29:13.760 Once you understand all of this and you start to write complex applications, the problem is
00:29:20.480 that it's very easy to get into callback hell. Let's assume
00:29:25.840 we have a very simple task: we want to search for a word on Google.
00:29:32.880 This is a very simple implementation of an API client for the Google search API:
00:29:38.880 we do an async request, we load the URL, and then we do a JSON
00:29:44.159 parse on the result. But what happens if we now want to
00:29:49.440 load the first result? Inside the success callback we now have to
00:29:54.559 schedule a second request that loads this first URL and then prints out the size of the
00:30:02.080 response, for example. And once we have the size we don't
00:30:07.440 want to just print it, we want to store it somewhere; let's store it in memcached. So again we have an async call, so
00:30:13.520 we have to call memcached and register a callback so that we get notified once the save is done.
00:30:21.120 So it's very easy to get callback nested in callback nested in callback,
00:30:26.240 which at some point is very hard to read, very hard to debug,
00:30:31.600 and hard to understand. Especially since I'm only showing the success case here, only the success callbacks; a good
00:30:39.120 implementation would of course have to handle all the errors in between, so you get very deep nesting
00:30:44.320 of success callbacks, error callbacks, and so on.
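The nesting problem can be seen in miniature with a toy helper. `async` here is a hypothetical stand-in, not an EventMachine API: a real client would invoke the callback later, from the reactor, but invoking it immediately is enough to show the shape, where each step can only continue inside the previous step's success callback.

```ruby
# Toy stand-in for an async client call that takes a success callback.
def async(result, &callback)
  callback.call(result)
end

log = []
async("google results") do |results|
  log << "got #{results}"
  async("first page body") do |body|   # second request, nested
    log << "size #{body.length}"
    async("memcached ok") do |_ack|    # store the size, nested again
      log << "stored"
    end
  end
end
log  # => ["got google results", "size 15", "stored"]
```

Add an error callback at each of the three levels and the pyramid roughly doubles, which is exactly the readability problem fibers address next.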
00:30:49.360 What you can do in order to avoid this is to use fibers. Fibers are part of Ruby 1.9 and they
00:30:54.960 are also implemented in JRuby and Rubinius, and what they allow you to
00:31:01.120 do is to make this code look like it's synchronous, so it's a lot
00:31:07.200 easier to read and understand, but the behavior is the same as
00:31:12.320 before; it just reads nicer, which means it's nicer to understand and maintain.
00:31:19.039 What you do, if you use fibers to make this code look synchronous,
00:31:26.240 is basically resume the fiber in a
00:31:32.880 callback, and yield before you get your response. It sounds complicated, but as you will see
00:31:39.120 in a minute, using it is very simple. Done manually it would look like this: you instantiate a
00:31:44.480 new fiber, then you do the asynchronous call, and
00:31:50.320 as the callback you register a resume of your fiber, which means "I want to take back control
00:31:58.480 in the fiber now." What the fiber allows you to do is voluntarily yield the control flow,
00:32:04.799 in contrast to threads, where there is a fixed amount of time that each thread gets to run,
00:32:10.000 after which it is put to sleep and another thread runs. With fibers you can schedule
00:32:15.519 when those switches happen. So in our case, on the last line,
00:32:20.960 we yield control back to the fiber that
00:32:26.640 controls us, or to the main thread, because we're waiting for the response, and inside the
00:32:34.640 callback we say that we want to resume, because now we got what we waited for. So the actual line of code that does this is the
00:32:42.640 last one. This is the complex implementation; if you
00:32:48.640 hide it in a method, it looks like this. Let's say we
00:32:53.919 want to get a URL: we put all the fiber
00:32:59.360 code inside the method so that the actual API for the user looks a lot nicer; it just says content
00:33:05.679 = get(url). What it will do is behave exactly like our asynchronous example before,
00:33:12.880 because as soon as you call the get method, your fiber will yield to the
00:33:19.519 fiber that controls it, or to the main thread, and this code will suspend
00:33:25.279 until the result of the HTTP call is there.
00:33:30.399 Then your code is in control again, which means you can process the result. The line
00:33:37.679 content = get(url), and whatever other processing follows, looks like synchronous code, but in
00:33:43.279 reality you are always yielding control to the main fiber and getting it back once a result
00:33:48.880 is there. Because it's very ugly to have all of
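The manual pattern just described can be sketched in plain Ruby, without EventMachine. `fake_http_get` is a hypothetical async call that parks its callback in a pending list so a pretend "reactor" can fire it later; `get` registers a callback that resumes the current fiber, then yields until that callback delivers the body.

```ruby
pending = []

# Hypothetical async call: remembers its callback so the "reactor"
# (the pending list) can fire it later with a response body.
def fake_http_get(url, pending, &callback)
  pending << proc { callback.call("<html>#{url}</html>") }
end

# Synchronous-looking wrapper: register a callback that resumes this
# fiber, then yield; resume(body) makes Fiber.yield return the body.
def get(url, pending)
  fiber = Fiber.current
  fake_http_get(url, pending) { |body| fiber.resume(body) }
  Fiber.yield
end

result = nil
Fiber.new do
  content = get("http://example.com/", pending)  # reads like sync code
  result = "got #{content.length} bytes"
end.resume

pending.shift.call  # the "reactor" delivers the response, resuming the fiber
result              # => "got 32 bytes"
```

The body of the fiber block reads top to bottom like blocking code, yet control actually leaves it at `Fiber.yield` and only comes back when the callback fires, which is the same flow as the nested-callback version.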
00:33:54.000 those fibers everywhere in your code; to abstract away all of this functionality there's a very nice library called
00:33:59.200 em-synchrony, by Ilya Grigorik, that does exactly all of this for you.
00:34:05.600 It already wraps the built-in clients
00:34:10.720 of EventMachine to do exactly this. Instead of EM.run you say EM.
00:34:17.440 synchrony, and you can write the code as if it were synchronous, but in reality it
00:34:24.480 will do exactly the same callback-and-yield work that we did
00:34:29.919 manually before. The difference is that you don't have to see it, and you don't have to
00:34:35.440 maintain this deeply nested async code.
00:34:43.280 The cool thing is, as I said, it's implemented for
00:34:48.560 the built-ins and for most of the big EventMachine libraries:
00:34:53.679 for mysql2 and ActiveRecord it works out of the
00:34:58.720 box, for em-http-request, for a couple of the CouchDB
00:35:05.440 mapper libraries, and for em-jack. So you don't have to do it manually: just by requiring em-
00:35:12.240 synchrony and calling EM.synchrony instead of EM.run you can use the
00:35:17.920 synchronous-looking code, but in reality it runs asynchronously.
00:35:26.000 What you can do with this is go a step further and use, for example, Goliath, which is an evented web framework by the same
00:35:32.800 people, Ilya and PostRank, that leverages fibers and gives you a
00:35:39.280 Rack API on top of it, so you get a web framework that is evented
00:35:44.800 in its implementation but looks like it is synchronous. In this case we have a very simple
00:35:50.320 application that just responds with "hello world",
00:35:55.760 but we can of course have one that is a little bit more complex. In our case we're doing
00:36:03.040 what looks like a blocking MySQL query in it, but the cool thing is, it's not blocking.
00:36:08.960 If this were a Rails action, the
00:36:14.320 Rails process would be hanging for five seconds; with Goliath you can just hammer it
00:36:20.000 in parallel and it will still respond to you. What Goliath is
00:36:25.599 doing is implicitly wrapping every action call in a fiber, so that
00:36:30.880 whenever you do something that is blocking, whenever you hand control back because
00:36:37.040 you're waiting on a response, it will yield and another fiber can run. So in
00:36:42.160 this case I can still answer other requests while waiting for this five-second block. In contrast, if you used Rails here,
00:36:48.160 Rails would block and no other requests would be handled, because we're waiting for this MySQL
00:36:53.200 query.
00:36:59.839 So if you have an application where you need very high throughput and a lot of small requests, and you want to do a lot
00:37:05.839 of them in parallel, you should definitely check out Goliath.
00:37:12.720 That's basically it.
00:37:18.079 To wrap up, what I hope you definitely remember from this talk is that
00:37:24.800 EventMachine is a great library, but you always have to make sure, as a colleague of mine,
00:37:30.000 Matthias, keeps saying: do not block the event loop. He's constantly yelling this, but
00:37:35.280 unfortunately it's very easy to do, so you have to be very careful not to block it. And sometimes,
00:37:40.640 even if you do everything correctly, if you use only evented libraries and you make sure to use deferrables and
00:37:48.000 things like that, it's still very easy to break it just by doing too much work, because you didn't test it with thousands
00:37:53.760 of messages being processed at once.
00:37:58.800 So you always have to remember not to have any code that runs too long: limit yourself, do an EM.next_tick,
00:38:06.640 yield, and try again in the next iteration. The two important things
00:38:13.520 are to know what EM.next_tick and EM.defer do, and to know where to use one or the
00:38:18.640 other, which means I/O and very small, fast operations in the main reactor loop with next_
00:38:24.160 tick, and anything that is potentially longer-running in background threads with defer. And
00:38:31.440 all the libraries, like the iterator and the queues and channels, help you to structure your code like this so that
00:38:36.800 it's a lot easier to use. And I think with fibers
00:38:43.760 and em-synchrony it becomes a lot easier still, and you will see more and more applications
00:38:49.040 using EventMachine behind the scenes while the code looks synchronous, thanks to fibers.
00:38:56.560 Are there any questions? Yes, please. [Audience question: any
00:39:02.560 advice about testing these applications?] Yes, testing. Testing
00:39:07.920 can be a little bit of a challenge, because testing means you have to have a running loop.
00:39:15.440 What we usually end up doing is test all the domain logic, which
00:39:21.680 is hopefully hidden well away in classes, in normal unit tests that don't rely on the
00:39:26.880 event loop running. There we sometimes just stub, for example, defer calls or timer calls to
00:39:33.839 fire immediately. Then, in the tests that verify you're calling the correct methods
00:39:39.359 of the correct classes in the correct cases, we
00:39:44.560 actually fire up the event loop, which means that usually you
00:39:50.400 would set up assertions to run with a timer. For example, my test would
00:39:58.320 wrap everything in EM.run, then you would call the methods under test on
00:40:04.160 the instantiated class, call the message that you're interested in, and have a timer scheduled
00:40:09.200 to run in a second or so that contains the assertions. With an
00:40:15.920 approach like this you can test most of the stuff.
00:40:21.040 For more complex interactions, what we do is we have
00:40:27.440 a real integration suite that actually exercises the system from the outside, so you're observing the behavior: for
00:40:34.240 example, are those messages processed correctly, and do I get the expected thing in my database?
00:40:40.160 There you're not interested in the implementation of the agent or the
00:40:46.079 daemon that you're using; you're just testing its responses. So I fire a message into RabbitMQ and I
00:40:52.160 say: five seconds later there should be something in the database that looks like this,
00:40:57.920 and how it's implemented, or how it got there, I don't care. With an approach like that, which
00:41:02.960 is the broader, integration-testing approach, we're pretty happy; it works
00:41:09.280 very nicely. The only drawback is of course that at some point it takes a long time to run your whole test suite.
00:41:16.160 But if you nicely separate the domain logic, and heavily use, for example, deferrables,
00:41:22.079 which make it very easy to have a clear interface that you can mock, it's very easy to have unit tests for the domain models and for
00:41:28.720 all the domain logic, and then have a very lightweight test suite that verifies that the EventMachine setup code does the
00:41:35.040 right thing in the right places. Yes, another question up here?
00:41:45.920 [The question was: how do we handle exceptions?] Unfortunately, error handling in general
00:41:51.920 in EventMachine is not the greatest, because the code is sometimes deeply nested,
00:41:58.560 and you have to make sure that you're not killing your loop by having an
00:42:03.680 exception that you're not catching. In general, the code that we run inside EventMachine is very narrow, so what we tend to
00:42:10.079 do is have small agents that have a very narrow functionality
00:42:15.200 or responsibility and that ping-pong the work through a message bus, for example, so
00:42:21.680 that the part we have to test and handle is very small. Those agents then have a pretty
00:42:28.480 rough, general error handling, where in the worst case the agent dies and we respawn
00:42:34.480 another one. In general, especially if you use fibers, the error
00:42:41.760 handling becomes a lot nicer, because the code is not as deeply nested as
00:42:46.800 before, so you can check return values and so on. So we started to use fibers more
00:42:54.079 and more just for the nicer error handling. But error handling is still a case that is not very great,
00:43:00.480 especially if you get an exception from somewhere deep down in the reactor, because, for example, a socket closed.
00:43:07.520 An example is if you use a library somewhere that is not thread-safe, for
00:43:13.599 instance, and you call it from different threads: you can get a weird exception out of
00:43:18.640 the reactor because one thread tried to write on a closed socket or something like that. So
00:43:24.000 this is definitely a case you have to take care of; it's not so easy to do
00:43:29.760 with EventMachine if you have very deeply nested callbacks, and you have to take care of error handling on
00:43:35.440 all levels. So my advice would be to limit the nesting and slice responsibilities: have very
00:43:40.640 narrow responsibilities so that you have defined interfaces. Yes, another question?
00:43:49.440 No other questions? Okay, thank you very much.