Summarized using AI

The Celluloid Ecosystem

Tony Arcieri • November 01, 2012 • Denver, Colorado • Talk

Summary of "The Celluloid Ecosystem" by Tony Arcieri

In this talk, Tony Arcieri discusses Celluloid, a concurrent object framework for Ruby designed to simplify the development of multithreaded applications. He introduces the various components of the Celluloid ecosystem, including the Reel web server and other subprojects aimed at handling different concurrency and networking needs in Ruby programming.

Key Points Discussed:

  • Importance of Multithreading: Arcieri emphasizes the significance of multithreading in modern programming, especially with the increasing number of CPU cores. With hardware no longer focusing solely on increasing clock speeds, utilizing multiple cores becomes essential for maximizing performance.
  • Celluloid as a Concurrency Framework: He likens Celluloid to "threads on Rails", providing Ruby developers with a robust framework for building multithreaded programs without the previous complexities associated with concurrency in Ruby.
  • Core Features of Celluloid:
    • Active Objects: Arcieri mentions the concept of active objects which can run in threads, facilitating a more natural approach to concurrency in programming.
    • Actor Model Influences: Many ideas stem from the actor model, traditionally used in Erlang, which inspires Celluloid's architecture allowing objects to communicate via message passing.
  • Real-World Applications: He provides examples of projects that successfully implement Celluloid, such as Sidekiq and Adhearsion, illustrating how it positively impacts concurrent programming in Ruby environments.
  • Error Handling with Supervision: One of the significant advantages of using Celluloid is its built-in error handling mechanisms via supervisors, allowing developers to manage failures effectively and maintain application robustness.
  • Subprojects in the Celluloid Ecosystem: Arcieri discusses several subprojects:
    • DCell: Facilitates distribution across networks, enabling connected Ruby virtual machines to communicate efficiently.
    • Celluloid::IO: An alternative to event-driven libraries like EventMachine, designed to simplify input/output operations.
    • Reel: A web server built on Celluloid::IO, optimized for handling WebSockets and stream connections with lower latency and overhead.
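The supervision point in the list above can be made concrete with a minimal sketch. It uses only the Ruby standard library and is not Celluloid's supervisor API; `TinySupervisor`, `FlakyWorker`, and `max_restarts` are hypothetical names invented for illustration. The idea shown is "let it crash": the worker has no error handling of its own, and the supervisor replaces a crashed worker with a fresh one.

```ruby
# Hypothetical sketch of "let it crash" supervision using plain threads.
# This is NOT Celluloid's implementation or API.
class TinySupervisor
  attr_reader :restarts

  def initialize(max_restarts: 5, &factory)
    @factory = factory           # block that builds a fresh worker
    @max_restarts = max_restarts
    @restarts = 0
  end

  # Run a worker in its own thread. If the thread dies with an error,
  # build a brand-new worker (clean state) and try again.
  def run
    begin
      worker = @factory.call
      thread = Thread.new do
        Thread.current.report_on_exception = false # keep stderr quiet
        worker.call
      end
      thread.value # re-raises here if the worker thread crashed
    rescue StandardError
      @restarts += 1
      retry if @restarts <= @max_restarts
      raise
    end
  end
end

# Hypothetical worker: the first two instances crash, the third succeeds.
class FlakyWorker
  @created = 0
  class << self
    attr_accessor :created
  end

  def initialize
    self.class.created += 1
    @id = self.class.created
  end

  def call
    raise "worker #{@id} crashed" if @id < 3
    "worker #{@id} finished"
  end
end

supervisor = TinySupervisor.new { FlakyWorker.new }
result = supervisor.run
```

A real supervisor would also restart linked components together, which this sketch leaves out.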

Conclusion and Takeaways:

Arcieri concludes his talk by encouraging developers to leverage Celluloid and its components for building scalable and maintainable concurrent applications in Ruby. He highlights the ease of integration of new projects into the Ruby ecosystem and invites the audience to explore the possibilities offered by Celluloid and its subprojects. This framework, along with its tools, aims to bring the future of multithreaded programming to Ruby developers effectively.

Date: November 01, 2012
Published: March 19, 2013
Announced: unknown

Celluloid is a concurrent object framework for Ruby that takes the headache out of building multithreaded programs. However, it's also an ecosystem of subprojects including the Reel web server for WebSockets, Celluloid::IO for evented sockets, Celluloid::ZMQ for ZeroMQ sockets, and DCell for building distributed Ruby programs. This talk will examine all of these components and help you decide which ones to use in your multithreaded Ruby programs.

RubyConf 2012

00:00:15.320 I think I'm actually ready to go here uh so I am Tony Arcieri and this is the
00:00:21.519 Celluloid ecosystem so really quick how many of you have heard of Celluloid
00:00:26.960 before you came to Ruby everybody in the room practically that's pretty awesome
00:00:32.079 all right so uh I work for LivingSocial on our site reliability team and we're
00:00:38.680 hiring just in case you didn't know that uh if you don't know Mr. Chad
00:00:43.800 Fowler so I like to think of Celluloid as threads on Rails so for a long time in
00:00:49.640 Ruby there were no sort of abstractions for building multithreaded programs kind
00:00:54.920 of like before Rails there weren't really good uh web frameworks for building uh like web applications so I
00:01:01.960 like to think Celluloid is like the quintessential uh concurrency framework
00:01:07.360 for Ruby so uh what you might be wondering is why should I care why should I use
00:01:13.920 threads why are threads good uh the main reason is multicore right so in the past
00:01:21.320 basically we got speed for free we could wait for all the hardware people to just keep cranking up the clock right there
00:01:28.159 and uh you would just sit back and CPUs would get faster and therefore your code would get faster also in the past
00:01:35.640 threads used to be really expensive so uh Linux for example it took them a long
00:01:40.880 time to get a constant time scheduler that could handle as many threads as you wanted and have no overhead as you added
00:01:48.040 more threads and that whole situation has changed so the hardware designers have
00:01:53.880 hit the power wall they just can't keep cranking up the clock speed like they used to but what they can do is keep
00:01:59.920 adding CPU cores so what we're seeing now is uh exponential growth in CPU
00:02:06.079 cores uh operating systems are getting good schedulers and threads you know are
00:02:11.920 getting cheap in the same way that uh Performance used to be cheap right you could just sit back and wait for more
00:02:19.319 and more CPU cores so I strongly believe that multicore is the future uh this graph
00:02:26.080 definitely illustrates that pretty well uh see speed not going up but the number of
00:02:32.720 CPU cores are going exponentially up so maybe you're thinking I can just
00:02:38.440 throw more and more VMs at the problem like I'll get multi-core performance that way
00:02:44.040 right uh this wastes RAM uh this is a complicated topic but I think Mike
00:02:49.959 addressed it in his talk if you happen to see that the Sidekiq talk he did a very good job describing why that's a
00:02:56.480 problem there's also a serialization penalty so if you have uh different
00:03:01.840 concurrent parts of your system talking to each other and they're having to do IPC there is a definite penalty to that
00:03:10.360 uh I would definitely recommend checking out this article uh by Kyle AER
00:03:17.760 there so my question to you is when we have 100 core CPUs which will probably be before the end of the decade here uh
00:03:25.360 are you going to run 100 Ruby VMs or are you just going to run one I think it makes a lot more sense to just run one at a
00:03:33.360 time so there are uh so Celluloid isn't a science experiment there are people
00:03:38.720 actually using it in the real world so a couple of projects here they're using uh Celluloid Sidekiq uh if you saw Mike's
00:03:46.560 talk it's a badass job execution engine there and also
00:03:51.879 Adhearsion uh so Celluloid is kind of inspired by Erlang uh Erlang was
00:03:57.040 originally created to do telephony but there's a pretty cool telephony framework in Ruby called Adhearsion and that's using
00:04:04.959 Celluloid to kind of do the same stuff they uh were originally doing in Erlang which was command and control of
00:04:11.879 uh telephone calls so I definitely recommend checking out both of these projects they're both built using
00:04:19.040 Celluloid so Celluloid is a combination of object-oriented programming and the actor model uh let me show you this
00:04:27.919 quote it's a little bit cut off there maybe I'll read it to you because it's cut off uh but this is by Alan Kay it's
00:04:35.240 like one of my favorite quotes ever he says I thought of objects being like biological cells and/or individual
00:04:42.199 computers on a network only able to communicate with messages so when I see this quote I
00:04:49.320 think of objects as being a really natural way to do concurrency right so we have multiple servers out there
00:04:55.720 talking via network packets uh we have cells exchanging chemicals all those things are running in parallel
00:05:02.039 right like multiple servers multiple cells all that it's all happening in parallel and they're communicating using
00:05:08.960 a messaging system so I think objects are a good way to do
00:05:14.440 concurrency so what Celluloid does is combine these object-oriented tools that we're all familiar with uh classes
00:05:22.520 inheritance is a specifically uh important issue in concurrent programming uh there's
00:05:29.280 actually a whole concept of the inheritance anomaly where basically if you build concurrent programs and use
00:05:35.360 inheritance there's all these places where inheritance leaks so if you can build a system that does inheritance
00:05:41.520 well I think uh inheritance and concurrency well I think you're doing something right uh so yeah and messages
00:05:48.319 are a very fundamental concept in object-oriented programming then we have this whole other slew of uh concurrency
00:05:55.319 tools right they're completely different from the tools we ordinarily use to structure programs so I think these
00:06:02.000 really need to be combined into a single universal abstraction the idea is the idea of an
00:06:09.479 active object so uh you have normal passive objects but you can also have
00:06:15.160 objects that are actually running inside their own thread and uh particularly the
00:06:21.039 idea is these objects are built on the actor model which I'll go into in a bit later uh so I call this abstraction a
00:06:28.759 cell uh you'll see Celluloid talk about in the documentation I talk about actors uh
00:06:36.240 really Celluloid is like a [layer] on top of actors so I like using cell to kind of
00:06:41.759 differentiate that from the typical actor you might find in a language like Erlang so Erlang is definitely one of my
00:06:50.039 major inspirations for Celluloid uh these guys were into concurrency before it was
00:06:56.520 popular uh it's Joe Armstrong the creator of Erlang there but uh basically everything all all the major
00:07:04.199 ideas I got for Celluloid uh came out of Erlang and the central idea is the
00:07:10.759 actor model the actor model may sound complicated but I think it's actually pretty simple so you've just got these
00:07:17.879 things like they can be any type of computational primitive uh it's easy to
00:07:23.199 think of them as like a thread or something like that right and each of them has an address and if you have their
00:07:29.000 address you can send a message uh and actors can create other actors and that's really
00:07:35.039 all there is to the actor model I think it's pretty simple so I'm certainly not the first
00:07:41.280 person who has tried to put the actor model together with object-oriented
00:07:46.440 programming in fact uh pythons did it uh
00:07:51.680 so there was a very similar framework that was developed as a uh sort of like
00:07:57.639 doctoral research project in Python called ATOM and when I found
00:08:03.280 this I'm like holy crap these guys created Celluloid like you know over a decade before I did uh they created it
00:08:10.199 in 1997 which was kind of a problem for them I think because computers really
00:08:16.159 sucked back then you know like here's a 233 MHz Pentium with like 32 megs of RAM
00:08:22.479 awesome uh you know so I think they were like really ahead of their time and the big
00:08:28.879 problem they were trying to solve back then was how do you do client-server stuff right you've got state on the server you want
00:08:35.640 to have a user interface for the client how do you put those two together and the web came along and solved that whole
00:08:41.599 problem so really people just kind of stopped uh looking into this I think
00:08:47.440 it's sort of this forgotten approach to concurrency like there was tons and tons of research into combining the actor
00:08:54.120 model and object-oriented programming in the late '80s and early '90s and the web
00:08:59.640 came along solved the client-server problem and that all went away but the thing is now we got multicore
00:09:06.720 computers so I think it's time to come back and re-evaluate this whole
00:09:12.440 concept so how does Celluloid work I've got a little example here uh this
00:09:18.000 actually comes out of a RailsCast uh about Celluloid I strongly recommend you watch that but basically we got a normal
00:09:26.040 Ruby class here it's got a method there the only uh the only difference between
00:09:32.040 this and normal Ruby class is we include Celluloid and that promotes any object
00:09:38.959 of this class into a concurrent object just simply by including
00:09:44.200 Celluloid so when we create a new instance of this class what we get back
00:09:49.480 isn't just an object it's actually a Celluloid actor that kind of wraps that
00:09:54.680 object so uh first thing I'm going to talk about here is synchronous calls
00:09:59.880 sync calls work just like your normal Ruby method invocations so you can call that launch
00:10:06.040 method there what it's going to do is the little countdown if you remember it had to sleep and then it's going to
00:10:12.600 print that blastoff all right so synchronous calls so basically what's going on here whenever you send a method
00:10:20.920 to that little handle you get back it's actually a proxy object it's going to translate that into an actual message
00:10:28.480 that's sent to this thread in a fully thread synchronized way so those little chevrons there are uh actual
00:10:36.360 objects that are being passed back and forth between threads uh so everything synchronizes around those Celluloid
00:10:42.959 mailboxes there so it gets the request it processes it and it sends the result
00:10:48.519 back so where things start to get interesting is if you do asynchronous calls so you can do uh async.launch
00:10:58.680 uh so if you're if you've seen Celluloid before this syntax might be new to you
00:11:04.680 uh this was introduced in Celluloid 0.12 which is the latest major
00:11:10.279 release so the main complaint I've gotten about Celluloid and its API was
00:11:16.360 that Celluloid used to hijack bang methods to be uh asynchronous uh so people got really mad
00:11:23.560 at that and I kind of agree with them so in Celluloid 1.0 I'm getting rid of that
00:11:30.040 old syntax and everything will use the async dot style syntax
00:11:36.880 so awesome so when you call this async.launch what's going to happen is
00:11:42.399 that's going to return immediately but even though it returned immediately that method's still going to
00:11:47.880 run it's kind of running there in the background right so async calls are just straight
00:11:53.959 through you're just sending a message to the object and it's going to process it whenever it gets around to it
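The mailbox mechanics described above (a proxy turns a method call into a message, the actor's thread processes it, and a synchronous caller waits for the reply) can be sketched with nothing but the standard library. This is a minimal illustration under stated assumptions, not Celluloid's implementation: `TinyActor`, its `call` and `async` methods, and the details of `Launcher` are hypothetical stand-ins for the talk's example.

```ruby
require "thread"

# Hypothetical sketch of an actor: one thread plus one mailbox (a Queue).
# This is NOT how Celluloid is implemented; it only mirrors the idea.
class TinyActor
  def initialize(target)
    @target  = target
    @mailbox = Queue.new           # thread-safe queue standing in for a mailbox
    @thread  = Thread.new { run }  # the actor's own thread
  end

  # Synchronous call: send a request, then block until the reply arrives.
  def call(meth, *args)
    reply = Queue.new
    @mailbox << [meth, args, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  # Asynchronous call: enqueue the request and return immediately.
  def async(meth, *args)
    @mailbox << [meth, args, nil]
    nil
  end

  # Ask the actor thread to finish queued work and exit.
  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  # Process messages one at a time, in arrival order.
  def run
    loop do
      msg = @mailbox.pop
      break if msg == :stop
      meth, args, reply = msg
      begin
        result = @target.public_send(meth, *args)
        reply << [:ok, result] if reply
      rescue StandardError => e
        reply << [:error, e] if reply
      end
    end
  end
end

# Stand-in for the talk's launcher class: counts down, then blasts off.
class Launcher
  attr_reader :log

  def initialize
    @log = []
  end

  def launch
    3.downto(1) { |n| @log << n; sleep 0.01 }
    @log << "BLASTOFF!"
    :launched
  end
end

launcher = Launcher.new
actor = TinyActor.new(launcher)
actor.async(:launch)          # returns at once; countdown runs in the actor thread
result = actor.call(:launch)  # blocks until the queued launch and this one finish
actor.stop
```

Because the mailbox is first-in first-out and a single thread drains it, the two launches never interleave, which is the synchronization property the talk attributes to mailboxes.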
00:12:00.639 and async methods give you this sort of easy parallelism right so we can create two of these
00:12:06.880 launchers and call .async.launch on both of them what's going to happen is each
00:12:12.800 of those are going to run in parallel so you may be wondering like
00:12:20.199 okay I can call method async what if I actually care about the return value you know what if I want to know what it is
00:12:27.320 uh so there's a separate feature called futures I think a good metaphor for this is like calling ahead to a restaurant to
00:12:33.920 order your food so when you show up it's ready and you're not going to the restaurant and ordering your food and
00:12:39.160 waiting right it's got another example here this
00:12:44.839 is your basic non-closed-form like Ted Dziuba style uh Fibonacci function there
00:12:51.920 uh so what we're going to do is call a future on that and that future is going
00:12:57.800 to return immediately just kind of like that async method right it's going to give you the sort of placeholder object
00:13:03.600 you can use to get the value so you can call value on that it's going to block until it's
00:13:10.800 complete then it's done it's going to give you the result so futures are kind of a more
00:13:17.360 complicated version of a synchronous call so basically you have an extra object there to wait for the
00:13:25.399 value so putting this all together here we have pools so pools let you uh create
00:13:31.760 a thread pool to do work so uh that original class I was using in the
00:13:37.240 futures example instead of calling new you can call .pool instead and that
00:13:43.480 will give you a pool of in this case 16 uh threads that you can call
00:13:49.959 into and you don't even have to specify that by default it will give you one actor per CPU so yeah it will just
00:13:57.839 automatically scale to however many CPU cores you have so the general pattern you're going
00:14:04.360 to want to use when you're uh doing Futures if you want to compute a bunch of stuff in parallel and then get the
00:14:10.199 results back uh so we can map across a bunch of numbers here get a future for
00:14:16.240 each of them and then we can map across the result of that and get the value so that first map is going to kick off all
00:14:22.480 the computation uh it's going to do that in parallel at least if you're on a VM like
00:14:27.639 JRuby or Rubinius uh which would be my recommended VMs for
00:14:32.680 using Celluloid simply because of the GIL on MRI but you can still do parallel IO there but basically this is
00:14:40.160 going to kick off that computation and then when you map across those values you're going to get all of them back at
00:14:46.040 the same time so uh something that's really
00:14:51.279 difficult to deal with if you've uh built multi-threaded programs in Ruby yourself is what you do if any of your
00:14:57.560 threads crash so basically you can write a multithreaded program you kick off a thread and it crashes and then
00:15:03.680 your program is like basically broken unless you have some sort of crash handler so this is again where I look to
00:15:11.040 Erlang for the solution to this problem and the basic idea is uh
00:15:18.639 yeah so uh your thing crashed you just restart it right so Celluloid builds in
00:15:25.600 uh fault tolerance so Celluloid has this idea of supervisors and supervision
00:15:31.360 trees so you can model your whole application as a hierarchy of components
00:15:37.120 uh some of them may crash and when they do uh the basic philosophy is just let them crash right you don't do a whole
00:15:43.920 bunch of error handling you don't try to anticipate all the possible errors ahead of time uh you just let it crash you can
00:15:50.519 actually link interdependent components together and ensure they all restart together in a clean
00:15:57.519 state so this talk is called the Celluloid ecosystem uh I just wanted to give you a crash course on Celluloid there but what
00:16:05.160 I actually want to talk about are these three sort of Celluloid subprojects uh which there's what each of
00:16:13.120 those are so there's DCell uh it's distribution across a network so you can
00:16:18.240 have several VMs running Celluloid talking to each other uh Celluloid::IO
00:16:23.319 this is kind of an alternative to EventMachine I'd like to think uh it does evented IO but it gets you out of
00:16:29.360 callback hell basically so the whole API is synchronous and finally I want to
00:16:34.839 talk about Reel which is a web server I've written built on Celluloid::IO
00:16:40.440 so first of these is DCell which is distributed Celluloid uh if you've been
00:16:46.120 tracking this project you might notice it's uh been having a little bit of trouble the build is still failing uh I
00:16:54.000 haven't gotten the linking I was talking about for the fault tolerance uh working
00:16:59.079 in a distributed scenario based on the latest version of Celluloid uh so it is
00:17:05.000 a little bit bleeding edge uh I have pushed a version of DCell to RubyGems
00:17:11.439 you can install with --pre so you can grab this pre-release of DCell which has
00:17:17.160 everything working as far as I know except the linking so if you want to play around with it you can install it
00:17:22.600 straight from RubyGems the basic idea is that each of these cells is a sort of service that
00:17:28.640 you should be able to expose to the network if you so
00:17:34.679 desire uh so this is built on top of ZeroMQ everybody heard of ZeroMQ
00:17:40.360 probably uh okay so ZeroMQ is just a really cool way to do uh buffering of
00:17:47.000 messages it's basically a uh brokerless message queue uh DCell maps
00:17:53.000 everything on to push and pull sockets if you're actually familiar with ZeroMQ there and how it works
00:17:59.400 and uh I built a separate gem called Celluloid::ZMQ which is actually built on
00:18:04.880 top of another gem called ffi-rzmq uh by Chuck Remes it's pretty
00:18:10.919 cool so what would you use DCell for uh here's some basic use cases so the main
00:18:17.320 one is I've talked to a lot of uh like Opscode people about using this for you know they have like uh their Opscode
00:18:24.240 agent I mean I still really like this idea of having agents on systems so
00:18:29.440 you can just tell them what to do and they go do it you still like the Capistrano
00:18:35.280 approach of like SSH into every box uh it seems a little silly to me uh I I
00:18:42.159 want this to be like a cool solution for building service-oriented architectures I work at LivingSocial and we do that I
00:18:48.640 wouldn't actually recommend going out and doing that right away but that would be the end goal uh I think would be
00:18:55.200 pretty cool also uh asynchronous background job processing so um you know I think it
00:19:02.280 would be really cool to have a system somewhat similar to Sidekiq or maybe Sidekiq itself that uh can do like
00:19:09.320 leader election type of stuff so you could actually uh you know have any part
00:19:14.799 of the entire system fail and have no single point of failure in a distributed uh background job
00:19:21.600 system and here's some diagrams of how we build applications today uh I think
00:19:28.280 Uncle Bob had a post with a very similar diagram to this and basically when I
00:19:34.440 look at this it's like you're doing your work in triplicate right you're building a REST client to a REST server and
00:19:41.360 really all that's trying to do is expose a domain object that already knows everything you want to do so if you have
00:19:46.679 Ruby talking to Ruby in this case this seems like basically this isn't DRY right
00:19:52.360 you're you're tripling the work you actually need to do to build a service I think it should really work
00:19:59.280 more like this where you can just talk directly to those domain objects across the
00:20:04.320 network here's another uh one of my favorite quotes this comes from Mr Steve
00:20:09.640 Jobs there uh so NeXT had a system somewhat similar to CORBA and
00:20:16.280 SOAP I mean I still I I think this is a really good idea I think Steve Jobs was
00:20:23.200 doing a pretty good job of explaining it here but this dream just has not been quite realized yet right like so
00:20:30.120 distributed objects in practice have largely been a failure here here's a big
00:20:35.840 list of uh stuff we probably all universally revile I mean I I like DRb
00:20:41.360 DRb is cool but nobody actually really uses it I think in practice I mean a few
00:20:46.840 people do but not really so my question is like why why why did all of these uh distributed
00:20:53.760 object frameworks fail and uh having gotten a little bit of experience in
00:21:00.240 distributed systems programming the main thing I've learned is you need asynchronous protocols so if you're
00:21:06.880 building something like say Paxos which is a distributed consensus protocol you need an asynchronous protocol to build
00:21:13.640 this on top of and all those uh protocols in my last slide are synchronous so you just can't build
00:21:20.159 systems like this so they're not built on the actor model the actor model is asynchronous right
00:21:26.400 it's all built around sending messages so I think uh I think the actor model is
00:21:32.919 pretty awesome I think the actor model can actually fit all these distributed systems patterns and even more awesome
00:21:41.000 is it gives you this sort of unifying abstraction so you want to build concurrent programs you want to build
00:21:46.240 distributed programs you want to move little parts of the system around you want to take one that's running in the same VM put on a different server that
00:21:53.480 should be really easy to do and this isn't just a dream it has
00:21:58.960 been pulled off so distributed erlang has actually been successful at
00:22:04.000 doing this unlike any of those other systems I think so several things are
00:22:09.600 built on the uh distributed Erlang protocol uh Riak the distributed
00:22:15.159 database comes to mind uh if you're familiar with Boundary they built their entire system around uh you know an
00:22:22.480 implementation of distributed Erlang on the JVM called Scalang which is pretty
00:22:27.679 cool uh they might have a few gripes with distributed Erlang I think it makes a really good
00:22:32.880 command and control protocol uh same with Celluloid right you want to use this for command and control you don't
00:22:38.640 want to use this for actually streaming like tons and tons of data around but for command and control I think this
00:22:44.840 makes a ton of sense so here's a quick example of
00:22:50.360 DCell uh so if you grab that pre-release gem uh thing is you're going to
00:22:55.559 need ZooKeeper uh ZooKeeper maybe not everybody's favorite
00:23:01.559 tool in the world so well let me just talk about ZooKeeper a little bit uh
00:23:06.600 ZooKeeper provides a total ordering of events in a distributed system so this is actually a very hard thing to do I
00:23:12.919 mean this is all ZooKeeper actually does it gives you order in a distributed system so effectively ZooKeeper's a way of
00:23:19.799 doing transactions and it's doing it in a fault-tolerant manner so what sort of stuff do you need
00:23:25.960 that for uh DCell uses it for its node registry so it's knowing where all the
00:23:31.799 uh nodes in your distributed network are uh also if you have global data DCell
00:23:37.559 supports this so if you have like configuration data you want to share with all the nodes uh it will store that
00:23:43.720 for you you can do distributed locks and uh this isn't like a it does leader
00:23:50.760 election this isn't really built in but you can build your own uh fault tolerant
00:23:56.120 leader election protocols on top of ZooKeeper so I didn't do that myself
00:24:01.840 fortunately there's this really cool ZK gem uh that will do a lot of stuff for
00:24:07.720 you uh particularly it implements a uh leader election protocol which doing
00:24:14.760 that is somewhat non-trivial so I'm glad uh you know
00:24:19.840 other people are solving some of these problems for me so installing ZooKeeper
00:24:25.360 really isn't that hard especially if you use DCell you can just clone DCell uh and then there's these little
00:24:31.960 rake tasks that will actually install and start ZooKeeper for you uh if you run it
00:24:38.240 you should see that and everything should be good to go so uh I have a little uh example here
00:24:45.960 I'm going to have two nodes in my uh DCell network we could call one itchy and
00:24:51.440 the other scratchy so basically uh we give these
00:24:56.520 effectively the same configuration uh by default it's going to use ZooKeeper it's going to connect to your
00:25:02.039 local ZooKeeper so you don't really need to configure that but what you do need to configure is a node ID and uh address
00:25:10.679 and port where you want DCell to run so you're going to do the same basic
00:25:16.039 thing on the other node uh the only difference is you're going to give it a different node ID and a different port
00:25:23.399 number so to find the other nodes in the system you can go to this DCell::Node
00:25:28.640 class and it basically works like an array you give it a node name it will give you back the node uh you can ask
00:25:35.240 for all the nodes in the system and you can also ask for the current node which
00:25:42.360 is me so to find uh the remote SS uh you can register them with a name I'll go
00:25:49.279 over this in a little bit here but uh you look up the node and then you look up the service underneath the node
00:25:56.440 effectively so by default every uh DCell node runs this info service that gives
00:26:02.600 you some uh cool basic information about the system here so we're asking this
00:26:09.159 node about its info there and we're on JRuby you know Java 167 or on a Core i7
00:26:17.320 there gives you all this basic information about it and then you can invoke methods on it just like it's a
00:26:24.039 regular object but every time you do that what it's actually doing is it's going over the network over
00:26:30.799 ZeroMQ talking to that info service going what is your uptime or what's your load average or whatever and that's sending
00:26:37.080 the response back over the network so ends up working uh a lot like DRb in
00:26:42.200 that way so to Define your own service here
00:26:47.320 uh here's another really basic Ruby class right it includes Celluloid but all it's really got is that hello
00:26:54.799 method uh so Celluloid has this actor registry uh it's again kind of like an
00:27:01.679 array so you can assign uh the name greeter to a particular instance or a
00:27:07.840 particular cell a particular actor there and then invoke a method on
00:27:16.279 it see all right so um all the features except for linking which I
00:27:22.279 mentioned earlier uh which I will get fixed really soon I promise um all the
00:27:28.520 basic features of Celluloid are supported by DCell so you can do the synchronous calls like I just showed you
00:27:34.960 but you can also do asynchronous calls so if you want to build asynchronous protocols uh that works just fine
00:27:41.120 also futures so basically the entirety of Celluloid is exposed by DCell and you
00:27:47.480 can use all the features you normally would for doing stuff in the same virtual
00:27:52.880 machine uh so next little part I want to talk about is Celluloid::IO uh so yeah
00:27:58.640 this is the thing that's somewhat similar to EventMachine right so now we
00:28:03.840 have like a nice abstraction around threads uh what do we do for IO right
00:28:10.159 so my canonical response is just use blocking IO for the love of God uh so I
00:28:17.760 mean I I have uh I used to be an EventMachine contributor and I've used a lot of EventMachine projects and they have
00:28:26.120 mostly been a giant pain in my ass uh so if you want things to be simple
00:28:32.799 and easy to reason about just use blocking IO just use threads it's great
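The "just use blocking IO, just use threads" advice can be illustrated with a small standard-library sketch: each connection gets its own thread, and a blocking read stalls only that thread. This is an illustration under stated assumptions rather than Celluloid code; the `serve` helper is hypothetical, and `UNIXSocket.pair` stands in for connections a real server would accept (it is available on Unix-like systems only).

```ruby
require "socket"

# Hypothetical handler: serve one connection with plain blocking reads.
# A slow or idle peer blocks only the thread running this method.
def serve(conn)
  while (line = conn.gets)        # blocks until a line or EOF arrives
    conn.puts(line.chomp.upcase)  # reply with the upper-cased line
  end
ensure
  conn.close
end

# Socket pairs stand in for accepted client connections.
pairs = Array.new(3) { UNIXSocket.pair }

# One thread per connection, each free to block independently.
threads = pairs.map { |(_client, server)| Thread.new { serve(server) } }

replies = pairs.each_with_index.map do |(client, _server), i|
  client.puts("hello #{i}")
  client.gets.chomp
end

# Closing the client side delivers EOF, which ends each serve loop.
pairs.each { |(client, _server)| client.close }
threads.each(&:join)
```

There is no central event loop here for a blocking call to stall, which is what makes the code easy to reason about; the cost is one thread per connection, so this fits a small number of connections better than a large number of mostly idle ones.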
00:28:38.000 blocking IO is okay you're not going to in Celluloid you're not going to block any sort of central event loop like you
00:28:44.640 would in EventMachine or Node or anything right like each of these actors is running its own thread so you're
00:28:50.399 totally free to block um there are a few cases where this
00:28:55.880 can bite you in the ass though so so you're talking to a database you ask the database like give me a lock on this and
00:29:02.200 you run into like a locking problem in your database uh you know basically you've got some other like locking bug going on
00:29:09.720 in external service this can creep out in Celluloid so you do need to be
00:29:15.000 careful with blocking IO uh the other important thing if you're making a blocking call in an actor uh and you try
00:29:21.799 to send it another request right it's going to wait until the original blocking call finishes before for it
00:29:29.039 will service your request so uh there is a way around this though if you want to have uh a combination of
00:29:37.039 I/O and actor messages and just have everything sort of seamlessly
00:29:42.200 multiplexed inside that actor. The way we do this is evented I/O. So you want
00:29:50.039 to do this probably if you anticipate a large number of connections, right? If you have a small number of
00:29:56.399 connections, I would definitely recommend doing an actor per connection, but
00:30:01.559 Celluloid::IO lets you service, you know, I haven't actually tested it with tens
00:30:07.120 of thousands of connections, but the idea is you can have multiple connections serviced by a single thread. And the use
00:30:14.679 case where this really makes sense is if you have mostly idle connections, right? So a chat server would
00:30:19.880 be the canonical example. And you want to do this when you have an I/O-bound problem, right? So
00:30:27.480 you don't want to do a bunch of computation in this thread, because it's going to block you from servicing all those other
00:30:32.840 connections. This is sort of a general problem with Node and EventMachine, and WebSockets are a
00:30:40.760 really good example; I'll talk about those a little bit later. So the basic idea of Celluloid::IO is
00:30:47.679 that these actors, these cells, each of them is an event loop, so it's singularly processing messages. And this is really
00:30:55.320 similar to the type of event loop you would use in EventMachine or Node, except in those systems you only get one,
00:31:01.880 and Celluloid gives you as many as you want. So your normal actor, this is how
00:31:08.519 it actually works: your Celluloid actor is built on this Celluloid mailbox, and inside there is actually a
00:31:15.880 condition variable, and that condition variable is what the actor is actually blocking on when it's waiting for
00:31:22.039 work. So what Celluloid::IO does is,
00:31:27.120 Celluloid has a sort of dependency injection API where you can swap out the mailbox, so it has its own mailbox with
00:31:35.559 its own reactor, and this thing actually waits for messages using a pipe. So by using a pipe it can multiplex
00:31:44.000 incoming requests. If you want to do synchronous or asynchronous calls, whatever, it uses the pipe to signal that; if
00:31:51.039 you want to do I/O, it can wait for that at the same time. So it can multiplex those
00:31:56.320 things together. To do that, it uses this other library I wrote called nio4r.
00:32:02.760 So Java has this API called NIO that gives you selectors. Selectors are kind
00:32:08.840 of the heart of a reactor, right? So this is the thing that's wrapping system
00:32:14.039 calls like epoll and kqueue and that kind of thing, right? So yeah, it's available on my
00:32:22.559 GitHub there. It's inspired by Java NIO. I don't
00:32:28.440 like maintaining a ton of code, even though I appear to be writing a bunch of it that I'm talking about right now, but I
00:32:34.360 try to keep the API really small, really simple. It should be easy for any Ruby
00:32:40.399 implementer who wants to build their own fast version of this, specific to their implementation; they should be able to
00:32:47.080 do that. So I wrote two backends for this. One's built on libev. So I don't
00:32:55.440 know if you're familiar, but I wrote another libev library for Ruby called Cool.io,
00:33:00.799 kind of another EventMachine alternative that I gave up on because I think Celluloid::IO is a better way to go. I
00:33:07.559 hate callback hell, I hate callbacks, I hate deferrables, I hate all that stuff.
00:33:13.279 So I like these nice clean synchronous APIs. But so this is the second Ruby
00:33:18.840 binding to libev I've ever written, as it were. So I also wrote a Java extension
00:33:25.000 for JRuby that talks directly to Java NIO to do this stuff for you. And for any
00:33:31.240 of you who aren't using C Rubies or things that support the C extension API,
00:33:37.399 like Rubinius, there's also a pure Ruby version that just uses Kernel#select.
00:33:43.960 So I think I've got all the VMs covered with this, although I would love for VM implementers to work with me on this to
00:33:51.000 try to get native versions into their VMs. So the core part of
00:33:58.519 Celluloid::IO, I mean, it's mainly for doing sockets. So it exposes TCP sockets, UDP
00:34:05.919 sockets. So with Celluloid::IO::TCPSocket, I'm trying to build an evented duck
00:34:14.280 type of TCPSocket itself. So this uses fibers to defer
00:34:22.280 that, sort of similar to GL, if you're familiar with that. So it gives you a synchronous API, but underneath the
00:34:28.800 scenes it's still evented, nonblocking, all that good stuff, right? And the
00:34:34.839 cool thing is you can hand a Celluloid::IO::TCPSocket to anywhere else in
00:34:40.919 the system. So you could give it to any other thread, maybe a thread that
00:34:45.960 isn't a Celluloid::IO actor, and it will just seamlessly do blocking I/O for
00:34:52.000 you. So the gist of this whole thing is you can have evented I/O and threaded I/O.
00:34:59.720 You don't need everything to be evented and nonblocking, you don't need to reinvent the entire world in a nonblocking manner. You can have them
00:35:07.800 both at the same time; you can have your cake and eat it too, right? So really quick, here is an echo server example
00:35:15.720 using Celluloid::IO. The main difference here is instead of including Celluloid, it includes Celluloid::IO, and
00:35:23.359 otherwise this looks pretty much like the same sort of thing you
00:35:28.680 would write with the core TCPSocket APIs. So you see that little comment
00:35:34.599 there: Celluloid::IO defines its own replacements for all of the core Ruby socket
00:35:41.400 classes, so when you're asking for TCPServer, TCPSocket, or that kind of thing,
00:35:47.640 it's looking in the Celluloid::IO namespace and it sees those, so it's not using the core Ruby stuff, it's using
00:35:54.640 its own replacements for all this stuff. So hopefully it's kind of a drop-in replacement for doing blocking I/O; all
00:36:01.440 you've got to do is include Celluloid::IO. So finally here I'm going to talk
00:36:07.400 about Reel. So I wrote a web server built on Celluloid::IO. It's relatively fast.
00:36:16.599 These numbers here are actually a bit dated, but, you know, it's relatively fast, right?
00:36:23.079 It's relatively fast, relatively low latency. Here's some numbers for comparison: Thin is about 50%
00:36:31.680 faster there, Node is approximately twice as fast, that kind of thing, but I'm still beating Goliath, which is probably
00:36:39.319 the closest analog to Reel. And it's got WebSockets, so let me
00:36:46.839 see if I can possibly demo this to you. Might be a little bit hard here because
00:36:52.599 I can't really see what's going on.
00:37:00.680 I want Arrangement, mirror... no, I want this
00:37:12.280 guy. All right, so I'm going to show
00:37:17.880 you the little example here. So if you just clone Reel, it comes with a couple... oh God, you can't see that at all. Awesome.
00:37:25.599 Let me move my window over. All right, everybody see that? Yeah, there we go.
00:37:31.200 Seems good. Bigger? I don't know. All right, so this is just the example, the
00:37:39.760 WebSockets example that comes with Reel. Who went to Aaron Patterson's talk?
00:37:46.160 Because this is really, really similar to what he showed. Everything
00:37:52.040 he showed was kind of using core Ruby; this is all kind of using Celluloid
00:37:57.720 equivalents to core Ruby stuff. So yeah, I mean, maybe I should
00:38:04.359 show you what this does really quick. I'll just tell you: it just shows you a clock. I mean, that's all it
00:38:11.040 does, it's super trivial. So basically what we have, we have this time server, and this is going to be our event
00:38:18.280 source. So if you saw Aaron's example, he was trying to publish data to a
00:38:24.319 bunch of clients, effectively, and his data was the condition of his sausages
00:38:29.400 that he was curing, or the humidity or whatever. I actually didn't see his talk today, I saw it before when he gave
00:38:36.040 it. But so instead of publishing that, I'm just going to publish the current
00:38:41.520 time. So there's a little run method there. I should update that to the new
00:38:46.640 syntax, that's the old syntax, but basically when we start the server it kicks off this method asynchronously, so
00:38:54.160 initialize returns immediately, and then we just have this little run, and it sleeps until it's synchronized to
00:39:00.960 the current second, and then Celluloid has built-in timers, so we can say every 1,
00:39:07.400 and 1 is 1 second there, and we're going to publish that time change event to
00:39:13.079 everyone who's interested in it. So if you're familiar with ActiveSupport::Notifications, Celluloid::Notifications is
00:39:20.560 practically the same API, it's just multithreaded. So we can publish the
00:39:26.520 current time, and then these time clients down here can subscribe to it, right? So
00:39:33.839 they're using that same API there. So every time a client connects, what it's going to do is get this WebSocket
00:39:41.640 here, and then it's going to subscribe to these time change events, and whenever that time server fires
00:39:49.480 an event by publishing it, it's going to invoke this notify_time_change method in all these client threads. So very
00:39:56.640 similar to what he was doing, just with WebSockets, right? And then that WebSocket kind of
00:40:03.680 is sort of a duck type of IO, right? So we can just write whatever we want to it
00:40:09.040 and it's just going to get handled completely transparently. So in this case it's going to call time.inspect
00:40:16.119 and just write that string back there, and the only thing we have to worry about is this Reel::SocketError,
00:40:24.079 which basically says the client disconnected. Then down here is the actual web server
00:40:29.760 itself. You see it subclasses Reel::Server. So Reel is kind of like a web server toolkit,
00:40:35.200 right? It isn't exactly like Unicorn or anything, it's kind of more
00:40:40.280 like Mongrel: it gives you this class that you can use to build your own web
00:40:45.359 server. So we have this on_connection method that's going to get called
00:40:51.319 every time somebody connects, and then we have this while loop, right? I mean, this seems a little bit confusing at
00:40:56.960 first, but this is how it handles HTTP keep-alive. So every time you get a
00:41:04.040 connection, you actually want to loop on that connection and wait for all the possible requests that are coming off
00:41:11.079 there, because there may be more than one. And what you're going to get back from connection.request is one of these
00:41:17.400 two classes: either it'll be a Reel::Request, that's a standard HTTP request, and then
00:41:23.560 there's Reel::WebSocket, so this is if the client decided to do an
00:41:29.000 upgrade, an upgrade to WebSockets, right? We get a different object back that
00:41:35.800 will be the WebSocket. So we've got our Reel::Request, you
00:41:40.839 know. So there's actually Webmachine. Webmachine is a Ruby
00:41:47.920 web framework written by Sean Cribbs. If you want a real router, if you don't want to write your router like this,
00:41:54.280 which is kind of... you can use Webmachine; it has a Reel backend. But for this example we're just going
00:42:00.680 to kind of examine that URL, and all we're going to do is serve a static page back. And then if we got a WebSocket, what
00:42:08.560 we're going to do is create this new TimeClient class and hand it that WebSocket.
00:42:13.760 And here's just the boilerplate web page I'm going to show you, and
00:42:19.079 finally what we're going to do is spin up that time server and the web server, and finally we're going to sleep the
00:42:25.119 main thread. Sleeping the main thread is kind of important, because in Ruby if you don't do that it's just going to
00:42:31.359 exit. So let me attempt to spin this up
00:42:39.640 here. There we go. So we've got the log there, so I'm going to go to this address.
00:42:48.520 Let's see, got a bunch of stuff going on there.
00:42:54.839 All right, so there is that time server going over WebSockets there. I
00:43:01.800 can spin up multiple of these guys, and they're all basically getting that
00:43:07.839 same signal from that same thread and just streaming it over this WebSocket.
00:43:13.800 So hopefully that example there is kind of enough to get you going. The state
00:43:19.319 of Reel documentation isn't great; I do definitely want to improve
00:43:25.599 it. All right, let me try to get back to my slides
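The publish/subscribe flow in the demo, with one time server publishing and several client threads each writing `time.inspect` to their own socket, can be approximated with the standard library alone. This is a hedged sketch and not the Celluloid::Notifications API: `Notifier` and its methods are hypothetical, plain arrays stand in for WebSockets, and the tick interval is shortened from one second so the example finishes quickly.

```ruby
# Hypothetical stand-in for a notifications fan-out: each subscriber gets its
# own Queue, and publish pushes the payload onto every queue for that topic.
class Notifier
  def initialize
    @lock = Mutex.new
    @subscribers = Hash.new { |hash, topic| hash[topic] = [] }
  end

  def subscribe(topic)
    inbox = Queue.new
    @lock.synchronize { @subscribers[topic] << inbox }
    inbox
  end

  def publish(topic, payload)
    inboxes = @lock.synchronize { @subscribers[topic].dup }
    inboxes.each { |inbox| inbox << payload }
  end
end

notifier = Notifier.new
outputs = Array.new(3) { [] }          # each array stands in for one WebSocket

clients = outputs.map do |out|
  inbox = notifier.subscribe("time_change")
  Thread.new do
    while (time = inbox.pop)           # a nil payload ends the client loop
      out << time.inspect              # the demo writes time.inspect to the socket
    end
  end
end

3.times do
  notifier.publish("time_change", Time.now)
  sleep 0.01                           # the real demo ticks once per second
end
notifier.publish("time_change", nil)
clients.each(&:join)

outputs.each { |out| puts out.size }
```

Every client receives the same sequence of timestamps because they all subscribe to the same topic before the first tick is published.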
00:43:50.160 here. All right, there we go. So finally I want to talk about
00:43:55.839 Lattice. So this is total vaporware, but I want to build a web
00:44:03.839 framework on Celluloid. I don't have time to do it, so I'd love volunteers to step up and do this for me.
00:44:11.839 Basically, recycle Rails as much as possible, right? So Rails
00:44:17.640 basically has everything good to go, except you
00:44:22.960 can't do scatter/gather requests to other APIs very well, and that's the main
00:44:28.280 problem I want to solve. So I don't want to reinvent Action View, I don't want to reinvent Active Record; I think
00:44:34.559 all those things are pretty good. Basically I want to fix Action Controller and give you seamless scatter/gather
00:44:42.079 programming to multiple APIs. So as I mentioned earlier,
00:44:47.480 there is a Webmachine backend for Reel, so the idea is you could have Webmachine
00:44:53.359 servicing your actual requests and build an abstraction on top of
00:44:59.040 that. One of the big areas where I think Rails kind of falls down is you don't have a multithreaded development
00:45:05.240 mode. I mean, in practice I don't think this is a huge problem, because you're pretty unlikely to run into concurrency
00:45:12.280 bugs if you're just developing locally, right? Doing a single request to your server, that kind of thing. Where
00:45:18.680 you're going to see that falling down is when you have, you know, tons and tons of requests coming into your server
00:45:24.119 concurrently; that's when you're going to actually run into these thread safety bugs. But I think the closer you can get
00:45:31.520 your development environment to the real thing, the better, if you can,
00:45:36.559 right? And really the killer app here is this easy scatter/gather for
00:45:42.599 building service-oriented architectures on top of Ruby. So if you're interested in this idea, if you would
00:45:50.440 like to maybe contribute to Lattice, hit me up afterwards, I'd love to talk
00:45:56.280 to you. And that's all I got. So I am @bascule on
00:46:03.720 Twitter there. Kind of the entry point to all these Celluloid projects is celluloid.io,
00:46:09.400 so if you want to know the URLs for all this stuff, just go there and everything's kind of linked from there,
00:46:15.720 and I have a blog on unlimitednovelty.com, and that's it.
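The scatter/gather pattern mentioned as the goal for Lattice can be sketched with plain threads. Lattice is described as vaporware, so nothing below reflects an actual API: `scatter_gather` and the `backends` hash are hypothetical, and the lambdas only simulate remote API calls with `sleep`.

```ruby
# Fan out one thread per backend call, then collect every result.
# `calls` maps a label to a callable; the lambdas below only simulate
# remote APIs with sleep (an assumption for illustration).
def scatter_gather(calls)
  threads = calls.map { |name, call| [name, Thread.new { call.call }] }
  threads.to_h { |name, thread| [name, thread.value] } # value waits for the thread
end

backends = {
  profile: -> { sleep 0.2; { name: "alice" } },
  orders:  -> { sleep 0.2; [101, 102] },
  ads:     -> { sleep 0.2; ["banner"] }
}

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
page_data = scatter_gather(backends)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts page_data.inspect
puts format("elapsed: %.2fs", elapsed)
```

The requests run concurrently, so a controller action waiting on several services pays roughly the latency of the slowest one instead of the sum.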
00:46:30.800 So I have time for questions here, it seems. So I've got one for you. A couple
00:46:38.680 years ago I was really fascinated to read about Reia when you were doing
00:46:44.520 that kind of thing. Can you comment a little bit on your evolution of thought, from going to a different language,
00:46:51.640 inventing a new language, to now, it looks like, figuring out how to bring all this goodness into Ruby itself? Yeah, yeah, so
00:46:59.760 the question was about a programming language I created called Reia. So I tried to build a Ruby-like language on
00:47:06.880 top of the Erlang VM, so that's kind of where I got some of the experience that I used to build
00:47:13.119 Celluloid. Basically what happened, if you're familiar with this guy José Valim,
00:47:20.520 one of the principal authors of Rails, he created another language called
00:47:25.640 Elixir, and he did a much better job than I did. So, you know, Reia is
00:47:33.319 actually probably the open source project I've hacked on the most. It has a few thousand commits; I spent like
00:47:39.960 three years on it, and what I learned is making a new programming language is really, really hard. It's not something
00:47:47.400 you should probably do as your first successful open source project, I don't think. And I mean, that's
00:47:54.839 seriously what it was. I mean, I had mild success with Cool.io or whatever, but really, if you can tap
00:48:02.280 into an existing language ecosystem and kind of give people tools and let
00:48:07.880 them use the language they already know, instead of, "well, let's completely reinvent the entire universe and make a
00:48:13.559 new programming language," I think that's a lot more pragmatic approach. Celluloid isn't perfect. If I were to
00:48:20.559 make, like, a new... I'm still kind of interested in making a new programming language, but I don't have time to do
00:48:27.760 that and it's really, really hard. So I think Celluloid is just kind of a better
00:48:33.000 way to go. Yeah? Is there an answer to why
00:48:41.640 is... So Reel does have a Rack adapter. The short answer for why I
00:48:48.680 think you shouldn't use it is Rack middleware and fibers do not play nicely
00:48:54.960 together. So really, I mean, the other reason I don't like Rack: Reel is
00:49:00.160 built with end-to-end streaming, so everything you do in Reel streams. When you're reading the request, it streams;
00:49:05.319 when you're writing the response, it streams. There's no intermediate buffering, and that buffering is an
00:49:11.960 essential part of the Rack specification. So if you go to read the Rack spec, you need rewindable input; it's
00:49:18.720 dictated in there at a very fundamental level. So you have absolutely no... if you're trying to do a request with, like,
00:49:25.599 say, a multi-gigabyte file, you're going to end up writing that file to temp, even if your whole goal is, like, "I want to
00:49:31.839 stream this to, say, a video transcoder or something," right? Like, you have a way to process this in a
00:49:36.960 streaming manner; Rack just doesn't let you do that. So I hope they fix this in the Rack 2 spec. You know, this is on
00:49:44.640 their big laundry list of things they need to fix in Rack 2. But I do...
00:49:49.880 there is a Rack adapter if you want to use Rack. I wouldn't recommend
00:49:54.920 it. Yeah?
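The difference between streaming a request body and buffering it whole can be shown with a small standard-library sketch. It is not Reel's or Rack's API: `stream_body` is a hypothetical helper, an `IO.pipe` stands in for the client socket, and a SHA-256 digest stands in for a streaming consumer such as the transcoder mentioned in the answer.

```ruby
require "digest"

# Consume a request body chunk by chunk instead of buffering it whole.
# Returns the number of bytes seen; each chunk is handed to the block.
def stream_body(io, chunk_size: 16 * 1024)
  total = 0
  while (chunk = io.read(chunk_size))   # IO#read(n) returns nil at end of stream
    yield chunk
    total += chunk.bytesize
  end
  total
end

reader, writer = IO.pipe                # the pipe stands in for a client socket
payload = "x" * (1024 * 1024)           # 1 MiB fake upload

producer = Thread.new do
  writer.write(payload)
  writer.close                          # signals end of body
end

digest = Digest::SHA256.new
largest = 0
bytes = stream_body(reader) do |chunk|
  digest << chunk                       # stand-in for a streaming consumer
  largest = [largest, chunk.bytesize].max
end
producer.join
reader.close

puts "bytes=#{bytes} largest_chunk=#{largest}"
```

The consumer never holds more than one chunk at a time, which is the property a rewindable-input requirement gives up.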
00:50:05.880 To do what with I/O? Sorry, I missed
00:50:16.200 that. Batch? So what do you mean by batch, I
00:50:25.280 guess? Yeah, yeah, so definitely. So the easiest way to do that would be to use a
00:50:30.559 bunch of actors, so you could use those pools, for example, right? You could spin up a pool and just make synchronous
00:50:37.640 calls into it, right? And if you've exceeded that concurrency threshold, right, you're like, "well, I only want to
00:50:43.799 make a maximum of, I don't know, 50 outgoing HTTP requests at any given time,"
00:50:50.440 you can make a pool and you can give it size 50, and then every time you call into there, if you've kind of exceeded
00:50:57.319 your concurrency limit there, it'll just block until another worker is available. So that is by far the easiest
00:51:04.359 way to use Celluloid to do that kind of
00:51:09.920 thing. Yeah? In your DCell example, when you created and registered the actor on
00:51:16.559 the other node and called it across, is it possible to use a pool on that side
00:51:22.760 instead of an instance? Yeah, that's a little bit tricky here. So if you
00:51:28.160 use... Celluloid has this, like, supervision group feature. If you use a supervision
00:51:34.559 group you can tell it pool as, basically, so you can give it a name, and
00:51:39.599 there's also supervise as, to supervise an actor. So basically you can do pool as, and that will register the entire
00:51:47.000 pool with that name, and then you can call it just like any other actor. So yeah, a supervision group is just an actor,
00:51:53.559 basically. Yeah. All right, I think that's it.
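The pool answer above, where calls block once a fixed number are already in flight, can be approximated without Celluloid. The sketch below is not Celluloid's pool, which is built from worker actors: it is a simple semaphore built on the standard library's `SizedQueue`, `BoundedPool` is a hypothetical name, and `sleep` stands in for an outgoing HTTP request. A limit of 3 is used instead of 50 to keep the run short.

```ruby
# Hypothetical concurrency limiter: at most `size` blocks run at once, and
# extra callers block until a slot frees up.
class BoundedPool
  def initialize(size)
    @slots = SizedQueue.new(size)
  end

  def call
    @slots.push(:busy)     # blocks while `size` calls are already in flight
    begin
      yield
    ensure
      @slots.pop           # free the slot for the next caller
    end
  end
end

pool = BoundedPool.new(3)
lock = Mutex.new
in_flight = 0
peak = 0

callers = 10.times.map do |i|
  Thread.new do
    pool.call do
      lock.synchronize do
        in_flight += 1
        peak = [peak, in_flight].max
      end
      sleep 0.05           # stand-in for an outgoing HTTP request
      lock.synchronize { in_flight -= 1 }
      i * 2
    end
  end
end

results = callers.map(&:value)
puts "peak concurrency: #{peak}"
```

Callers need no coordination logic of their own; the blocking `call` is the whole back-pressure mechanism.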
Explore all talks recorded at RubyConf 2012