00:00:14.269
the title of this talk is Boundaries this is the only one-word talk title at this conference which I'm very proud of
00:00:20.189
the next shortest is three words thank you this is some of the stuff in this
00:00:28.199
talk is going to be very familiar to anyone who comes from certain functional programming backgrounds but this is a
00:00:33.809
story of me approaching some ideas that they have from a very different direction and from a very different history so I am Gary Bernhardt so I
00:00:42.270
would look like this on the internet where sometimes I get mad and my
00:00:48.510
Bluetooth is not working very well I might have to forego it I own a company
00:00:54.750
called Destroy All Software that produces screencasts on various advanced software development topics and to start
00:01:02.070
us off in this talk we're going to start with test doubles there are a couple talks about test doubles mocking and
00:01:08.159
stubbing at this conference this is not a talk about test doubles but they are going to be part of my motivation just
00:01:14.970
to make sure everyone's on the same page let's go through a quick example of what an isolated unit test might look like I
00:01:20.009
have a sweeper class this is in some kind of recurring billing situation and if I have a user who is subscribed but
00:01:26.100
has not paid in the last month I want to tell him that something's wrong and disable his access so when his
00:01:32.159
subscription when a subscription is expired we will make a user Bob he's going to be a stub he's an active user
00:01:38.280
and he last paid two months ago we will have an array of users that's just Bob
00:01:43.439
for convenience and before every test we're going to stub out the User.all method to return that array of Bob so
00:01:50.850
this is one of the ways in which we're isolating ourselves from third parties from other classes like user we want to
00:01:57.689
email the user when the subscription is expired so we will invoke the sweeper and we expect it to call user mailer
00:02:05.130
billing problem to send an email to this user telling him things are bad so this is an isolated unit test it's isolated
00:02:11.700
because it mocks its dependencies like user and like the user mailer hopefully my phone
00:02:18.690
is back now awesome okay the implementation of this is very
00:02:24.420
simple we will pull out all the users from the database we will select only the ones who are active users but have
00:02:30.659
not paid recently enough and then for each of those we will send the email right so very straightforward stuff what
00:02:38.340
we have here is a three class system these three classes integrate in
00:02:43.530
production but in tests we're removing two of the dependencies replacing them with stubs and mocks giving us this
00:02:49.019
as our testing world so everything is nice and isolated there are several good
00:02:55.560
reasons to do this several very big benefits that come out of it but there's also one really terrible thing that
00:03:01.530
happens when you do this so let's go through those this allows you to do real test-driven design looking at your tests
00:03:07.500
seeing that you have mocked six things and two of them are mocked three method calls deep this tells you that your
00:03:13.140
design is not so good for this class so it gives you a form of feedback that you can't get without isolated tests at
00:03:18.630
least I don't know how to it allows you to do outside-in TDD where you actually build the higher-level pieces before the
00:03:24.930
low-level pieces exist so we could TDD the sweeper using the user and the user mailer before those classes exist
00:03:31.349
because we're just stubbing them out anyway then when we want to write the user class for real we can look at what
00:03:36.510
we stubbed and that tells us the interfaces it needs and finally this gives you very fast tests this is one of
00:03:43.410
the main things in the whole fast Rails tests meme or I don't want to call it a movement but people getting excited
00:03:48.480
about fast tests in the Rails world and we're talking about the difference between a 200 millisecond time from
00:03:54.239
hitting return to seeing the prompt back versus a 30-second time to run a very small test it's a very big difference
00:04:00.090
when you're really isolating so these are all very good things that you want but they are balanced out by a
00:04:07.709
very bad thing and that bad thing is that in tests you're running against a mock and a stub and in production you're
00:04:14.010
running against real classes and if you don't stub the boundary correctly your tests will pass and your production system will be wrong and this is
00:04:22.770
such a big problem that for most people I think it overshadows all those benefits even if you explain them
00:04:28.420
to them they're going to look at this problem and say it's not worth it now
00:04:33.760
there have been attempts to fix this various approaches to try to solve this problem in one way or another
00:04:39.750
one of which is to solve it with more testing contract and collaboration tests this is an idea sort of most closely
00:04:48.460
associated with J. B. Rainsberger who is one of the people who is most influential on my understanding of isolated unit testing I've not actually
00:04:55.540
done this and something about it doesn't resonate well with me but it is one attempt to fix this there's also the
00:05:01.900
tools approach rspec-fire is a tool in Ruby that tries to solve this problem if you mock a class with
00:05:07.660
rspec-fire it will make sure that you only mock methods that actually exist and make sure that you don't cause
00:05:13.780
these boundary problems or at least you don't cause simple boundary problems and finally you can solve this with static
00:05:19.150
typing like so many things in life it comes with all the same costs you pay to solve anything with a powerful static
00:05:25.510
type system but if you think about your mocks as being subclasses of the real class they just remove all the actual
00:05:31.240
implementations that gives you an idea of how static typing can solve this boundary problem all of these only solve
00:05:39.610
simple kinds of mismatches between objects they solve things like I called the method with the wrong name or I
00:05:44.800
passed the wrong number of arguments they don't solve deeper things like my two algorithms that need to cooperate
00:05:49.900
don't actually cooperate correctly the way that you can solve that and the most
00:05:55.570
common way people try to fix this problem is by just not doing isolated unit testing by just integrating right
00:06:01.740
the problem with solving the isolation problem with integration is that integration tests are a scam I can't
00:06:10.900
take credit for the sentence this is once again J. B. Rainsberger there's a talk called Integration Tests Are a Scam
00:06:16.270
which you should all watch it's a really good talk that really lays out the argument for why integration testing doesn't work on a long enough time scale
00:06:22.630
and he nowadays uses the terminology integrated tests meaning any test that's integrating multiple pieces I'll give
00:06:30.370
you the really quick and dirty argument for why integration tests don't work the number of paths through your program
00:06:36.010
goes like 2 to the n where n is the number of branches or conditionals and that includes try/except that includes a
00:06:42.340
short-circuiting boolean expression that includes a loop every time a branch is
00:06:47.470
happening if you have n of those you have 2 to the n paths and if you're trying to test the whole thing you have a space of 2 to the n to
00:06:54.130
choose from if you have 500 conditionals in your program this is a number with about 150 digits in it it's a very large
00:07:00.880
space it's very difficult to effectively choose which paths matter because they're effectively uncountable to you
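To make the size of that space concrete, here is a small Ruby check of the arithmetic. It is only a sketch: the helper name `path_count` is made up for illustration, and it assumes every branch is independent, so it is an upper bound rather than a measurement of any real program.

```ruby
# Assumption: each of the n branches can independently go one of two ways,
# so the number of distinct paths is at most 2**n.
def path_count(branches)
  2**branches
end

paths  = path_count(500)
digits = paths.to_s.length # Ruby integers are arbitrary precision

puts "2**500 has #{digits} digits"
```

Ruby reports 151 digits here, which matches the "about 150" figure quoted in the talk.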
00:07:06.720
the other problem is that suite runtime in an integration suite is superlinear whenever you add a unit test or whatever
00:07:14.440
kind of test you're writing you're also adding a little bit of code so your number of tests goes up by one and you
00:07:19.870
make the system a little bit bigger which means all your existing integration tests get a little bit slower so every time you add a test
00:07:25.960
there are two sources of slowness one of which is linear and one of which is something else I'm not sure of but together it's definitely a superlinear
00:07:32.680
runtime and anyone who has a three-hour Rails test suite will be able to tell
00:07:38.410
you that this is in fact the case and they will probably not like their lives very much either so that's all
00:07:45.820
background this is how I came to the ideas that I'm going to talk about for the rest of this talk
00:07:51.450
this has been a large focus in my software development career for the last five years is isolated testing and
00:07:58.030
figuring out how to do it well so now let's shift gears entirely and talk about values values meaning the pieces of
00:08:07.120
data inside of a program if you want to test the plus method and let's just
00:08:12.610
think about + on machine integers and for whatever reason you decide you want to test it in isolation so you don't
00:08:18.910
want any other dependencies involved in the testing what do you have to do to isolate plus nothing it isolates for
00:08:27.640
free plus doesn't have any dependencies there's nothing to mock out there's nothing to stub it's totally local and
00:08:33.750
why is that the case it's not just because plus is simple it's tempting to say
00:08:41.349
oh plus is simple so of course it isolates for free that is not what's happening it has two properties that are necessary to be naturally isolated with
00:08:49.030
no stubs or mocks the first is that it takes values as arguments and it returns new values and it doesn't mutate those values it just
00:08:55.510
gives you a new value right it takes an integer and an integer and it gives you an integer the second property
00:09:00.760
is that it doesn't have any dependencies there's nothing to mock it doesn't need anything else it's a local
00:09:06.850
computation that just produces a new value so how could we apply that to more
00:09:11.950
complex code that we work with all the time stuff like the sweeper well let's go through this and just impose both of
00:09:18.370
these constraints and see what happens starting with the Bob stub we can't use a stub because we're not faking out any
00:09:24.340
boundaries so let's replace that with a user object but not like an active record object but like just a struct
00:09:30.220
some kind of a piece of data even a hash I wouldn't use a hash but you could just use a hash we can't do the
00:09:38.470
User.all stub because we're not allowed to so we'll just delete that and then the actual body of the test instead of doing
00:09:43.630
a mock expectation we can just call the method and get back the array of users who are expired now this does less than
00:09:51.730
the original code we're going to get to that later the implementation changes we basically lose the second
00:09:58.930
half we now have a method that goes through all the users and returns only the expired ones this difference is huge
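A minimal sketch of what that values-in, values-out version could look like. The slides are not part of this transcript, so the names here (`User` as a plain `Struct`, `ExpiredUsers.for_users`, the 30-day grace period, and passing `today` in as an argument) are assumptions chosen for illustration rather than the code shown in the talk.

```ruby
require "date"

# Assumption: a plain struct stands in for the ActiveRecord model.
User = Struct.new(:name, :active, :paid_at)

module ExpiredUsers
  GRACE_PERIOD_DAYS = 30 # assumed reading of "has not paid in the last month"

  # Takes values and returns a new array; no database, no mailer, no mutation.
  # `today` is passed in so the function does not depend on the system clock.
  def self.for_users(users, today)
    users.select do |user|
      user.active && user.paid_at < today - GRACE_PERIOD_DAYS
    end
  end
end

today = Date.new(2012, 3, 1) # arbitrary fixed date for the example
bob   = User.new("Bob", true, today - 60)  # last paid two months ago
alice = User.new("Alice", true, today - 1) # paid yesterday

expired = ExpiredUsers.for_users([bob, alice], today)
```

The test for this is just a call and a comparison against `[bob]`: there is nothing to stub and no mock expectation to set up.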
00:10:08.320
the difference between the original code and this is huge the nature of the communication between the components has changed instead of having synchronous
00:10:15.790
method calls as the boundaries between things we now have values as boundaries the value returned or taken by the
00:10:22.570
method is the boundary between it and another object now just as a quick
00:10:28.630
digression when I talk about values I often mean things like this may be a
00:10:33.790
class that is a struct it has two fields title and body and it has a slug computed from the title but behaviorally
00:10:40.660
this is equivalent to a class that has a title body and slug and computes the slug at creation time they're basically
00:10:46.090
the same thing right the only way to tell the difference from the outside is timing properties on the method calls so
00:10:51.730
I'm going to use these two ideas interchangeably but really they're basically the same
00:10:57.060
so we've seen isolated testing as a bit of background the idea of converting the
00:11:03.040
code in the system to communicate via values at the boundaries instead of via message sends or method calls at
00:11:09.070
boundaries and now I want to look at how this fits into the three dominant
00:11:14.890
programming paradigms putting aside logic programming but how does this relate to procedural OO and
00:11:20.710
functional programming here's a small piece of procedural code we want to feed
00:11:25.930
some walruses so for each of the walruses we shovel some food into its stomach we shovel some cheese into
00:11:31.300
the walrus's stomach there are two properties of this code that make it very obvious that it's procedural the
00:11:37.900
first is the each whenever you see each in Ruby there's something destructive going on each with a
00:11:43.960
nondestructive body is a no-op so there's something destructive happening and we know the structure of the walrus and the
00:11:50.200
structure of its stomach we know it has a stomach we know the stomach can have things shoveled into it we have knowledge of the internals contrast this
00:11:57.820
with the OO solution where you still have an each it's still destructive in most OO code but now we tell the walrus
00:12:03.880
to eat something he knows how to eat instead of us knowing about his stomach and then the eat method will shovel
00:12:09.190
things into the stomach same code as before just encapsulated and my
00:12:14.470
Bluetooth is dying again so we have two paradigms here both of them involve
00:12:21.040
mutation one of them separates data and code that's procedural one of them combines them into units called objects if we add functional to this instead of
00:12:30.220
doing an each we do a map we're going to take all the walruses and produce new walruses that are slightly different so for each of them we're going to call eat
00:12:36.820
on the walrus and some food some cheese and I'm going to use a hash for the walrus and an array for the stomach and
00:12:42.520
strings for the food so in the eat function it's kind of weird but we build a new stomach that's the old stomach
00:12:47.920
plus the new food and then we build a new walrus that's the old
00:12:54.370
walrus with the new stomach you can see why OO models real-world things a little better than functional
00:13:00.130
programming does okay so that's functional nothing is being mutated
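The functional version described above can be sketched as follows, using a hash for the walrus, an array for the stomach, and strings for the food, as in the talk. The walrus names and the exact hash keys are assumptions made for the example.

```ruby
# Data (hashes, arrays, strings) and code (the eat function) are separate.
# eat builds a new stomach and a new walrus; it never modifies its arguments.
def eat(walrus, food)
  new_stomach = walrus[:stomach] + [food] # Array#+ returns a new array
  walrus.merge(stomach: new_stomach)      # Hash#merge returns a new hash
end

walruses = [
  { name: "Wally", stomach: [] },
  { name: "Wilma", stomach: ["fish"] }
]

# map, not each: we produce new walruses instead of changing the old ones.
fed = walruses.map { |walrus| eat(walrus, "cheese") }
```

After this runs, `walruses` still holds the unfed walruses and `fed` holds the new ones, which is what makes the `map` meaningful where an `each` would have been a no-op.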
00:13:06.550
right so we have no mutation but data and code are separate they are not
00:13:11.980
combined into single things now if you look at this table obviously I've left a row there's one more row to go
00:13:17.530
but even just looking at the variables we have two variables does it mutate or not does it bind data and code together
00:13:22.750
or not they clearly vary independently which means we have four possibilities so what is the fourth possibility it's
00:13:29.050
not logic programming by the way here's what the fourth possibility looks like we map like in functional
00:13:36.790
programming so we're producing new walruses but we're telling the walrus to eat something and that's not a
00:13:42.460
destructive eat instead the eat method constructs a new walrus that is the old walrus with a new stomach that contains
00:13:49.030
the new food so it combines the immutability of the functional code but
00:13:54.340
it combines the merging of data and code together like OO does and that is the
00:14:00.070
fourth entry and I call it lovingly FauxO because it's not real OO now there's a
00:14:08.590
problem with programming this way and that problem is that you lose the ability to do anything destructive to
00:14:14.020
talk to network to talk to disk to do any kind of I/O you lose the ability to maintain state over time so to
00:14:22.690
reintroduce the idea of state we have to add imperative programming back into
00:14:27.790
this sort of FauxO style of programming we have to figure out how to compose the user database the expired users class
00:14:34.270
and the mailer together even though the expired users class is functional in nature so we have our expired users it
00:14:42.160
returns an array of users who we need to notify and what we need to do is reintroduce the imperative layer around
00:14:47.950
it an imperative shell which surrounds the functional core it talks to the database it uses expired users to filter
00:14:54.760
those users and then it emails each of the ones that come out so the
00:14:59.800
imperative shell is a layer that surrounds the functional core the functional core is the bulk of the application it has all the intelligence
00:15:06.190
and the imperative shell is sort of a glue layer between the functional pieces of the system and the nasty external
00:15:12.430
world of disks and networks and other things that fail and are slow if we
00:15:18.580
we look at what's actually happening in these two things it's not an arbitrary distinction even though all I did was
00:15:23.710
cut the original method in half this division runs very deep if you look at
00:15:28.780
what these things do the expired users class makes all the decisions and the sweeper class has all
00:15:35.570
the dependencies so if we look at the
00:15:40.610
way that that relates to testing the functional core is heavy on paths heavy on decisions light on dependencies which
00:15:46.339
is exactly what unit testing is good at especially isolated unit testing when you take away the need to stub out the
00:15:52.430
dependencies you can just focus on the logic and the tests become very simple and exactly the same thing is true for
00:15:57.649
the shell lots of dependencies few paths is exactly what an integration test is
00:16:02.690
good at because it makes sure all the boundaries are lining up all the pieces are communicating correctly but you
00:16:07.880
don't have a lot of test cases which means you don't end up with a 30 minute or a 3 hour test suite just to get a
00:16:16.730
sense of what that integration test might look like since we already saw the unit test maybe I create two users in
00:16:21.980
the database actually create them in an actual database I invoke the sweeper I pull out all the mails that were
00:16:28.399
delivered by ActionMailer and I make sure that only Alice was mailed she's the only one who's expired she
00:16:33.709
paid two months ago Bob paid yesterday but I only have to write one of these
00:16:38.779
whereas I'm going to have to write a bunch of the isolated tests on the functional core so now we have a
00:16:47.300
solution to the isolation problem for most code in the system because we can build
00:16:52.370
it all as functional pieces in this sort of FauxO style where there are still objects but they're not mutating and
00:16:57.740
they're just taking values in and out and we have a way to reintroduce the imperative part around it so we can
00:17:03.410
actually talk to the outside world and it turns out that this leads to all
00:17:09.049
kinds of amazing benefits not just the testing benefit not just the fact that functional code is easier to reason about over time but it even makes
00:17:16.579
certain types of concurrency much easier if we think about the actor model of concurrency which is the one that I have
00:17:22.699
the most faith in as something sort of approaching a general-purpose concurrency style or currency
00:17:27.980
programming method let me quickly explain it to you just in case
00:17:33.290
everyone's not familiar I'm going to do it with just threads and queues so we have a queue and this is going to be the
00:17:38.450
communication mechanism between two processes it is the inbox of process two process one is going to send
00:17:44.300
to it for process one I'm just going to fork off a thread that is going to infinitely loop reading from standard in
00:17:50.300
and pushing into the queue process two is going to infinitely loop reading from the queue and writing to standard out so
00:17:55.370
this is an echo program that's communicating through a queue where the queue is the inbox for process two if I just
00:18:02.930
run this at the shell and start typing things into it it's just going to print out whatever I sent in this is the
00:18:10.130
simplest way I know to explain the actor model you have independent processes each of them has an inbox it is only
00:18:16.160
readable by that process and they communicate by sending messages to each other into each other's inboxes
00:18:22.990
the way that this relates back to functional core imperative shell to FauxO to the idea of having lots of
00:18:29.540
values is that every value in your system is a potential message a possible message between two processes every
00:18:36.740
value that is struct like and can be easily serialized can also be easily sent over the wire and this is a special
00:18:44.360
case of the value is the boundary between the components so if we rewrite our sweeper in a slightly different way
00:18:50.210
so we have a sweep method it calls expired users on User.all so it pulls everything out of the database finds only the expired ones
00:18:57.200
and then for each of those emails this is the imperative shell that you're looking at right now the functional core
00:19:02.630
is the expired users class it's going to do what it did before or the expired users method excuse me it's just going
00:19:08.450
to filter out expired users and then we have this very trivial notify a billing problem thing that just delegates to the
00:19:14.150
mailer let's translate this into the actor model for the first one I'm going
00:19:20.360
to make an actor that pulls everything out of the database and just sends them one by one into the expired users actor
00:19:25.640
and then dies if I didn't do die then this would loop infinitely the expired
00:19:31.610
users actor is just going to pop a user off of its inbox it's going to decide whether that user
00:19:37.880
is late and if it is late it's going to forward that user on to the mailer process and the mailer process is just
00:19:44.330
going to invoke the mailer so the imperative shell is sort of a bigger process it takes a little while to run
00:19:50.750
it fires off all these messages to the smaller processes and what we've just done is converted a program that could
00:19:56.930
only use one core into a program that can use three cores not on MRI but on other VMs we've
00:20:03.970
parallelized this by doing very little work because we had the values available to send over the wire oh I forgot to
00:20:12.140
actually translate that there's the new version it's the same thing as the old basically values in your system afford
00:20:19.250
shifting process boundaries but really in general values in your system afford shifting boundaries between anything
00:20:25.370
between a class arrangement between subsystem arrangement between the way
00:20:31.190
you're building your program whether it's serial or parallel so
00:20:37.480
programming in this style has surprisingly deep effects on the things
00:20:42.920
you can do and the way that you can do them that was a lot of stuff so now I'm going to try to restate it in like three
00:20:49.430
minutes to make it all tie together in this style you design your program as a
00:20:56.360
core of independent functional pieces that take values and return values the imperative shell orchestrates the
00:21:01.520
relationships between those interfaces them to the network the disk other nasty
00:21:06.530
systems like that and maintains state for example I wrote a Twitter client in this style it's a terminal
00:21:13.640
program but it's interactive like vim would be so you hit J to go down to the next tweet the imperative shell sees the
00:21:20.120
J calls into the functional core to generate a new cursor position the new cursor is generated and returned and
00:21:25.550
then the imperative shell updates the instance variable holding the cursor to be the new cursor the functional core
00:21:30.800
built the new cursor and it was a purely functional operation the imperative shell just updates references to these
00:21:36.950
new objects as they're constructed what you get from this is easy testing
00:21:43.280
especially isolated you also get easy integration testing and the distinction between which one happens where is a lot
00:21:49.610
more obvious than it is if you just start throwing things against the wall and try to figure out what gets tested how later you get fast tests you don't
00:21:57.650
have to do any weird stuff to get fast tests so just inherently fast because they're functional and working on small pieces of code you have no call boundary
00:22:05.150
risks you don't have to stub or mock you have easier concurrency at least in the
00:22:10.220
actor model and you have more fluid transition between concurrent and serial computation and that's all
00:22:16.580
just a special case of having higher code mobility in general moving code between components moving code between
00:22:22.460
processes so that is the end of the
00:22:27.470
actual talk once again I am Gary Bernhardt I run Destroy All Software which produces screencasts and if you
00:22:33.740
are a subscriber or want to become one it is not free but there is a screencast on Destroy All Software called
00:22:39.320
Functional Core Imperative Shell which is the first time I ever talked about this in public and the one that's coming
00:22:44.990
out two weeks from now is also about this topic expanded a little more and in
00:22:51.500
that screencast I give a much larger example that I can't really give here but I show you the Twitter client and how its arranged and how how the
00:22:58.190
different parts of the system are segregated in this way so with that
00:23:03.230
thank you guys very much for listening to me for half an hour
00:23:16.930
that actually went way faster than I expected so I would be happy to take
00:23:24.050
comments or questions or yeah do you think there's any
00:23:29.740
useful distinctions besides the functional bit between like a ports and
00:23:35.860
adapters architecture right that's a wonderful question the question is about the relationship to ports and
00:23:41.680
adapters or hexagonal architecture or these kinds of things yeah so if you're building a large
00:23:50.890
system that's going to be 30,000 lines of code you don't want to have one functional core and one imperative shell if you ask a Haskell programmer about
00:23:58.210
doing this they will tell you that it just becomes a nightmare I think that the ideal large system is actually
00:24:04.000
many smaller systems built in this sort of way you have the functional pieces you wrap them in a
00:24:10.270
layer of scar tissue to interface them to the nasty outside world and then you build a bunch of those that communicate
00:24:15.940
in destructive ways does that answer the question
00:24:22.390
sure there's no adapter in that explanation but it's sort of the
00:24:29.050
adapters are the stars exactly that's true I guess yeah to some extent
00:24:34.540
adapter fair observation yeah over on the side
00:24:44.910
the question is have I found success in using actors with
00:24:50.040
Ruby the answer is no I have so this
00:24:56.610
this Twitter client that I wrote as I was figuring this out does use
00:25:01.680
the actor model but it's just threads and queues I just built a little actor library it's like 35 lines of code a
00:25:07.500
simple actor library is easy a more complex one I see diminishing returns if
00:25:13.500
your VM isn't built for it you can't spawn half a million processes in Ruby your machine's just going to
00:25:19.170
explode into smoke so use Erlang yeah
00:25:30.770
paradise bringing in other gems of libraries the same time like let's see a
00:25:37.100
traditional right
00:25:42.549
so the question is how suited would a Rails app be to this style of development the answer once again is no it's not
00:25:51.470
going to work very well you could I mean it depends on how large your Rails app is the thing about a Rails app is if
00:25:58.159
your Rails app is a hundred thousand lines you don't have a Rails app you have ninety five thousand lines of your
00:26:03.860
application and you have five thousand lines of Rails glue code and probably what you've done is dumped those ninety
00:26:09.500
five thousand lines into models controllers and helpers and failed to actually design your system if you have
00:26:16.429
designed a system and treated Rails as a small component of it that you want to mostly protect yourself from then you
00:26:22.549
might be able to do this but to be honest I've never even thought hard about how you would do that I guarantee
00:26:30.320
it's possible but you're not going to transition your large Rails app into this easily by would be like we do
00:26:37.970
letters a lot written software and
00:26:51.090
you if with the imperative shell wrapped
00:26:57.159
around the functional core you can do whatever you want out there right so you can use I mean like my Twitter my
00:27:02.559
Twitter client uses tons of gems well not tons it uses like six or eight gems normal gems that are just you know
00:27:10.210
work like anything else and they're imperative in nature as OO programs and Ruby programs tend to be
00:27:15.220
and I just put them out in the scar tissue layer and I let that be as big as it needs to be to reasonably allow me to
00:27:22.270
use it and then in the functional core it doesn't have to see any of that stuff this is exactly
00:27:30.490
the difference between just thinking about that sort of FauxO style programming the functional OO style and then actually adding the imperative
00:27:38.110
shell the imperative shell is what allows you to build real software that actually does work in this way so when
00:27:47.470
you give the functional example you create Larson
00:27:54.750
and sort of a true functional standpoint right just returned data sure you feel about
00:28:02.020
the next level like you know there's nothing special on walrus that's stomach return data
00:28:11.550
and so you know like how do you feel like going next level
00:28:17.390
so I missed the last sentence going even more functional I guess
00:28:22.490
well I wouldn't consider changing from returning walruses to returning stomachs as more functional unless I'm
00:28:30.020
misunderstanding right well if you look at the
00:28:42.230
code I used the word walrus but really there's nothing especially walrus-y about the code you could replace
00:28:47.630
that with animal and it wouldn't know the difference right it just knows that there is a stomach key and there is inside of the stomach is an array of
00:28:54.260
various foods so it's not tied to the walrus nature of the walrus the
00:29:04.460
user has lost right the class of the boys it's not well no I never mentioned
00:29:12.500
a walrus class I could have used the word animal it would have been the same thing right as long as it as long as it has a stomach that code will work on it
00:29:18.470
I just used walrus to make it more concrete give it like whether it's how you build up the wing just for 30 days
00:29:24.590
eg if values are the neighbors then like different values or boundaries
00:29:33.550
objects are data if all the methods on an object are pure
00:29:39.560
functions then the object is data and it's indistinguishable from a struct that has everything early
00:29:45.560
bound right late binding only matters in a system with mutation in it this
00:29:53.210
is why for example Haskell is lazy well Haskell is weird yeah I don't know
00:30:00.620
how else to say that I feel like I'm failing to understand some part of your question okay yeah
00:30:25.310
yeah so well if you go if we go back to the place where I actually did that where I merged way back here there it is
00:30:34.050
this was actually the functional example right if you look at the functional OO example I just did Walrus.new which
00:30:41.730
is a little more natural there's not an easy way to say I want a new object with only this field changed because Ruby's
00:30:46.800
not designed for that but that is easier to build in than it would be to replace all your core types the nice
00:30:55.440
thing about the Ruby core types is that the really scary things have bangs
00:31:00.480
on them usually the mutation it's not true for like delete but the names
00:31:05.850
are usually very obvious that they're mutating or they have a bang on them I've actually not found a problem maintaining functional data
00:31:14.640
structure manipulation code in Ruby your mileage may vary yeah in the back
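One way to get "a new object with only this field changed" in plain Ruby is sketched below. This is a minimal illustration under assumptions: the `Walrus` class body and the `with` helper are hypothetical, not the code from the talk, and the helper is hand-written rather than something Ruby's core classes provide here.

```ruby
# Hypothetical sketch of a value-style object; the class body and the
# `with` helper are assumptions made for illustration.
class Walrus
  attr_reader :name, :stomach

  def initialize(name:, stomach: [])
    @name = name
    @stomach = stomach.dup.freeze # private frozen copy of the array
    freeze                        # no instance variable can be reassigned
  end

  # Build a copy of this object with only the given fields replaced.
  def with(**changes)
    self.class.new(**{ name: name, stomach: stomach }.merge(changes))
  end

  # Pure method: returns a new Walrus instead of mutating this one.
  def eat(food)
    with(stomach: stomach + [food])
  end
end

hungry = Walrus.new(name: "Bob")
fed = hungry.eat("cheese")
```

Here `Array#+` is the non-mutating operation; its mutating counterparts (`<<`, `push`, `concat`) would raise on the frozen array, which makes accidental mutation loud rather than silent.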
00:31:30.010
despite your cultural important there's a whole bunch of these exactly
00:31:37.320
what they're doing it certainly could you have to spend
00:31:44.990
what I've found is that the choice of which classes you have in the core is
00:31:50.400
extremely important the names of them and the way that the responsibilities are divided up so actually I could pull
00:31:56.400
up part of the Twitter client and show you guys a larger example let's see wait
00:32:05.190
where am I yep so for example the cursor this is a piece of
00:32:12.390
the functional core it has state that includes the tweets a list of tweets and then selection is the currently
00:32:18.510
selected tweet all right so this encapsulates all the behavior of the cursor actually why is my keyboard not
00:32:24.300
working part of some of this is gross like it's actually quite a large class this is one of the largest classes I've written since I started programming Ruby
00:32:30.780
it's almost a hundred lines but that's because it's really like a very small module it's like a very
00:32:38.430
small module of functional code it's just sort of self-contained and then if we look at the actual imperative shell
00:32:44.940
this is the entire shell it's 153 lines is that what that says can you guys read that
00:32:50.210
there sorry about that so let's see where cursor dot something with tweets
00:32:59.070
no starting at index the shell is sometimes a little bit awkward here we
00:33:04.080
go so here is the cursor actually being manipulated when you hit J it just reassigns the current cursor in the
00:33:10.710
shell to the result of doing cursor down and if we look at cursor dot down all it does is construct a new cursor so the
00:33:18.630
fact that I chose cursor to be one of the boundaries in the functional core is very important if I had had a
00:33:24.480
tweet list and then was maintaining a selection separate from that this would have been awful it's very important to
00:33:30.240
find those boundaries that make very small cohesive functional components but not too small I mean I showed you like
00:33:36.540
three line examples in the talk but that's because it's a talk really you want pieces larger than that but smaller
00:33:42.600
than a whole subsystem does that answer your question at all that was Chuck right yeah I can't see but I can hear
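The cursor arrangement described in this answer can be sketched roughly as below. It is an illustration under assumptions, not the Twitter client's actual code: the method names, the clamping behavior at the ends of the list, and the key-handling loop are all hypothetical, and the sketch assumes a non-empty list of tweets.

```ruby
# Hypothetical sketch of a cursor value in the functional core. Method
# names and clamping behavior are assumptions; a non-empty list is assumed.
class Cursor
  attr_reader :tweets, :selection

  def initialize(tweets, selection = 0)
    @tweets = tweets.dup.freeze
    @selection = selection
    freeze
  end

  # Moving the cursor constructs a new Cursor; nothing is mutated.
  def down
    Cursor.new(tweets, [selection + 1, tweets.length - 1].min)
  end

  def up
    Cursor.new(tweets, [selection - 1, 0].max)
  end

  def selected_tweet
    tweets[selection]
  end
end

# Imperative shell: the only mutation is reassigning the `cursor` variable.
cursor = Cursor.new(["first", "second", "third"])
keys = ["j", "j", "k", "j", "j"] # stand-in for keys read from the terminal

keys.each do |key|
  case key
  when "j" then cursor = cursor.down
  when "k" then cursor = cursor.up
  end
end
```

Keeping the tweet list and the selection together in one value means every movement yields a consistent pair, which is the benefit claimed for choosing cursor as a boundary.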
00:33:53.760
hidey-ho
00:34:00.820
yeah that's the hard part I mean that's always the hard part right but separating separating things that do
00:34:06.970
mutation from things that don't gives you a starting point and it's the best starting point I've found it's not an absolute rule but if you start there as
00:34:13.750
opposed to some other arbitrary rule I've found much better results for design
00:34:19.470
other questions there is no library the Twitter client
00:34:27.220
is not online because I stopped working on it because it turned out that Twitter's evilness is growing much like
00:34:32.380
test run time of an integration suite and I lost confidence that I should
00:34:38.170
build software that interacts with it sorry Twitter employees I assume there's something here yeah so nothing oh if
00:34:43.240
it's not sorry fair enough fair enough
00:34:51.060
at least it scales okay pretty much new
00:34:58.990
objects while you see because you like to start as well yeah I
00:35:04.680
mean the Twitter client doesn't really have many performance concerns I mean it does when it comes up it's
00:35:12.180
sorting through thousands of tweets it remembers everything all the way back and so it has to do a merge of like what it has versus what it sees from the API
00:35:18.390
but it's not doing anything really big on MRI your life may not be
00:35:24.509
especially good if you're doing tons and tons of allocation if you're on the JVM it's much better right and if you're in
00:35:30.569
a VM that's designed to have constant object creation and destruction it's going to be even better than that a VM
00:35:36.900
designed for functional programming I would guess that the Erlang VM would handle this very well for example
00:35:42.180
because in Erlang you're constantly making small objects and letting them be freed so yes doing this on MRI if
00:35:50.880
you have performance concerns is probably going to be a little difficult but you can do certain types of caching
00:35:57.480
right if everything is a value and immutable you can always cache things because they don't change so there's
00:36:02.880
there are ways to work around the unfortunate nature of your VM I saw a hand back there yeah what's the
00:36:10.259
what's the biggest thing you've built using this style and do you have any concerns that as it gets
00:36:17.079
big the ability to organize those both
00:36:22.450
good questions what's the biggest thing I built and do I have concerns about scaling this into larger projects the biggest thing I built is the Twitter
00:36:28.210
client it's not that big it's about 600 lines and I would not be up here talking about this if that were why I thought
00:36:34.690
this is good the reason I think that this is good is that it has
00:36:41.710
shades of both the actor model built into it the idea of functional pieces
00:36:47.890
that are communicating by passing values back and forth and it also is a lot like
00:36:53.589
the Haskell idea of using the IO monad to encapsulate state which is a
00:36:58.750
wonderful idea that scales wonderfully up to about 500 lines of code and then everything falls apart right you look at a 20,000 line Haskell
00:37:05.890
program that does a lot of IO and you're not going to like life this is why I say I think that the larger
00:37:10.990
program is smaller ones built in this way communicating via channels external to the process but what I'm
00:37:19.180
really trying to do is merge this idea of actors merge this idea of the IO monad and bring them into the OO
00:37:25.780
world using our terminology right I didn't talk about monads I only talked
00:37:30.849
about actors at the end as an example I'm trying to rephrase that stuff in terminology that we use so that it seems
00:37:37.780
more directly accessible but to get back to your question about about larger
00:37:43.060
systems some of the largest most well some of the most reliable large systems in the world are written in Erlang and
00:37:49.119
probably most of the reliable large systems in the world are written in Erlang lots and lots of nines
00:37:55.319
not like Twitter's three nines we're talking about like eight nines right and the fact that they can build large
00:38:02.140
systems that are that reliable using the actor model even not even knowing what those words mean tells you that there's
00:38:08.950
something there right so that was a long-winded answer to a very simple
00:38:14.260
question yeah very uh if there's an approach that you might recommend
00:38:19.320
when you're creating a new Rails app and let's say they were hitting a user model that was subclassing ActiveRecord::Base
00:38:26.060
is there an approach that one might take to try to expand it with the techniques you're talking about and isolate right
00:38:34.700
what sorry if you're building a new Rails application and you're doing
00:38:40.290
things like you have a user that subclasses ActiveRecord::Base how do you go about doing this I haven't
00:38:47.310
gotten that far yet I have opinions about how you should be building that application but they don't involve this
00:38:52.380
that's a different talk called deconstructing the framework but yeah
00:38:58.320
it's not clear to me yeah give me give me a year or two others yeah deal with
00:39:05.640
the case where the extraction starts exactly be in the sweeper capacity
00:39:11.760
featured at all but they turned out the database is really fast that yep one of the reasons that my
00:39:20.819
talks tend to take half as long as when I practice is I forget to give all the qualifications like for example you
00:39:26.910
don't want to actually do that right you don't want to call User.all then filter in Ruby most of the
00:39:36.180
complexity of your application is not database querying right I mean there's plenty of querying in a complex app but
00:39:41.999
it is not 50% of your application it's a fairly small percentage and I think that probably if
00:39:50.279
you're using Postgres or MySQL or SQLite that goes in the shell if you're using something like
00:39:56.249
Datomic which is a database where everything is immutable that can go in the functional core it's just data
00:40:02.339
structures Datomic is just data structures so it depends on the nature of your database and the more your
00:40:08.400
components are designed to work in this way the more can move into the core but it doesn't mean that pieces it doesn't
00:40:13.799
mean you can't do this if you have Postgres it just means Postgres has to be relegated to the scar tissue which I
00:40:20.069
think is fine 80% functional is a heck of a lot better than 0% you don't have to get to 99% yeah front row keeps you
00:40:34.940
what keeps me in Ruby inertia to a small
00:40:40.490
extent also I just don't like any of those languages I have this this problem
00:40:48.470
where I can't I can't not care about syntax I really like syntax and I've
00:40:55.069
written I wrote a lot of Lisp in college and I just never really enjoyed it that much Python and Ruby are what I like
00:41:02.000
syntactically this is why I want to go live on a cruise ship and write a new language how are we doing on time it's
00:41:10.970
3:10 we can do a couple more I guess yeah
00:41:19.109
ways to fix these problems contribute to
00:41:24.329
Rubinius for about a year until you know how it works and then fork the Ruby language I'll tell you how to do it I
00:41:31.440
mean you want persistent core types all right you want core types that are designed to be used in this way
00:41:37.650
and from that most of this will fall out pretty naturally you probably
00:41:43.589
want actors and lightweight processes and you're going to have to build a userland scheduler but it's not that hard that's what Erlang has and if you have
00:41:50.249
a userland scheduler with lightweight processes if you can fork ten thousand or a hundred thousand processes easily and
00:41:55.259
you have immutable core types you're most of the way towards doing this ninety-nine percent of the time or
00:42:00.690
ninety-five percent of the time yeah back right RT sort of ask the question
00:42:06.660
before is someone who's ah very well acquainted with Eagles Twitter guys I
00:42:13.440
would still encourage you to go execute this is an example this style
00:42:20.640
yeah the question is why won't you answer my question no that's legitimate
00:42:28.560
I do plan on putting this up eventually even though I kind of am not
00:42:33.990
happy about Twitter I just I struggle with the idea of encouraging people to write software that interacts with something I don't like versus
00:42:40.010
demonstrating something that I think is good so yeah also it's a little bit
00:42:45.780
embarrassing like the shell is not actually tested at all there are zero tests around it even though it's 250 lines long which I think will give people the
00:42:52.590
wrong idea I mean I have reasons that I did that but they're very hard to articulate in like a README that anyone
00:42:58.080
will actually read so I'm a little torn about encouraging bad things middle right
00:43:04.340
I think that this is a little bit more of a minute but since I'm interested in a lot of
00:43:10.780
your ideas here I just wonder if you look at array languages and their
00:43:16.090
approaches to concurrency right versus thinking about these are thread levels and I'm wondering because I have a
00:43:22.660
thinkable idea working for the long part from future
00:43:27.970
concurrency just wondering what your needs are right so the question is have
00:43:33.130
I looked at array languages and and dude believes that thinking explicitly about threads or I assume you mean processes
00:43:38.980
as well like any kind of explicit yeah thread of control thinking about those explicitly is not the right long-term thing
00:43:46.210
the first answer is no so that's easy I mean I'm familiar with like J and all those languages I don't
00:43:52.180
actually know any of them I've seen small snippets but I don't understand them to the second part about
00:43:59.770
threads and arrays or threads and processes not being the right primitive I guess would be the word right the
00:44:05.290
right primitive to build on I'm not convinced that that's actually true I'm not convinced that they're the
00:44:11.260
wrong thing I assume the alternative you're thinking of is things like a parallel map right like like implicit
00:44:16.300
parallelism that if you're still writing
00:44:22.030
sequential programs that just have parallel pockets whereas in the actor model everything is inherently parallel
00:44:28.720
I mean if it's even remotely reasonably decomposed right as long as you don't have one process that's doing a ton of work so I'm not convinced
00:44:35.980
that threads and processes are wrong well I'm convinced that threads are wrong if you're sharing the state but I'm not convinced that independent
00:44:42.460
threads of control independent processes of control are the wrong thing yeah we
00:44:49.210
thought about writing your twitter application using a more open protocol like
00:44:55.720
writing the Twitter app against a more open protocol like OStatus I guess I
00:45:01.790
could doesn't sound very interesting that's the problem I already wrote it
00:45:08.180
once I don't want to write it again maybe I'll put it on GitHub and I'll accept pull requests that put it on a more open
00:45:15.050
protocol I think yeah that it is time thank you guys very much