00:00:14.000
that's the title taxonomy of ruby calls uh first slide got to feed my ego so
00:00:23.359
this is me i'm just a guy mostly a guy i became a ruby committer in
00:00:30.760
2019 uh and i worked on YJIT since day
00:00:36.040
zero and i have a little japanese skit here uh sorry if you don't speak
00:00:42.960
japanese don't worry you won't be missing anything important
00:01:16.720
so why have this talk um the title and
00:01:24.479
this slide are the only place i will mention the word taxonomy since i
00:01:30.640
don't want to pretend that this is more serious than it is it's just uh a bunch of groupings of different calls i came
00:01:38.079
up with um i like the animal analogy because the calls are like abstract
00:01:44.840
animals with unique behaviors so welcome to the zoo i hope you
00:01:51.320
enjoy um since calls are everywhere in ruby code language changes usually end
00:01:58.399
up touching them and there's been at least one
00:02:03.680
change to call features in the past six years of ruby
00:02:09.920
releases um and i hope that these categories give you
00:02:15.400
a unique view into ruby semantics they may help you understand
00:02:21.760
language changes or show you corners of the language you didn't know about for
00:02:27.680
practical development purposes you can use these to come up with uh test cases
00:02:35.000
um and both the process of making categories and the categorizations that
00:02:41.280
uh come up in the end will help with finding bugs so my categorization is based on
00:02:48.319
the nature of the work required to pass arguments in the call um in summary go
00:02:56.720
past the parsing phase also go past the method definition
00:03:02.959
phase this is what happens when you're making the hop from caller to the callee
00:03:08.159
and i'll be assuming that we know both the caller and the callee so the
00:03:13.519
starting calling side and the side receiving the arguments um there's some
00:03:18.800
subtleties in finding where the call should land but that's for a different talk um argument
00:03:26.840
passing now obviously using different features at a call site will do
00:03:32.159
different kinds of work but who you are calling also matters a lot in this foo
00:03:39.560
example the version of foo that takes a rest parameter requires an allocation of
00:03:45.120
an array but the second one doesn't but the caller side is the same in both
00:03:50.239
cases now if we analyze the body of the method and see that it doesn't use a
00:03:57.120
parameter we could say that oh if you're not using it why pass it there's no work required for passing the argument also
00:04:04.319
depending on the usage you can work backwards to optimize the exact way you're passing the argument but these
00:04:11.040
are somewhat second order effects um and for most of this talk i'm just putting
00:04:17.280
the parameter list here it's incomplete ruby code but it will help us focus a
00:04:24.560
lot of this is c ruby specific so when i say something happens just know that the whole talk is a mix of implementation
00:04:31.919
concerns and language semantics discussions now to make a call pair you
00:04:39.360
would pick a feature you want on the caller side and on the callee side like in the sandwich shop you can pick multiple
00:04:47.040
features from each side so you have a combinatorial explosion you can do the exact
00:04:53.040
counting exercise to find the exact number of combinations possible but it's not quite the power set since some
00:05:00.560
features are mutually exclusive making this table was a good exercise
00:05:06.960
but um i realized that just like picking combinations off of it wouldn't be super
00:05:14.360
engaging but luckily this isn't the full story uh the features you use are sort
00:05:21.600
of the syntactic form of the call pair but at runtime depending on the nature
00:05:27.280
of the things you're passing a lot of things can happen um so many features
00:05:33.919
work differently depending on the actual arguments you pass i picked splat *args here um because there's a lot
00:05:43.120
in there when args isn't an array there's a whole conversion protocol that happens that sometimes involves calling
00:05:50.240
into a separate method so it's like you get two for one uh when you call method
00:05:56.639
also what um what's inside of args also matters because sometimes you need to
00:06:02.479
check the last element in the array you're splatting just in case it's a
00:06:08.479
ruby2_keywords hash to turn into keyword arguments um we care about exactly what
00:06:15.120
happens in these data variant features because this is a way to find
00:06:21.520
optimization opportunities for example you can optimize
00:06:28.160
um by specializing argument passing for each known call pair so this is some
00:06:34.880
excerpt of argument passing code having to do with splat don't worry too much
00:06:40.560
about actually parsing and understanding the code the point is just that it's doing a lot of checks in there about okay
00:06:48.160
what is this caller using in terms of features what is this callee doing in terms of features uh in the parameter
00:06:54.800
list if we specialize based on one specific call pair you don't have to do
00:06:59.840
any of these checks anymore or you can think of it as bundling up all the checks into
00:07:05.160
one um and if you do this reduction the code gets simplified into something like
00:07:12.280
this calls are fairly static in ruby programs because it's what you write and
00:07:19.199
despite all the meta programming features and dynamic features of ruby
00:07:24.479
uh most of the time the code stays static throughout your runtime and this
00:07:30.639
transformation removes the static parts leaving only the runtime checks on the
00:07:36.800
inputs in this case it's checking the contents of the splat array
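A minimal sketch of what that leftover check on the splat array is looking for, assuming Ruby 3.0 or later. The class and method names (Relay, target, delegate, capture) are made up for illustration and are not from the talk. A hash collected by a ruby2_keywords method carries a flag, and a later splat has to look at the last element of the array to see whether that flag is set.

```ruby
# Hypothetical names: Relay, target, delegate, capture.
class Relay
  def target(*args, **kwargs)
    [args, kwargs]
  end

  # ruby2_keywords flags the hash that collects the incoming keywords.
  ruby2_keywords def delegate(*args)
    target(*args) # splat: the last element is checked for the flag
  end

  ruby2_keywords def capture(*args)
    args
  end
end

relay = Relay.new

flagged_args = relay.capture(1, k: 2)
is_flagged   = Hash.ruby2_keywords_hash?(flagged_args.last)

via_flagged = relay.delegate(1, k: 2)      # flagged hash turns back into keywords
via_plain   = relay.target(*[1, { k: 2 }]) # plain hash stays a positional argument
```

Here via_flagged ends up with one positional argument and one keyword, while via_plain ends up with two positional arguments and no keywords, even though both calls splat an array whose last element is a hash.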
00:07:42.120
the reduction sort of leaves behind the essential work for argument passing for
00:07:47.680
the specific thing you specialized on ruby2_keywords checks are left over
00:07:53.759
and that shows that every splat call has to check the end of the array just in
00:07:59.680
case there's a keyword hash in there ideally when you don't use a feature you
00:08:04.879
don't pay the speed penalty for it but that's not the case for ruby2_keywords
00:08:10.479
even if you don't have any mention of ruby2_keywords anywhere in the entire ruby code that you're running we still
00:08:18.240
have to do this check but anyways you need a pretty complex jit compiler to do this
00:08:23.960
transformation mechanically and this is not a jit talk this is just a way to tease out semantics out of the
00:08:31.159
implementation and i'm using it to come up with my categorizations uh by the way there's a mechanism in the
00:08:37.959
interpreter called fast path that basically uses this it caches a
00:08:44.480
function pointer that does argument passing logic specialized for each call pair and the body of each of these
00:08:52.600
functions is more or less the output of one kind of these
00:08:58.360
reductions the rest of this talk is just categories i made up the first one is sort of the base case this does no work
00:09:05.120
for argument passing per se uh sasada-san designed the vm so that callers compute
00:09:11.519
arguments and put them on the postfix notation stack in the exact layout
00:09:17.760
required parameters want um and this happens before the actual call on the
00:09:24.160
caller side when the call happens the arguments are already in the right place
00:09:29.279
so there's no need to move them or copy them anywhere you can see that there's
00:09:34.640
some extra metadata on top of the arguments but one two three don't move
00:09:41.360
the metadata turns temporary values on the caller side into local
00:09:48.480
variables on the callee side without actually moving them i'm using simple
00:09:53.839
integers for this example but this works for all expressions you can come up with
00:10:00.000
to compute arguments this is a somewhat fundamental protocol the whole vm is designed
00:10:06.440
around these are the simplest kind of parameters you can have so it makes
00:10:12.080
sense to have them fast and efficient uh sometimes we have to move stuff though
00:10:18.880
um that means the postfix notation stack can be smaller after the call than
00:10:25.120
before the call there are quite a few features that can do this and i listed some examples up
00:10:31.600
there basically anything that requires discarding of arguments like
00:10:38.160
unpacking an empty array or gathering multiple arrays into one object and um
00:10:44.959
can do this we always want to know the stack size after the call because you need to
00:10:51.200
do stack overflow checks but also there are cases where YJIT needs special
00:10:56.880
handling for shrinking computations in the vm always use
00:11:02.279
values at the top of the postfix notation stack as input this is just how
00:11:08.640
vm operations are defined and YJIT follows this uh while handling calls so once
00:11:16.079
it's done with an argument it wants to pop it off of the top of the stack and move on to other arguments which again
00:11:23.040
are at the top of the stack calls that pass a block
00:11:28.240
argument have it at the top of the stack so it's the first thing that YJIT has to
00:11:33.519
deal with now one way to handle it is to predict where the block argument will go
00:11:41.279
pop it off the stack move it into the right place you're done um and that's what happens on the left
00:11:48.880
but sometimes this won't work if the stack needs to shrink because the final
00:11:56.640
destination of the block argument is where another argument is so in this
00:12:03.240
case um when the stack shrinks too much on
00:12:08.399
the right 1 2 3 gets gathered into an array for the rest parameter that ends
00:12:13.920
up where one is on the right side um but the final location of block
00:12:21.920
arg is where three is so we can't move it as the first step because we still need three after we're done with block
00:12:28.519
arg so to handle this properly YJIT needs to tuck away block arg while
00:12:34.639
processing other arguments um it needs some temporary storage in addition to the stack and or
00:12:43.519
it needs to move away from this uh typical model of always taking from the
00:12:49.519
top of the stack as input well actually neither of the cases
00:12:56.079
work actually moving up past the top of the stack is incorrect in all situations
00:13:01.360
and YJIT has a bug in there i found while writing the slide um the gc doesn't mark
00:13:07.480
values above the top of the stack so if we yield to the gc after the move the block
00:13:13.200
argument will be collected um maybe YJIT needs to move away from using the stack as scratch space during calls to
00:13:20.639
handle this properly next i have a category that's the polar opposite
00:13:28.480
in terms of complexity uh when you use splat keyword
00:13:33.839
splat and block arguments with & they
00:13:38.959
each expect array hash and proc the classes
00:13:46.880
um when the argument isn't the right type we try to convert it by calling a
00:13:51.920
conversion method there's some differences between the three conversion protocols um but
00:13:58.880
some things are the same across the board when you pass nil with these features you end up
00:14:06.079
passing nothing nil's special status is actually relatively new block arguments treated it
00:14:14.160
specially at first it was an error for keyword argument splat um and ruby 3.4
00:14:22.720
made it work for keyword splat and in the next ruby release it
00:14:28.639
should work more reliably for array splat right now the pass nothing effect
00:14:35.040
for a single star nil relies on getting an empty array back from calling to_a on
00:14:42.360
nil so you can redefine to_a on NilClass to break it the next version won't have
00:14:50.880
this extension point because it doesn't seem very useful i have
00:14:56.160
a trick maybe always discard nil without calling anything now when we do call the
00:15:04.000
conversion method that can fail the method might not exist it might return
00:15:09.040
the wrong type usually we end up raising TypeError but array splat is special when
00:15:16.480
the conversion fails you pass the argument as if you
00:15:22.000
didn't do any splatting at all it's like if you edited the source code and you just remove that star so and this effect
00:15:30.560
falls out of um two steps first it tries to do the conversion and it
00:15:36.959
fails and to handle that error it wraps the object that failed to convert in a single
00:15:44.720
element array and then like usual splat
00:15:50.000
situations it unpacks the array so you're wrapping the array and unpacking it you end up with what you started with
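A small sketch of that wrap-then-unpack effect, using a made-up helper named show that just returns the positional arguments it received. Range responds to to_a and Integer does not, so the two splats behave differently. Kernel#Array is included for comparison.

```ruby
# Hypothetical helper that reports the arguments it receives.
def show(*args)
  args
end

with_to_a    = show(*(1..3)) # Range responds to to_a, so it gets unpacked
without_to_a = show(*42)     # Integer has no to_a, so 42 is passed as is
no_splat     = show(42)      # same call with the star removed
via_kernel   = Array(42)     # Kernel#Array wraps the same way
```

With these definitions without_to_a and no_splat are equal, which is the "as if you removed the star" behavior.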
00:15:56.560
in the beginning somewhat of an odd feature but maybe it's not fully
00:16:02.079
intended for use it's just a side effect of being consistent with the capital a array method on kernel which also does
00:16:11.120
this when the conversion fails it wraps in a one element array now on to block
00:16:19.079
arguments and refinements um the & symbol shorthand is pretty well
00:16:24.560
known it's useful when you need a block that takes one argument and just calls a
00:16:31.279
method on the argument instead of writing out the block you can pass the name of the method that you want to call
00:16:38.320
using & instead i'm using downcase in this example and the two lines give the
00:16:44.720
same result as you can see great i used to explain block
00:16:51.360
argument conversion as oh it's just calling to_proc and using the proc from that and you can call procs and you know
00:16:58.000
it's all great um so for symbols we could redefine the to_proc method
00:17:05.679
on symbol as something like this it's returning a lambda that takes one parameter and calls the method with the
00:17:13.120
same name as the symbol now my Symbol#to_proc implementation
00:17:19.439
does work and if you plug it in for this example you get the exact same result as
00:17:25.039
if you didn't override the definition but what if you have
00:17:31.280
refinements uh we're in a scope that redefines Kernel#itself inside the
00:17:37.840
refinement scope um and the usual
00:17:44.320
definition of itself just returns the receiver so the output is identical to the input but in this redefined
00:17:50.960
definition we get one one instead because i have itself returning one inside the refinement scope um just
00:17:59.440
like the downcase example the symbol block argument shorthand behaves exactly
00:18:04.640
the same as if we wrote out the block by hand
00:18:10.880
it maintains this parity
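A rough reconstruction of that parity, not the actual slide code. The module names ItselfIsOne and RefinedScope and the constants are assumptions made for this sketch. The refinement is activated inside a module body so that its scope is easy to see.

```ruby
# Hypothetical refinement that makes itself return 1.
module ItselfIsOne
  refine Kernel do
    def itself
      1
    end
  end
end

module RefinedScope
  using ItselfIsOne # active until the end of this module body

  WRITTEN_OUT = [0, 0].map { |x| x.itself } # block written by hand
  SHORTHAND   = [0, 0].map(&:itself)        # symbol block argument
end

# Outside the refinement scope, the usual Kernel#itself applies.
UNREFINED = [0, 0].map(&:itself)
```

Inside RefinedScope both forms see the refinement and give the same result, while the same shorthand outside the scope returns its input unchanged.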
00:18:15.960
but um if you use the reimplementation of Symbol#to_proc that's going
00:18:23.280
to break it because the public_send is in a different lexical scope so it won't
00:18:31.200
see that it needs to use the refinement down there um and it might actually be impossible
00:18:39.200
to reimplement Symbol#to_proc in ruby because there's no way to get lexical
00:18:45.280
scopes as objects in the language now here's some more evidence
00:18:50.480
that symbol block arguments do more than just call to_proc um we can call to_proc on a symbol
00:18:57.520
and get an identity function proc it just returns whatever you pass it zero in this
00:19:05.320
case now let's have the same redefined definition of itself
00:19:13.400
again we can call the proc created outside of the refined scope inside the
00:19:20.320
refined scope and the refinement does have influence so we have one we get one
00:19:25.919
back um so far you can still explain this as & calling to_proc right
00:19:32.400
um so it
00:19:39.880
seems yeah now let's try the opposite and call
00:19:46.799
to_proc inside the refinement scope and then use it outside this also seems to
00:19:54.240
indicate that & is just calling to_proc uh because outside
00:20:01.840
of the refinement scope there's no effect and so finally if & is just
00:20:10.400
calling to_proc then explicitly calling to_proc ourselves should behave
00:20:16.400
the same as if we use & so here i used & and captured the proc
00:20:23.600
object saved it into a constant and then i used the proc outside of the refinement scope so
00:20:31.120
it should return the same result but obviously it won't return the same result if i i made the slide um so it's
00:20:39.919
doing something more than just calling the to_proc method is the point now in
00:20:45.600
summary conversion procedures get quite complex both when conversion succeeds
00:20:51.440
and fails um conversion happens relatively rarely though and as we saw there are
00:20:58.559
situations where it doesn't call a method like in the failed to convert
00:21:04.240
array case um passing a symbol as block argument
00:21:10.320
usually doesn't make a method call either unless you redefine
00:21:15.559
Symbol#to_proc now i want to point out that just because there are fast ways to implement
00:21:21.600
things doesn't mean that the implementation is simple the complicated
00:21:27.600
parts of the semantics don't disappear and they sometimes manifest as memory
00:21:32.799
usage or bugs uh and i found a crash while making the
00:21:38.640
example program so if you try to run it you might get a crash also i have to admit
00:21:45.039
that i'm being kind of lazy with the categorization here by lumping all these into just conversion really you can
00:21:51.840
split this down further into like array conversion keyword argument conversion
00:21:58.559
block pass conversion uh i'm running away from the work like when YJIT sees complex
00:22:05.520
situations that it can't generate good code for okay let's take a break and talk
00:22:13.200
about something simpler
00:22:21.559
so when you make a call with some arguments say 1 2 3 you would expect
00:22:27.600
that the callee gets 1 2 3 but this is not completely true in all cases
00:22:32.799
sometimes there is extra data you're passing the most basic example is when
00:22:38.080
you call a method without passing a block there's a little piece of data in there that says "oh i'm passing no
00:22:44.960
block." but this isn't exactly an object
00:22:50.480
um that the callee has access to
00:22:55.559
um when the callee has optional keyword parameters ruby needs to compute the
00:23:02.480
default value for all the keywords the caller isn't passing so for this example
00:23:08.720
we have six sets of calls to the default abc method to compute the default values
00:23:17.360
in general for n optional keywords there are two to the power of n ways to
00:23:22.559
compute the default values for each optional parameter you can
00:23:27.679
either pass it or not pass it describable with a bit so with enough
00:23:32.720
bits you can describe the status of all the parameters the magic number is 31 at the
00:23:39.280
moment but the point is a lot of the times the vm passes a hidden integer for
00:23:44.880
optionals if the callee has too many optional keyword parameters
00:23:52.799
we use a hash as a fallback but in any case there's no way to directly get the
00:23:58.080
integer or the hash so it's hidden it's set up as part of the call before
00:24:04.480
running anything in the callee and the callee then refers to that to
00:24:10.640
figure out which default computation it needs to run another category of call data that
00:24:18.640
passes hidden data is for forwardable parameters this started to pass hidden
00:24:25.919
data in 3.4 to reduce gc allocations thanks to aaron tenderlove patterson he
00:24:33.039
coined the term forwardable and forwarding basically you can define the
00:24:38.480
parameter of a method that you are able to
00:24:44.360
forward but you don't have to use it but if you do use it then you're
00:24:50.600
forwarding um forwarding should preserve the argument so it should behave as if
00:24:58.640
you didn't have the intermediary in the middle and you just called the callee directly like in the bottom
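A minimal sketch of that preservation property. The method name intermediary follows the talk, while callee and its parameter list are assumptions chosen so that positional arguments, keywords, and a block are all visible in the result.

```ruby
# Hypothetical callee that reports everything it was given.
def callee(*args, **kwargs, &block)
  [args, kwargs, block&.call]
end

# Forwardable parameter: accepts anything and forwards it untouched.
def intermediary(...)
  callee(...)
end

direct    = callee(1, 2, k: 3) { :from_block }
forwarded = intermediary(1, 2, k: 3) { :from_block }
```

Both calls produce the same result, so from the callee's point of view the intermediary might as well not be there.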
00:25:09.640
um different call pairs require different work to handle so intermediary
00:25:16.159
has each caller pass a description of the call site along with the actual
00:25:23.480
arguments internally this call site description is called call info and
00:25:29.080
includes things like the number of arguments the kind of features used usually we're done with this
00:25:36.960
information after the call but in this case the call info itself is an argument
00:25:42.400
so kind of meta um there's no way to get the call
00:25:47.760
information inside intermediary as a ruby object though so it's just hidden
00:25:53.279
data it's only used when you actually do the forwarding and by the way the top diagram has
00:26:02.039
intermediary looking like a bottleneck that the three calls have to fit through
00:26:07.520
and that's intentional um because forwardable forwarding calls
00:26:14.360
um are harder to cache than usual usually you can cache based on the call
00:26:20.400
pair but in this case you have an extra call info that's
00:26:26.159
also relevant uh to decide whether you can use a fast path so it kind of blows up your cache
00:26:34.440
key and finally i have a category for
00:26:39.919
allocation and jeremy is talking about this in the main hall so maybe for this class we should just get up and go to
00:26:45.679
the other hall but anyways i'll try not to overlap too much with jeremy here i'll start fairly abstract and work my
00:26:52.480
way back to what c ruby currently does first of all
00:26:59.799
um whether something allocates has always been up to the ruby runtime you're using ruby code works with ruby
00:27:07.600
objects but the code doesn't have explicit control over the in-memory representation and the lifetime of the
00:27:13.840
memory there's no way to immediately free a ruby object for
00:27:20.360
example um now in in this example i have a call
00:27:26.799
pair that involves a rest parameter but the body of foo is a
00:27:33.960
mystery i want to know whether it's necessary to allocate an object for this
00:27:40.000
call pair um but it it is necessary
00:27:47.799
because the body of foo can do anything including stashing away the object into
00:27:54.559
a global variable that makes it live forever so you have to allocate um now
00:28:01.760
if we know the body of foo then you can say that
00:28:07.640
well it only needs to service the first call and then we're done with the data
00:28:14.480
so the caller doesn't have to allocate an object it just has to make something that works for the first call and then it's
00:28:21.480
done but then in the final case it's passing args to a mystery
00:28:28.039
method now then again because we don't know what the mystery method can do it
00:28:33.200
can do anything and anything includes making the object live forever so again we need to
00:28:39.720
allocate um so yeah there's some analysis you
00:28:45.200
can run to pick an alternative memory management strategy other than
00:28:51.120
allocating an object up front it requires a pretty complex analysis to figure out and because you can load ruby
00:28:58.480
code at runtime this analysis also needs to happen at runtime and also as we saw
00:29:03.840
the scope of the analysis matters depending on it you get different results
00:29:10.240
um it takes quite a bit of sophistication to get a performance win out of this we try to do something for
00:29:17.880
ZJIT but anyways while on paper there are many situations where we don't have to
00:29:23.200
allocate with the gc we can't do it yet uh there's obvious situations where you
00:29:31.440
have to allocate uh where the parameter basically calls for an object like rest
00:29:36.960
keyword rest there's also some c method definition options you can use that
00:29:43.919
basically act the same but there's some surprising cases where it allocates even though in the parameter it doesn't
00:29:50.399
really call for an object like the hidden hash or integer we talked about
00:29:55.760
with uh optional keyword arguments well we reached the end we saw
00:30:02.399
how to specialize parameter passing logic to a call pair and a
00:30:08.720
forward-looking part about avoiding allocation there's also of course all the categories i'm not any kind of
00:30:15.679
authority when it comes to these categorizations um so it's pretty arbitrary i just made them up also
00:30:24.919
um so i encourage you to come up with your own categorizations the exercise is useful i
00:30:32.880
found two crashes while making them uh i hope you have fun at the zoo and thank
00:30:40.399
you