A taxonomy of Ruby calls

Alan Wu • April 18, 2025 • Matsuyama, Ehime, Japan • Talk

In the presentation titled "A Taxonomy of Ruby Calls" delivered by Alan Wu at RubyKaigi 2025, the speaker explores the intricate facets of method calls in Ruby programming, emphasizing the vast array of options available for declaring and invoking methods and blocks. The talk aims to classify different call-related features based on their implementation in the CRuby interpreter and YJIT, providing attendees with insights into Ruby's language semantics and practical development implications.

Key points discussed include:
- Introduction to Ruby Calls: Wu introduces the concept of calls in Ruby, likening them to abstract animals with unique behaviors, and discusses their prevalence in Ruby code, noting recent changes in call features over six years of Ruby releases.
- Categorization Framework: Wu offers a framework for categorizing calls based on argument-passing work. The framework moves beyond mere syntax to address the semantics of how calls function during execution.
- Argument Handling: The complexity involved in passing arguments is explored. Wu highlights that different features in methods may require different computational efforts and that recognizing these differences can aid in optimization.
- Implementation Concerns: The presentation blends discussion on implementation concerns (specific to CRuby) with semantics, showcasing how argument passing can be specialized to improve performance.
- Call Pair Examples: Wu provides practical examples contrasting call pairs that require additional work versus those that do not, facilitating understanding of Ruby's internal mechanics.
- Special Features Discussion: The talk delves into high-order effects such as passing keyword arguments, handling optional parameters, and how these functionalities influence memory management and performance during runtime.
- Final Thoughts: The speaker concludes by acknowledging that calls in Ruby require sophisticated handling and encourages developers to devise their own categorizations for a deeper understanding of Ruby's complexities.

Overall, Wu's presentation serves as both a theoretical exploration and a practical guide aimed at enhancing understanding of Ruby method calls and optimizing their utilization in development practices.

Date: April 18, 2025
Published: May 27, 2025
Announced: unknown

Calls are an essential feature of Ruby and there are an eye dazzling number of options available for declaring and calling methods and blocks. In this talk, I will explore interesting interactions between call related features and put forward a classification of them based on my experience with their implementation in the CRuby interpreter and YJIT.

https://rubykaigi.org/2025/presentations/alanwusx.html

RubyKaigi 2025

00:00:14.000 that's the title taxonomy of ruby calls uh first slide got to feed my ego so

00:00:23.359 this is me i'm just a guy mostly a guy i became a ruby committer in

00:00:30.760 2019 uh and i worked on YJIT since day

00:00:36.040 zero and i have a little japanese skit here uh sorry if you don't speak

00:00:42.960 japanese don't worry you won't be missing anything important

00:01:16.720 so why do we why why do why have this talk um the the title and the slides are

00:01:24.479 uh in this slide are the only place i will mention the word taxonomy since i

00:01:30.640 don't want to pretend that this is more serious than it is it's just uh a bunch of groupings of different calls i came

00:01:38.079 up with um i like the animal analogy because the calls are like abstract

00:01:44.840 animals with unique behaviors so welcome to the zoo i hope you

00:01:51.320 enjoy um since calls are everywhere in ruby code language changes usually end

00:01:58.399 up touching them and there's been at least one

00:02:03.680 change to call features in the past six years of ruby

00:02:09.920 releases um and i hope that these categories give you

00:02:15.400 a unique view into ruby semantics they may help you understand

00:02:21.760 language changes or show you corners of the language you didn't know about for

00:02:27.680 practical development purposes you can use these to come up with uh test cases

00:02:35.000 um and both the process of making categories and the categorizations that

00:02:41.280 uh come up in the end will help with finding bugs so my categorization is based on

00:02:48.319 the nature of the work required to pass arguments in the call um in summary go

00:02:56.720 past the parsing phase also go past the method definition

00:03:02.959 phase this is what happens when you're making the hop from caller to the callee

00:03:08.159 and i'll be assuming that we know both the caller and the callee so the

00:03:13.519 starting calling side and the side receiving the arguments um there's some

00:03:18.800 subtleties in finding where the call should land but that's for a different talk um argument

00:03:26.840 passing now obviously using different features at a call site will do

00:03:32.159 different kind of work but who you are calling also matters a lot in this foo

00:03:39.560 example the version of foo that takes a rest parameter requires an allocation of

00:03:45.120 an array but the second one doesn't but the caller side is the same in both

00:03:50.239 cases now if we analyze the body of the method and see that it doesn't use a

00:03:57.120 parameter we could say that oh if you're not using it why pass it there's no work required for passing the argument also

00:04:04.319 depending on the usage you can work backwards to optimize the exact way you're passing the argument but these

00:04:11.040 are somewhat second order effects um and for most of this talk i'm just putting

00:04:17.280 the parameter list here it's incomplete ruby code but it will help us focus a

00:04:24.560 lot of this is CRuby-specific so when i say something happens just know that the whole talk is a mix of implementation

00:04:31.919 concerns and language semantics discussions now to make a call pair you

00:04:39.360 would pick a feature you want on the caller side and on the callee side like in the sandwich shop you can pick multiple

00:04:47.040 features from each side so you have a combinatoric explosion and do the exact

00:04:53.040 counting exercise to find the exact number of combinations possible but it's not quite the power set since some

00:05:00.560 features are mutually exclusive making this table was a good exercise

00:05:06.960 but um i realized that just like picking combinations off of it wouldn't be super

00:05:14.360 engaging but luckily this isn't the full story uh the features you use are sort

00:05:21.600 of the syntactic form of the call pair but at runtime depending on the nature

00:05:27.280 of the things you're passing a lot of things can happen um so many features

00:05:33.919 work differently depending on the actual arguments you pass i picked splat *args here um because there's a lot

00:05:43.120 in there when args isn't an array there's a whole conversion protocol that happens that sometimes involves calling

00:05:50.240 into a separate method so it's like you get two for one uh when you call method

00:05:56.639 also what um what's inside of args also matters because sometimes you need to

00:06:02.479 check the last element in the array you're splatting just in case it's a

00:06:08.479 ruby2_keywords hash to turn into keyword arguments um we care about exactly what

00:06:15.120 happens in these data variant features because this is a way to find

00:06:21.520 optimization opportunities for example you can optimize

00:06:28.160 um by specializing argument passing for each known call pair so this is some

00:06:34.880 excerpt of argument passing code having to do with splat don't worry too much

00:06:40.560 about actually parsing and understanding the code the point is just that it's doing a lot of checks in there about okay

00:06:48.160 what is this caller using in terms of features what is this callee doing in terms of features uh in the parameter

00:06:54.800 list if we specialize based on one specific call pair you don't have to do

00:06:59.840 any of these checks anymore or you can think of it as bundling up all the checks into

00:07:05.160 one um and if you do this reduction the code gets simplified into something like

00:07:12.280 this calls are fairly static in ruby programs because it's what you write and

00:07:19.199 despite all the metaprogramming features and dynamic features of ruby

00:07:24.479 uh most of the time the code stays static throughout your runtime and this

00:07:30.639 transformation removes the static parts leaving only the runtime checks on the

00:07:36.800 inputs in this case it's checking the contents of the splat array

00:07:42.120 the reduction sort of leaves behind the essential work for argument passing for

00:07:47.680 the specific thing you specialized on ruby2_keywords checks are left over

00:07:53.759 and that shows that every splat call has to check the end of the array just in

00:07:59.680 case there's a keyword hash in there ideally when you don't use a feature you

00:08:04.879 don't pay the speed penalty for it but that's not the case for ruby2_keywords
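
[Editor's sketch, not from the slides] The check described here can be seen from plain Ruby. The example below assumes Ruby 3.x; `SplatDemo`, `target`, and `capture` are hypothetical names made up for illustration. It shows that two arrays with the same contents behave differently when splatted, depending on whether the last hash carries the `ruby2_keywords` flag, which is why every splat call has to inspect the end of the array.

```ruby
# Illustrative sketch; SplatDemo, target, and capture are hypothetical names.
module SplatDemo
  # Callee that reports how its arguments were split up.
  def self.target(*args, **kwargs)
    [args, kwargs]
  end

  class << self
    # ruby2_keywords flags the trailing hash when keywords are captured by *args.
    ruby2_keywords def capture(*args)
      args
    end
  end
end

flagged = SplatDemo.capture(1, k: 2) # last element is a flagged hash
plain   = [1, { k: 2 }]              # same contents, no flag

p Hash.ruby2_keywords_hash?(flagged.last) # => true
p Hash.ruby2_keywords_hash?(plain.last)   # => false
p SplatDemo.target(*flagged) # => [[1], {k: 2}]      flagged hash becomes keywords
p SplatDemo.target(*plain)   # => [[1, {k: 2}], {}]  plain hash stays positional
```

The caller side is `target(*array)` in both cases, so the difference can only be discovered at run time by looking at the array's last element.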

00:08:10.479 even if you don't have any mention of ruby2_keywords anywhere in the entire ruby code that you're running we still

00:08:18.240 have to do this check but anyways you need a pretty complex jit compiler to do this

00:08:23.960 transformation mechanically and this is not a jit talk this is just a way to tease out semantics out of the

00:08:31.159 implementation and i'm using it to come up with my categorizations uh by the way there's a mechanism in the

00:08:37.959 interpreter called fast path that basically uses this it caches a

00:08:44.480 function pointer that does argument passing logic specialized for each call pair and the body of each of these

00:08:52.600 functions is more or less the output of one kind of these

00:08:58.360 reduction the rest of this talk is just categories i made up the first one is sort of the base case this does no work

00:09:05.120 for argument passing per se uh sasada-san designed the vm so that callers compute

00:09:11.519 arguments and put them on the postfix notation stack in the exact layout

00:09:17.760 required parameters want um and this happens before the actual call on the

00:09:24.160 caller side when the call happens the arguments are already in the right place

00:09:29.279 so there's no need to move them or copy them anywhere you can see that there's

00:09:34.640 some extra metadata on top of the arguments but one two three don't move

00:09:41.360 the metadata uh turns temporary values on the caller side into local

00:09:48.480 variables on the callee side without actually moving them i'm using simple

00:09:53.839 integers for this example but this works for all expressions you can come up with

00:10:00.000 to compute arguments this is a somewhat fundamental protocol the whole vm is designed

00:10:06.440 around these are the simplest kind of parameters you can have so it makes

00:10:12.080 sense to have them fast and efficient uh sometimes we have to move stuff though

00:10:18.880 um that means the postfix notation stack can be smaller after the call than

00:10:25.120 before the call there are quite a few features that can do this and i listed some examples up

00:10:31.600 there basically anything that requires discarding of arguments like

00:10:38.160 unpacking an empty array or gathering multiple arrays into one object and um

00:10:44.959 do this we always want to know the stack size after the call because you need to

00:10:51.200 do stack overflow checks but also there are cases where YJIT needs special

00:10:56.880 handling for shrinking computation in the vm always uses

00:11:02.279 values at the top of the postfix notation stack as input this is just how

00:11:08.640 vm operations are defined and YJIT follows this uh while handling calls so once

00:11:16.079 it's done with an argument it wants to pop it off of the top of the stack and move on to other arguments which again

00:11:23.040 is at the top of the stack calls that pass a block

00:11:28.240 argument have it at the top of the stack so it's the first thing that YJIT has to

00:11:33.519 deal with now one way to handle it is to predict where the block argument will go

00:11:41.279 pop it off the stack move it into the right place you're done um and that's what happens on the left

00:11:48.880 but sometimes this won't work if the stack needs to shrink because the final

00:11:56.640 destination of the block argument is where another argument is so in this

00:12:03.240 case um when the stack shrinks too much on

00:12:08.399 the right 1 2 3 gets gathered into an array for the rest parameter that ends

00:12:13.920 up where one is on the right side um but the final location of block

00:12:21.920 arg is where three is so we can't move it as the first step because we still need three after we're done with block

00:12:28.519 arg so to handle this properly YJIT needs to tuck away block arg while

00:12:34.639 processing other arguments um it needs some temporary storage in addition to the stack and or

00:12:43.519 it needs to move away from this uh typical model of always taking from the

00:12:49.519 top of the stack as input well actually neither of the cases

00:12:56.079 work actually moving up past the top of the stack is incorrect in all situations

00:13:01.360 and YJIT has a bug in there i found while writing the slide um the gc doesn't mark

00:13:07.480 values above the top of the stack so if we yield to the gc after the move the block

00:13:13.200 argument will be collected uh maybe YJIT needs to move away from using the stack as scratch space during calls to

00:13:20.639 handle this properly next i have a category that's the the polar opposite

00:13:28.480 in terms of complexity uh when you use splat keyword

00:13:33.839 splat and block arguments the & they

00:13:38.959 uh they each expect array hash and proc the classes

00:13:46.880 um when the argument isn't the right type we try to convert it by calling a

00:13:51.920 conversion method there's some differences between the three conversion protocols um but

00:13:58.880 some things are the same across the board when you

00:14:06.079 passing nothing nil's special status is actually relatively new block arguments treated

00:14:14.160 it specially at first it was an error for keyword argument splat um and ruby 3.4

00:14:22.720 made it work for keyword splat and in the next ruby release it

00:14:28.639 should work more reliably for array splat right now the pass nothing effect

00:14:35.040 for a single star nil relies on getting an empty array back from calling to_a on

00:14:42.360 nil so you can redefine to_a on NilClass to break it the next version won't have

00:14:50.880 this extension point because it doesn't seem very useful i have

00:14:56.160 a trick maybe always discard nil without calling anything now when we do call the

00:15:04.000 conversion method that can fail the method might not exist it might return

00:15:09.040 the wrong type usually we end up raising argument error but array splat is special when

00:15:16.480 the conversion fails you pass the argument as if you

00:15:22.000 didn't do any splatting at all it's like if you edited the source code and you just remove that star so and this effect

00:15:30.560 falls out of um two steps first it tries to it tries to do the conversion and it

00:15:36.959 fails and to handle that error it wraps the fail to convert object in a single

00:15:44.720 element array and then like usual splat

00:15:50.000 situations it unpacks the array so you're wrapping the array and unpacking it you end up with what you started with

00:15:56.560 in the beginning somewhat of an odd feature but maybe it's not fully

00:16:02.079 intended for use it's just a side effect of being consistent with the capital-A Array method on Kernel which also does

00:16:11.120 this when the conversion fails it wraps in a one-element array now on to block

00:16:19.079 arguments and refinements um the &:symbol shorthand is pretty well

00:16:24.560 known it's useful when you need a block that takes one argument and just calls a

00:16:31.279 method on the argument instead of writing out the block you can pass the name of the method that you want to call

00:16:38.320 using & instead i'm using downcase in this example and two lines give the

00:16:44.720 same result as you can see great i used to explain block

00:16:51.360 argument conversion as oh it's just calling to_proc and using the proc from that and you can call procs and you know

00:16:58.000 it's all great um so for symbols we could redefine the to_proc method

00:17:05.679 on symbol as something like this it's returning a lambda that takes one parameter and calls the method with the

00:17:13.120 same name as the symbol now my Symbol#to_proc implementation

00:17:19.439 does work and if you plug it in for this example you get the exact same result as

00:17:25.039 if you didn't override the definition but what if you have

00:17:31.280 refinements uh we're in a scope that redefines Kernel#itself inside the

00:17:37.840 refinement scope um and this the the usual

00:17:44.320 definition of itself just returns the receiver so the output is identical to the input but in this redefined

00:17:50.960 definition we get one one instead because i have itself returning one inside the refinement scope um just

00:17:59.440 like the downcase example the symbol block argument shorthand behaves exactly

00:18:04.640 the same as if we wrote out the block by hand

00:18:10.880 it maintains this parity
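
[Editor's sketch, not from the slides] A reconstruction of the kind of example described here, under the assumption that the slide refined `Kernel#itself` to return 1. All module and method names (`ItselfIsOne`, `Unrefined`, `Refined`, `symbol_proc`) are hypothetical. It compares the `&:itself` shorthand, the written-out block, and a hand-rolled `public_send` lambda defined outside the refinement scope.

```ruby
# Hypothetical names throughout; a sketch of the behavior described in the talk.
module ItselfIsOne
  refine Kernel do
    def itself
      1
    end
  end
end

module Unrefined
  # Hand-rolled stand-in for Symbol#to_proc, written outside the refinement scope.
  def self.symbol_proc(name)
    ->(receiver) { receiver.public_send(name) }
  end
end

module Refined
  using ItselfIsOne

  # Shorthand form: the refinement active at this call site is respected.
  def self.shorthand
    [0, 0].map(&:itself)
  end

  # Written-out block: same lexical scope, same result.
  def self.written_out
    [0, 0].map { |x| x.itself }
  end

  # The lambda's public_send lives in Unrefined, so it does not see the refinement.
  def self.hand_rolled
    [0, 0].map(&Unrefined.symbol_proc(:itself))
  end
end

p Refined.shorthand   # => [1, 1]
p Refined.written_out # => [1, 1]
p Refined.hand_rolled # => [0, 0]
```

The first two forms agree because both are resolved against the lexical scope of the call site; the hand-rolled lambda carries the scope where it was written instead.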

00:18:15.960 but um if you use the reimplementation of Symbol#to_proc that's going

00:18:23.280 to break it because the public_send is in a different lexical scope so it won't

00:18:31.200 see that it needs to use the refinement down there um and it might actually be impossible

00:18:39.200 to reimplement Symbol#to_proc in ruby because there's no way to get lexical

00:18:45.280 scopes as objects in the language now here's some more evidence

00:18:50.480 that symbol block arguments do more than just call to_proc um we can call to_proc on a symbol

00:18:57.520 and get an identity function proc it just returns whatever you pass it zero in this

00:19:05.320 case now let's have the same redefined definition of itself

00:19:13.400 again we can call the proc created outside of the refined scope inside the

00:19:20.320 refined scope and the refinement does have influence so we have one we get one

00:19:25.919 back um so far you can still explain this as & calling to_proc right

00:19:32.400 um so it

00:19:39.880 seems yeah now let's try the opposite and call

00:19:46.799 to_proc inside the refinement scope and then use it outside this also seems to

00:19:54.240 indicate that um & is just calling to_proc uh because outside

00:20:01.840 of the refinement scope there's no effect and so finally if & is just

00:20:10.400 calling to_proc then explicitly calling to_proc ourselves should behave

00:20:16.400 the same as if we use & so here i used & and captured the proc

00:20:23.600 object saved it into a constant and then i used the proc outside of the refinement scope so

00:20:31.120 it should return the same result but obviously it won't return the same result if i made the slide um so it's

00:20:39.919 doing something extra than just calling the to_proc method is the point wow in

00:20:45.600 summary conversion procedures get quite complex both when conversion succeeds

00:20:51.440 and fails um conversion happens relatively rarely though and as we saw there are

00:20:58.559 situations where it doesn't call a method like in the failed-to-convert

00:21:04.240 array case um passing a symbol as block argument

00:21:10.320 usually doesn't make a method call either unless you redefine Symbol#to_proc

00:21:15.559 now i want to point out that just because there are fast ways to implement

00:21:21.600 things doesn't mean that the implementation is simple the complicated

00:21:27.600 parts of the semantics don't disappear and they sometimes manifest as memory

00:21:32.799 usage or bugs uh and i found a crash while making the

00:21:38.640 example program so if you try to run it you might get a crash also i have to admit

00:21:45.039 that i'm being kind of lazy with the categorization here by lumping all these into just conversion really you can

00:21:51.840 split this down further into like array conversion keyword argument conversion

00:21:58.559 block pass conversion uh i'm running away from the work like when YJIT sees complex

00:22:05.520 situations that it can't generate good code for okay let's take a break and talk

00:22:13.200 about something simpler
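
[Editor's sketch, not from the slides] The array-splat conversion cases from this section can be tried directly. The names `ConvertDemo`, `show`, and `Pair` are hypothetical. Only array splat is shown; keyword splat of nil is left out because, as noted above, its behavior depends on the Ruby version.

```ruby
# Hypothetical helper names; only array splat is exercised here.
module ConvertDemo
  # Callee with a rest parameter so we can see what actually arrived.
  def self.show(*args)
    args
  end

  # An object that is not an Array but offers the to_a conversion method.
  class Pair
    def to_a
      [:x, :y]
    end
  end
end

p ConvertDemo.show(*nil)                   # => []        nil passes nothing
p ConvertDemo.show(*ConvertDemo::Pair.new) # => [:x, :y]  converted by calling to_a
p ConvertDemo.show(*1)                     # => [1]       no to_a, passed as if unsplatted
p Array(1)                                 # => [1]       Kernel#Array wraps the same way
```

The `*1` line is the "as if you removed the star" effect: the failed conversion wraps the object in a one-element array, which the splat then unpacks again.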

00:22:21.559 miso and when you make a call with some arguments say 1 2 3 you would expect

00:22:27.600 that the callee gets 1 2 3 but this is not completely true in all cases

00:22:32.799 sometimes there is extra data you're passing the most basic example is when

00:22:38.080 you call a method without passing a block there's a little piece of data in there that says "oh i'm passing no

00:22:44.960 block." but this isn't exactly an object

00:22:50.480 um that the callee has access to

00:22:55.559 um when the callee has optional keyword parameters ruby needs to compute the

00:23:02.480 default value for all the keywords the caller isn't passing so for this example

00:23:08.720 we have six sets of calls to the default abc method to compute the default values

00:23:17.360 in general for n optional keywords there are two to the power of n ways to

00:23:22.559 compute the default values for each optional parameter you can

00:23:27.679 either pass it or not pass it describable with a bit so with enough

00:23:32.720 bits you can describe the status of all the parameters the magic number is 31 at the

00:23:39.280 moment but the point is a lot of the times the vm passes a hidden integer for

00:23:44.880 optionals if the callee has too many optional keyword arguments uh keyword parameters

00:23:52.799 we use a hash as a fallback but in any case there's no way to directly get the

00:23:58.080 integer or the hash so it's hidden it's set up as part of the call before

00:24:04.480 running anything in the callee and the callee then refers to that to

00:24:10.640 figure out which default computation it needs to run another category of call data that

00:24:18.640 passes hidden data is for forwardable parameters this started to pass hidden

00:24:25.919 data in 3.4 to reduce gc allocations thanks to aaron tenderlove patterson he

00:24:33.039 coined the term forwardable and forwarding basically you can define the

00:24:38.480 parameter of a method that you are able to

00:24:44.360 forward but you don't have to use it but if you do use it then you're

00:24:50.600 forwarding um forwarding should preserve the argument so it should behave as if

00:24:58.640 you didn't have the intermediary in the middle and you just called the callee directly like in the bottom
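
[Editor's sketch, not from the slides] A minimal illustration of a forwardable parameter, assuming Ruby 2.7 or later for the `...` syntax. `ForwardDemo`, `callee`, and `intermediary` are hypothetical names that mirror the wording of the talk. It checks that calling through the intermediary is indistinguishable from calling the callee directly.

```ruby
# Hypothetical names mirroring the talk's wording.
module ForwardDemo
  # Callee that reports the positional, keyword, and block arguments it saw.
  def self.callee(*args, **kwargs, &block)
    [args, kwargs, block ? block.call : nil]
  end

  # `...` declares a forwardable parameter and forwards everything unchanged.
  def self.intermediary(...)
    callee(...)
  end
end

direct    = ForwardDemo.callee(1, 2, k: 3) { :from_block }
forwarded = ForwardDemo.intermediary(1, 2, k: 3) { :from_block }

p direct              # => [[1, 2], {k: 3}, :from_block]
p direct == forwarded # => true
```

The intermediary never names the arguments, so from Ruby code there is nothing to inspect; how the arguments are carried across is left entirely to the implementation.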

00:25:09.640 um different call pairs require different work to handle so intermediary

00:25:16.159 has each caller pass a description of the call site along with the actual

00:25:23.480 arguments internally this call site description is called call info and

00:25:29.080 includes things like the number of arguments the kind of features

00:25:36.960 information after the call but in this case the call info itself is an argument

00:25:42.400 so kind of meta um there's no way to get the call

00:25:47.760 information inside intermediary as a ruby object though so it's just hidden

00:25:53.279 data it's only used when you actually do the forwarding and by the way the top diagram has

00:26:02.039 intermediary looking like a bottleneck that the three calls have to fit through

00:26:07.520 and that's intentional um because forwardable forwarding calls

00:26:14.360 um are harder to cache than usual usually you can cache based on the call

00:26:20.400 pair but in this case you have an extra call info that you also need to that's

00:26:26.159 also relevant uh to decide whether you can use a fast path so it kind of blows up your cache

00:26:34.440 key and finally i have a category for

00:26:39.919 allocation and jeremy is talking about this in the main hall so maybe for this class we should just get up and go to

00:26:45.679 the other hall but anyways i'll try not to overlap too much with jeremy here i'll start fairly abstract and work my

00:26:52.480 way back to what CRuby currently does first of all when something allocates

00:26:59.799 um whether something allocates has always been up to the ruby runtime you're using ruby code works with ruby

00:27:07.600 objects but the code doesn't have explicit control over the in-memory representation and the lifetime of the

00:27:13.840 memory there's no way to immediately free a ruby object for

00:27:20.360 example um now in in this example i have a call

00:27:26.799 pair that involves a rest parameter but the body of foo is a

00:27:33.960 mystery i want to know whether it's necessary to allocate an object for this

00:27:40.000 call pair um but it it is necessary

00:27:47.799 because the body of foo can do anything including stashing away the object into

00:27:54.559 a global variable that makes it live forever so you have to allocate um now

00:28:01.760 if we know the body of foo then you can say that

00:28:07.640 well it only needs to service the first call and then we're done with the data

00:28:14.480 so the caller doesn't have to allocate an object it just has to make something that works for the first call and then it's

00:28:21.480 done but then in the final case it's passing args to a mystery

00:28:28.039 method now then again because we don't know what the mystery method can do it

00:28:33.200 can do anything and anything includes making the object live forever so again we need to

00:28:39.720 allocate um so yeah the there's some analysis you

00:28:45.200 can run to pick alternative memor man memory management strategy other than

00:28:51.120 allocating an object up front it requires a pretty complex analysis to figure out and because you can load ruby

00:28:58.480 code at runtime this analysis also needs to happen at runtime and also as we saw

00:29:03.840 the scope of the analysis matters depending on it you get different results
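
[Editor's sketch, not from the slides] The two extremes of the allocation discussion, written out with hypothetical names (`AllocDemo`, `keeps`, `local_only`, and the global `$stash`). It only demonstrates the language-level behavior: a rest array that escapes must be a real object, while one that is only read locally does not obviously need to outlive the call. Whether any particular Ruby implementation actually skips the allocation in the second case is not something this example shows.

```ruby
$stash = nil # global used only to illustrate an escaping object

module AllocDemo
  # The rest array escapes into a global, so it has to live on as a real object.
  def self.keeps(*args)
    $stash = args
  end

  # The rest array is only read here. An optimizer that can see this body
  # could in principle avoid a heap allocation; no claim is made that it does.
  def self.local_only(*args)
    args.first
  end
end

AllocDemo.keeps(1, 2, 3)
first = $stash
AllocDemo.keeps(1, 2, 3)

p first                         # => [1, 2, 3]
p first.equal?($stash)          # => false, each call produced its own array
p AllocDemo.local_only(1, 2, 3) # => 1
```

The caller side is identical in both cases, which is why the decision depends on how much of the callee's body the analysis can see.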

00:29:10.240 um it takes quite a bit of sophistication to get a performance win out of this we try to do something for

00:29:17.880 ZJIT but anyways well on paper there are many situations where we don't have to

00:29:23.200 allocate with the gc we can't do it yet uh there's obvious situations where you

00:29:31.440 have to allocate uh where the parameter basically calls for an object like rest

00:29:36.960 keyword rest there's also some c method definition options you can use that

00:29:43.919 basically act the same but there's some surprising cases where it allocates even though in the parameter it doesn't

00:29:50.399 really call for an object like the hidden hash or integer we talked about

00:29:55.760 with uh optional keyword arguments well we reached the end we saw

00:30:02.399 a way to specialize parameter passing logic to a call pair and a

00:30:08.720 forward-looking part about avoiding allocation there's also of course all the categories i'm not any kind of

00:30:15.679 authority when it comes to these categorizations um so it's pretty arbitrary i just made them up also

00:30:24.919 um so i encourage you to come up with your own categorizations the exercise is useful i

00:30:32.880 found two crashes while making them uh i hope you have fun at the zoo and thank

00:30:40.399 you
