Summarized using AI

Eliminating Unnecessary Implicit Allocations

Jeremy Evans • April 18, 2025 • Matsuyama, Ehime, Japan • Talk

In the presentation titled "Eliminating Unnecessary Implicit Allocations" by Jeremy Evans at RubyKaigi 2025, the speaker discusses new optimizations introduced in Ruby 3.4 aimed at reducing implicit object allocations during method calls. Following up on his previous presentation about reducing those allocations in Ruby 3, Evans outlines several allocation regressions that were identified during the development of these optimizations, as well as the measures taken to fix these issues and prevent future regressions through testing.

Key Points Discussed:

  • Allocation Regressions: Evans details three allocation regressions he found and fixed within about two weeks while working on these optimizations. Each is a case where Ruby allocated an object that earlier versions did not, and all three are fixed in Ruby 3.4.

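    • The first regression involved a call that passes a positional splat, a keyword splat, and a block. An evaluation-order fix broke an earlier optimization, so the call again allocated an unnecessary array.
    • The second involved calling a method that accepts a keyword argument with a keyword splat. A bug fix caused this call to allocate a hash it had not needed before.
    • The third involved a call with a positional splat and static literal keywords, which came to allocate an array in addition to a hash. Ruby 3.4 removes both allocations.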
  • Creation of Allocation Test Suite: To catch further allocation regressions early, an allocation test suite was added, with tests that check the expected number of array and hash allocations for each type of method call. Since its introduction it has prevented further allocation regressions, at least for the cases it tests.

  • Optimizations for Literal Arrays: The discussion extends to how literal arrays were optimized, illustrating that all literal arrays now only allocate a single array in Ruby 3.4. This included improvements in handling large literal arrays, which previously suffered from excessive allocations due to Ruby's internal management of array capacity.

  • Caller-Side Optimizations: Further optimizations were made by eliminating unnecessary array allocations for positional splats within method calls when certain conditions are met, thus enhancing performance without compromising Ruby's semantics.

  • Remaining Implicit Allocations: Evans concludes with an acknowledgment of the remaining implicit allocations that still exist in Ruby related to method calls, largely due to structural limitations within Ruby’s method calling architecture. He provides insight into potential future optimizations that could address these allocations but notes that significant changes would be required.

Overall, the presentation highlights the ongoing efforts to refine Ruby's memory management and improve performance by reducing unneeded allocations, illustrating that through rigorous testing and optimization, performance can be significantly enhanced.
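
For orientation, the three call shapes behind the regressions listed above might look like the following Ruby. This is only a sketch: the method names `forward` and `greet` and their signatures are assumptions made for illustration, since the talk's slide code is not reproduced on this page.

```ruby
# Hypothetical methods; the names and signatures are assumed, not from the talk.
def forward(*args, **kw, &blk)
  [args, kw, blk]
end

def greet(name: nil)
  name
end

args = [1, 2]
kw   = { name: "ruby" }
blk  = proc { :called }
none = []

# 1. Positional splat, keyword splat, and block in one call.
forward(*args, **kw, &blk)

# 2. A keyword splat passed to a method that accepts a keyword argument.
greet(**kw)

# 3. A positional splat followed by static literal keywords.
greet(*none, name: "ruby")
```

How many arrays or hashes each call allocates depends on the Ruby version, which is the point of the regressions described above.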

Date: April 18, 2025
Published: May 27, 2025

This is a followup to my RubyKaigi 2024 presentation, "Reducing Implicit Allocations During Method Calling", discussing an entirely new set of allocation reduction optimizations included in Ruby 3.4. This presentation will describe allocation regressions that occurred while developing these optimizations, and the allocation test suite added to prevent future regressions. It will also discuss other bugs that were found as a result of this optimization work, and how they were fixed. Finally, it will discuss the implicit allocations that remain, and why they would be challenging to address.

https://rubykaigi.org/2025/presentations/jeremyevans0.html

RubyKaigi 2025

00:00:01.520 okay
00:00:07.440 hello
00:00:14.040 everyone so in this presentation I'm going to be discussing code changes that we made in Ruby 3.4 to eliminate
00:00:21.119 unnecessary implicit allocations my name is Jeremy Evans I'm a Ruby committer who focuses on fixing
00:00:28.320 bugs in Ruby I work at Ubicloud as a principal software engineer we're building an open-source alternative to
00:00:34.960 Amazon Web Services I'm going to start this presentation with a discussion on
00:00:40.040 regressions specifically allocation regressions so an allocation regression
00:00:45.600 is a case where Ruby unnecessarily allocates an object in a case where it
00:00:50.800 did not do so previously and while working on reducing implicit allocations
00:00:56.239 in the span of about two weeks I found and fixed three separate allocation regressions all of which I caused now
00:01:04.640 the first allocation regression involved this code which passes a positional splat keyword splat and block when I
00:01:12.159 started working on reducing implicit allocations this allocated one array and
00:01:17.280 I made changes to the optimizer that made this call allocationless and unfortunately I found there was an
00:01:23.200 evaluation order issue in this code if keyword was not a hash and block was not
00:01:28.240 a proc as the code would call block.to_proc before keyword.to_hash now I fixed
00:01:34.720 this before the release of Ruby 3.3 so the code would call keyword.to_hash before
00:01:39.759 block.to_proc and unfortunately that broke the optimization so in Ruby 3.3 this
00:01:45.520 call still allocated an unnecessary array now I fixed this allocation regression in Ruby 3.4 so this call is
00:01:52.720 once again allocationless the second allocation regression affected this code and so
00:01:58.719 here we have a method that accepts a keyword argument we're calling the method with a keyword splat this should
00:02:05.360 not require an allocation because we can just look into the keyword splat hash to
00:02:10.399 get the value indeed in Ruby 3.2 this call was allocationless now in Ruby 3.3 I
00:02:17.360 fixed a bug that passed the keyword splat directly as a positional argument
00:02:22.879 if the method calling method or call did not accept keywords now unfortunately
00:02:28.400 that bug fix caused an allocation regression resulting in this code allocating a hash so I fixed this
00:02:35.040 regression in Ruby 3.4 so this type of call is once again
00:02:40.120 allocationless the third allocation regression involved this type of call so we have the same method as in the
00:02:46.319 previous example but this time we're calling it with a positional splat and static literal keywords now this
00:02:53.599 allocates a single hash in Ruby 3.3 and when adding an optimization early in the Ruby 3.4 development cycle I
00:03:01.040 introduced an allocation regression and made this call allocate an array as well as a hash I was able to fix the
00:03:08.080 regression so the code no longer allocated an unnecessary array and I was also able to eliminate the unnecessary
00:03:13.840 hash allocation so this call is now allocationless in Ruby 3.4 as I mentioned
00:03:19.920 I found and fixed these three allocation regressions within about 2 weeks and
00:03:25.760 finding multiple allocation regressions in so short a time frame was disheartening it indicated to me that
00:03:32.159 there was little point in attempting to eliminate unnecessary implicit allocations unless I could prevent
00:03:38.400 further allocation regressions so I thought about how to do that and the conclusion I came to is that allocation
00:03:45.120 regressions are no different in nature than any other regression and how do you fix regressions generally well you add
00:03:51.920 tests tests help you ensure expected behavior and if the expected behavior is
00:03:57.280 a certain call type allocates a certain number of objects you need to write a test for that and that way if you change
00:04:04.080 the code to fix a bug and it results in an allocation regression the related
00:04:09.120 test breaks alerting you to the problem immediately and allowing you to fix it
00:04:14.239 so I started writing an allocation test suite and as I have limited time I'm not going to discuss the implementation
00:04:20.320 of the allocation test suite I'm going to go over a simplified example of an allocation test so in this allocation
00:04:26.639 test we define a method and then we call the check_allocations method and the
00:04:31.759 third argument is the code that will be executed the first argument is the number of arrays the code should
00:04:38.000 allocate and the second argument is the number of hashes the code should allocate so this line tests that calling
00:04:44.960 a method with a single positional parameter with a single positional argument should not allocate an array or
00:04:52.720 hash this line tests that calling the same method with a keyword splat does
00:04:58.080 not allocate an array but it does allocate one hash and that's expected
00:05:03.120 because the positional parameter should be passed a copy of the keyword splat hash
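
A minimal sketch of such an allocation test follows. It is not the implementation from Ruby's test suite: `count_allocations` is a hypothetical helper, and this `check_allocations` takes the code as a block instead of as a third argument. It assumes CRuby, where `ObjectSpace.count_objects` reports live object counts by type.

```ruby
# Sketch only: count_allocations and this check_allocations are hypothetical
# stand-ins, not the helpers from Ruby's own test suite.
COUNTS = {} # reused so ObjectSpace.count_objects does not allocate a result hash

def count_allocations
  yield                               # warm-up run, not measured
  GC.start                            # begin from a freshly collected heap
  GC.disable                          # nothing is freed during the measurement
  ObjectSpace.count_objects(COUNTS)
  arrays = COUNTS[:T_ARRAY]
  hashes = COUNTS[:T_HASH]
  yield                               # the call under test
  ObjectSpace.count_objects(COUNTS)
  [COUNTS[:T_ARRAY] - arrays, COUNTS[:T_HASH] - hashes]
ensure
  GC.enable
end

def check_allocations(num_arrays, num_hashes)
  actual = count_allocations { yield }
  return true if actual == [num_arrays, num_hashes]
  raise "expected #{[num_arrays, num_hashes]} allocations, got #{actual}"
end

def m(x); end
h = { a: 1 }

check_allocations(0, 0) { m(1) }      # positional argument: no array, no hash
p count_allocations { m(**h) }        # the talk expects 0 arrays and 1 hash here
```

Disabling the garbage collector during the measurement keeps the before and after counts comparable, because no object can be freed between the two snapshots.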
00:05:08.639 now I think the allocation test suite I created was my most important contribution to Ruby 3.4 it resulted in
00:05:14.639 three major benefits the first benefit is that since its introduction it has
00:05:19.919 prevented further allocation regressions at least for the cases that it tests the
00:05:25.600 second benefit was it allowed the Prism team to find all of the tested cases
00:05:30.639 where the Prism compiler was allocating more objects than the parse.y compiler and fix all of those cases before Prism
00:05:37.600 was made the default parser and that prevented allocation regressions from the parser switch
00:05:43.919 finally the test suite shows for each type of call how many implicit allocations the call was making I could
00:05:51.039 review the allocation numbers for a passing allocation test suite and find cases where Ruby was unnecessarily
00:05:57.840 allocating objects the allocation test suite basically provided me a list of call types where I could eliminate
00:06:04.160 unnecessary implicit allocations note that not all unnecessary implicit allocations were in
00:06:10.319 method calling I found that large literal arrays could allocate more than
00:06:15.360 one array for some background I'm going to give a brief overview of the instructions used to create literal
00:06:21.840 arrays in Ruby so the empty literal array allocates one array and this is
00:06:27.360 implemented with a single VM instruction newarray 0 if the array contains
00:06:32.400 only frozen literals such as numbers and symbols it also only allocates one array
00:06:38.160 in this case a frozen array is statically allocated for the literal and the duparray instruction is used which
00:06:44.960 returns a copy of that array and this is true regardless of the size of the array
00:06:50.240 as long as all elements are frozen literals even if a literal array has a million elements it only allocates one
00:06:57.360 array because a single duparray instruction is still used which returns a copy of that frozen array in Ruby 3.3
00:07:05.280 if the literal array contains a non-frozen literal or an expression the number of
00:07:10.479 array allocations is based on the number of elements in the array so here we have an array created with a single local
00:07:16.960 variable and this literal array will be created using two instructions the first instruction will retrieve the value of
00:07:23.199 the local variable and place it onto the VM stack the second instruction will allocate a new array with a single
00:07:29.680 element using the top element of the VM stack now here's a more complex case a
00:07:35.280 literal array with 257 elements all of which are local variable accesses and
00:07:41.440 this is a case that actually allocates more than one array in Ruby 3.3 it allocates three arrays so here are the
00:07:48.319 VM instructions used the first 256 instructions just retrieve the local
00:07:53.919 variable value and place it onto the VM stack and then this creates one array with 256 elements so this is
00:08:01.599 the first array allocation this instruction is the same as the earlier instructions and just pushes the local
00:08:07.280 variable value onto the VM stack and this newarray instruction allocates another array to wrap that element which
00:08:14.319 is the second array allocation and the final instruction is a concatarray instruction which takes the two
00:08:19.919 temporary arrays combines them and returns a newly allocated array that's
00:08:25.120 the third array allocation I'm guessing a lot of you may be wondering why Ruby allocates a new array for every 256
00:08:33.399 elements it certainly wasn't obvious to me when I first started looking at this and it turns out the reason Ruby does this
00:08:39.519 is to avoid VM stack overflow I mean the VM stack has a limited size and if you
00:08:44.800 try to define a large enough literal array you could eventually overflow the VM stack to avoid that Ruby chunks large
00:08:51.920 literal arrays into subarrays of 256 elements and then concatenates them together which seems reasonable and that
00:09:00.000 being said while this array allocation seems necessary since a literal array must allocate at least one array these
00:09:07.279 two array allocations seem unnecessary instead of allocating new arrays we should be appending to the array that we
00:09:13.519 allocated in the newarray 256 instruction and we can do that by using the pushtoarray instruction discussed
00:09:19.600 in last year's RubyKaigi presentation and that change means that this literal array only allocates a single array in
00:09:27.440 Ruby 3.4 turns out it's not just large literal arrays that allocate more than one array
00:09:33.519 in Ruby 3.3 literal arrays that contained positional splats also allocate more
00:09:39.760 than one array in Ruby 3.3 for example this literal array which contains a single splat allocates four
00:09:47.519 arrays in Ruby 3.3 so here are the instructions used to create the literal array in Ruby 3.3
00:09:54.000 all four of these instructions allocate an array even though only one array allocation is needed switching those
00:10:00.880 final newarray and concatarray instructions to pushtoarray eliminates two array allocations and the final
00:10:07.279 unnecessary allocation is due to this concatarray instruction which is used to implement the splat part of the
00:10:12.880 literal array and that's fixed by using the other new VM instruction that I discussed in last year's presentation
00:10:19.040 which is called concattoarray and this instruction is similar to concatarray but it does not allocate a new array and
00:10:25.920 with those two changes Ruby 3.4 now only allocates a single array when creating a
00:10:31.279 literal array with splats I think it was good that I was able to reuse VM
00:10:36.320 instructions that I added to speed up method calling to also speed up literal arrays however I ran into a case that
00:10:42.880 was not possible to optimize with existing instructions and this case involved using keyword splats in literal
00:10:49.720 arrays so here's an example we create a literal array using the splat of an
00:10:54.959 array and a keyword splat of a hash Ruby 3.3 allocates three arrays for this case
00:11:01.519 and uses the following simplified VM instructions to create the literal array so this set of instructions is for the
00:11:07.839 keyword splat and these three instructions each allocate an array this
00:11:13.519 array allocation is needed since a literal array does need to allocate an array but
00:11:18.640 the array allocations for these instructions should be eliminated it's possible to save one array allocation by
00:11:26.160 changing this concatarray instruction to concattoarray which does not allocate a new array but then you would
00:11:32.480 not be able to eliminate the array allocated in the newarraykwsplat instruction since concatarray can only
00:11:38.560 combine arrays unfortunately it's not possible to replace both of these with a pushtoarray instruction because that
00:11:45.360 will push an empty array or empty hash onto the array if the keyword splat is
00:11:50.480 empty now the simplest solution to avoid the unnecessary array allocation was to add a new VM instruction with the
00:11:56.880 expected semantics so that's what I did this instruction is named pushtoarraykwsplat and it's similar to pushtoarray
00:12:03.680 but it does not accept an argument so this instruction pops the VM stack and then it appends the popped
00:12:10.399 object to the array at the current top of the VM stack unless the popped object is an empty hash and with that change
00:12:17.839 literal arrays with keyword splats only allocate a single array in Ruby 3.4
00:12:23.320 related to this before I started working on these literal array optimizations we received a bug report
00:12:30.240 that Ruby's behavior in this case was actually incorrect if you were keyword splatting an empty hash and the keyword
00:12:37.040 splat was directly after a positional splat with no elements in between then
00:12:42.480 Ruby 3.3 and below would include an empty hash in the array I submitted a pull
00:12:47.760 request to fix this bug in a backwards compatible way before working on the literal array optimizations the
00:12:54.320 pushtoarraykwsplat instruction optimization also fixed this bug and that was committed first but my original
00:12:59.839 pull request was backported to Ruby 3.2 and 3.3 so now Ruby 3.2 and above will no
00:13:05.279 longer include an empty hash for an empty keyword splat in a literal array
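
One way to look at the instruction sequences for the literal array cases above is to ask CRuby for them directly. The sketch below assumes CRuby (`RubyVM::InstructionSequence` is implementation-specific); `instruction_names` and `array_instructions` are hypothetical helpers, and the instructions reported will differ between Ruby versions, so no particular output is shown.

```ruby
# CRuby-only sketch: instruction_names and array_instructions are hypothetical
# helpers, and the instructions they report differ between Ruby versions.
def instruction_names(source)
  # The last element of InstructionSequence#to_a is the instruction body;
  # instructions are the Array entries (labels and line numbers are not).
  body = RubyVM::InstructionSequence.compile(source).to_a.last
  body.grep(Array).map { |insn| insn.first.to_s }
end

def array_instructions(source)
  instruction_names(source).grep(/array/).tally
end

large   = "x = 1; [#{(['x'] * 257).join(', ')}]"   # 257 local variable reads
splat   = "a = [2]; [1, *a, 3]"                    # literal array with a splat
kwsplat = "a = [1]; h = {k: 2}; [*a, **h]"         # splat plus keyword splat

p array_instructions(large)
p array_instructions(splat)
p array_instructions(kwsplat)

# The resulting values themselves do not depend on the instructions used.
a = [1]
h = { k: 2 }
p [*a, **h]
```

Comparing the tallies on Ruby 3.3 and Ruby 3.4 should show the allocating instructions described in the talk being replaced, though the exact names printed are whatever the running interpreter emits.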
00:13:11.360 with those optimizations I believe all literal arrays now only allocate a single array in Ruby 3.4
00:13:18.519 so I'm now going to discuss how I eliminated caller side positional splat allocations
00:13:25.040 as I discussed last year I was able to eliminate unnecessary array allocations for the following method calls in Ruby 3.3
00:13:32.480 by adding an optimization pass that changed splatarray true to splatarray false unfortunately while the
00:13:39.600 approach of using the optimizer worked for these types of calls it did not work for other types of calls these calls all
00:13:47.680 allocated an unnecessary array for the positional splat all of these cases result in a dynamic hash after the
00:13:54.880 positional splat which is not one of the cases that the optimizer can optimize because the instructions for creating
00:14:00.880 dynamic hashes are not fixed optimizing these cases requires knowledge of the
00:14:06.079 parse tree not just access to the unoptimized instructions and that meant that to optimize these cases I would
00:14:12.880 have to move the optimization from the optimizer to the compiler so I looked into how to optimize these cases in the
00:14:19.519 compiler and at the time the default parser was parse.y and the main function for compiling arguments for method calls
00:14:26.959 when using parse.y is named setup_args_core the fourth argument is whether you
00:14:32.560 should duplicate an array being splatted and this is what controls whether the method call allocates an array for the
00:14:39.519 splat my first thought to optimize these cases was to look for places where we're
00:14:44.880 passing nonzero for the dup_rest argument and see if we can change it to pass zero while still complying with
00:14:51.680 Ruby semantics one case I found is this recursive call inside setup_args_core
00:14:58.480 this case handles when you're compiling an ARGSPUSH node so the ARGSPUSH node is used in Ruby when you have arguments
00:15:06.160 following a positional splat so these are all examples of calls that would use an ARGSPUSH node this type of method
00:15:13.279 call with a positional argument following a positional splat needs to
00:15:18.480 allocate an array because Ruby's internal method calling API does not handle positional arguments after a
00:15:25.199 positional splat so an array needs to be allocated for the positional splat and the argument following the array needs
00:15:31.920 to be appended to that array these four calls do not need to allocate
00:15:37.199 an array because there was only a positional splat and no positional arguments following the splat so these
00:15:44.160 two cases were previously optimized via the optimizer in Ruby 3.3 so the compiler
00:15:50.000 would create instructions that allocated an array but the optimizer would recognize those instruction combinations
00:15:55.920 and change them so they did not allocate an array these are two cases that allocated an
00:16:01.759 unnecessary array in Ruby 3.3 because the optimizer only handled cases where
00:16:06.800 static keywords or a single keyword splat were used now here we're calling
00:16:12.320 setup_args_core with a nonzero value forcing array allocations for splats in
00:16:18.560 this recursive call even though array allocations may not always be necessary now it turns out in some cases we can
00:16:25.680 know before the recursive call whether it is safe to avoid the array allocation
00:16:31.040 by replacing those two lines with this code I was able to optimize more cases so let me go over what this new code
00:16:37.040 does first realize that this code is the same as the previous code just split into two parts this is the new code
00:16:43.279 being added the first condition for the optimization is the head of the ARGSPUSH node is a splat node that's true in
00:16:50.720 all these cases since the first argument to the method is a positional splat the
00:16:56.639 second condition is that the body of the ARGSPUSH node is a hash node and that's
00:17:02.000 also true in all of these cases because literal hashes literal keywords and
00:17:07.039 keyword splats are all parsed as hash nodes and the final condition is that nd_brace
00:17:12.640 is not set so nd_brace is set on hash nodes for literal hashes but it's
00:17:18.720 not set on hash nodes for literal keywords or keyword splats and that affects the bottom four cases all of
00:17:25.280 which use literal keywords and or keyword splats but not the top case which uses a literal hash and does need
00:17:31.520 to allocate if all three of these conditions are true then the recursive call to setup_args_core is set to not
00:17:38.400 allocate an array for the splat and that eliminated the array allocations for these calls however Ruby would still
00:17:46.000 allocate an array for these calls and that's because these calls use a different parse tree I'm going to choose
00:17:52.080 these two calls to show the difference between the parse trees and the only difference between these two calls is
00:17:57.760 the bottom call has an argument before the positional splat so here's the
00:18:02.799 simplified parse tree for both calls this is the difference between the parse trees without a leading argument a splat
00:18:10.240 node is used but with a leading argument an ARGSCAT node is used ARGSCAT node
00:18:16.240 compilation had similar code to the ARGSPUSH node compilation in that it told recursive calls to allocate a new
00:18:23.600 array however the same optimization could not be safely applied to the ARGSCAT node and that's because ARGSCAT is
00:18:31.600 also used for method calls with multiple positional splats to properly optimize
00:18:37.120 you would need to compile these two ARGSCAT nodes differently you'd want this ARGSCAT node to not allocate an array
00:18:44.640 because only a single positional splat is used but you would want this ARGSCAT node to allocate an array since there
00:18:50.720 were multiple positional splats and I just could not see how that was possible
00:18:55.840 with the setup_args_core API so I changed the API instead of taking an
00:19:01.360 integer for whether the current node you are compiling should allocate an array for the splat I changed the argument to
00:19:07.600 be a pointer to unsigned int and this allows for tracking this information in a single place across recursive calls to
00:19:14.960 setup_args_core for the ARGSPUSH node compilation which I had previously changed to this I was able to take all
00:19:21.679 of this code and delete it as I was no longer defining this variable I just had to pass the existing pointer directly
00:19:28.640 now here's a case where the dup_rest argument was used to make a decision about whether to allocate an array for
00:19:34.960 the splat this adds the splatarray instruction and the argument to the splatarray instruction is whether to
00:19:40.880 allocate an array for the splat now that dup_rest is a pointer and not an integer I needed to change this to dereference the
00:19:47.600 pointer and this code previously said that the positional splat was mutable if you were allocating a new array that
00:19:54.480 code was eliminated from this branch and other similar branches and it was moved to a centralized location to ensure that
00:20:01.600 we never allocate more than one array per method call if we did allocate an
00:20:06.720 array for the splat array argument we set the dup_rest flag to zero so that no more arrays will be
00:20:13.799 allocated now to avoid unnecessary allocations we decide whether to allocate an array for the positional
00:20:20.799 splat before the call to setup_args_core we default to allocating an array
00:20:26.400 and if we are sure an array allocation is not needed then we set the call to not allocate an array this is pseudo
00:20:33.039 code because it takes a significant amount of work to determine whether it is safe to avoid allocation now there
00:20:39.520 are four parse tree cases where we can avoid allocating an array the first case is when the only argument is a
00:20:46.240 positional splat so the primary argument node in this case is a splat node and if
00:20:51.520 the primary argument node is a splat node it's always safe to avoid array allocation the second case is when there
00:20:58.480 are one or more positional arguments followed by a positional splat in this case the primary argument node is an
00:21:05.280 ARGSCAT node and the head of the ARGSCAT node is a list node so the code checks if the primary argument node is
00:21:11.760 an ARGSCAT node and if the head of the ARGSCAT node is a list node then it's
00:21:16.799 safe to avoid allocation the third case is when there is a positional splat
00:21:22.000 followed by literal keywords or keyword splat the parse tree for this case has ARGSPUSH as the primary argument node
00:21:29.200 with the head being a splat node and the body being a hash node that is a keyword argument and the third and fourth cases
00:21:35.600 are related so I'm going to discuss the fourth case at the same time this is when there are one or more positional
00:21:40.960 arguments followed by a positional splat and then followed by literal keywords or keyword splat so in the fourth case
00:21:47.840 instead of a splat node we have an ARGSCAT node where the head is a list node so the code checks if the primary
00:21:54.480 argument node is ARGSPUSH and if so it checks whether the head of that node is a splat node or the head is an ARGSCAT
00:22:02.640 node where the head of that is a list node and finally the code checks whether the body of the ARGSPUSH node is a hash
00:22:09.120 node where nd_brace is not set so with those four cases handled I was able to
00:22:14.880 avoid array allocation for all cases where a single positional splat and no
00:22:20.400 post-splat positional arguments are used unfortunately that broke tests
00:22:26.480 because it's not actually safe to eliminate the allocation for all such calls
00:22:32.480 for example if keyword is a local variable then it's safe to avoid allocation but if keyword is a method
00:22:39.200 call it's not actually safe because the method call could modify the array being splatted and that is an evaluation order
00:22:46.919 issue in order to be safe you need to inspect all subnodes of the hash node
00:22:52.159 and determine whether any of those contains an expression that could cause an evaluation order issue so I added a
00:22:59.679 function that could be passed a node and would return true if the node could potentially modify an existing argument
00:23:06.480 splat so this will allow local and instance variables most literals as well as most constant references this line
00:23:14.000 handles nested constants which may not be safe for example this type of nested
00:23:19.120 constant reference where all parts are constant references is safe enough to not force allocating an array for the
00:23:25.760 splat however this type of constant reference which looks for a constant under a module returned by a method call
00:23:33.280 is not safe because the method call could modify the array being splatted with that function defined the remaining
00:23:40.080 step is to check all of the key and value nodes for literal keywords as well as the nodes for keyword splats so I
00:23:46.880 added this code we start with the hash node for literal keywords and keyword splats all keys and values of literal
00:23:54.080 keywords as well as keyword splat expressions will be inside this hash node each entry in this hash node is
00:24:01.760 either a key value pair or it's a keyword splat we're going to iterate over all of these now the head of this
00:24:08.400 node is a key node for a literal keyword or it's null for a keyword splat and if
00:24:14.080 the key node is set we call a function to determine whether the node is safe if the node is not safe we mark that we
00:24:20.559 need to allocate an array for the splat then we stop processing if it is safe we move to the next node and we get the
00:24:27.120 head node for that which is either the value node for a literal keyword or it's the node for the keyword splat
00:24:33.159 expression and we do the same check that we did for the key node and we handle unsafe nodes the same way and we repeat
00:24:39.360 this process for every keyword or keyword splat in the method call turns out that in addition to checking
00:24:45.679 the keyword arguments if a block is passed via an ampersand we also need to check the block pass expression so
00:24:52.720 here's the code for that first if we already know that we're going to duplicate the array anyway then we don't
00:24:58.720 need to check the block pass expression we only want to do this check if a block pass expression is provided and we do
00:25:05.919 the same check on the block pass expression that we did on the keywords if the block pass expression is not safe
00:25:12.159 then we must duplicate the array being splatted and with that change all cases
00:25:17.360 that unnecessarily allocated a caller side array for a positional splat in
00:25:22.440 Ruby 3.3 no longer do so in Ruby 3.4 additionally I was able to remove the
00:25:28.000 entire optimizer change that I added in Ruby 3.3 because this compiler change
00:25:33.039 handled all of the cases that the optimizer handled as well as all these additional cases
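
The evaluation order hazard that makes the array copy necessary can be reproduced in plain Ruby. The following is a hedged illustration rather than code from the talk: the method `m`, the variable `list`, and the lambda `mutating_kw` are all assumed names.

```ruby
# Hypothetical example; the method and variable names are assumptions.
def m(*args, **kw)
  args
end

list = [1, 2]
kw   = {}

# The keyword splat expression appends to the array being splatted.
mutating_kw = -> { list << 3; {} }

# Keyword splat of a local variable: evaluating it cannot modify list,
# so the caller has no need to copy list before the call.
first = m(*list, **kw)

# Keyword splat of a method call: *list is evaluated first, so the callee
# must see [1, 2] even though list is mutated before the call happens.
second = m(*list, **mutating_kw.call)

p first
p second
p list
```

Because arguments are evaluated left to right, `second` has to reflect `list` as it was when the splat was evaluated, which is why a keyword splat that is a method call forces the caller to duplicate the array.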
00:25:39.200 a while after making that optimization I found a related evaluation order issue in this type of method call here we are
00:25:46.559 keyword splatting a variable and since this is a single keyword splat without literal keywords it would generally not be
00:25:53.880 duplicated however in the same method call we are mutating the variable in the block pass expression if we don't
00:26:00.159 duplicate the keyword splat expression then this is an evaluation order issue because the deleted key will not appear
00:26:06.799 as a keyword argument so to fix this evaluation order issue we go back to the setup_args code we were previously only
00:26:14.400 using zero or one values for dup_rest but now we want dup_rest to track two types of information the first type of
00:26:21.440 information is whether the positional splat array should be duplicated and the second type of information is whether
00:26:27.679 the keyword splat hash for a single keyword splat should be duplicated now in order to do that I
00:26:34.240 created macros for these two conditions splat array true is used for the condition where you need to duplicate
00:26:39.600 the splat array and dupe single keyword splat is used for the condition where you need to duplicate the keyword splat
00:26:45.400 hash now we want to set dupe single keyword splat even if we're already
00:26:50.480 going to be duplicating the splat array so we need to remove this condition now that we have set that flag
00:26:56.640 we need to change the keyword compilation code to use it and this is what the code looked like before the change if the keyword splat node was not
00:27:04.400 a single keyword splat we mark the method as containing a mutable keyword and then we call compile hash with the
00:27:10.400 keyword node which again may be literal keywords and or one or more keyword splats now here's the code I added to
00:27:16.320 fix the evaluation order issue these sections are basically the same as the previous code just split into two parts
00:27:22.400 this is the new code if a keyword node is a single keyword splat and the dupe
00:27:28.320 single keyword splat flag is set then we call a new function designed only to
00:27:33.400 compile single keyword splats where we need to duplicate the splat hash in order to avoid the evaluation order
00:27:39.919 issue so here's the definition of that function first since we're going to be
00:27:44.960 allocating a new hash for the keyword splat we mark the method as passing a mutable keyword splat this is the same
00:27:51.600 compile hash call that we had previously so this will compile the keyword splat hash without duplicating it and these
00:27:58.640 instructions create a new empty hash before evaluating the keyword splat expression and merge the keyword splat
00:28:05.039 hash into the new hash so in other words these are the instructions that are duplicating the keyword splat hash and
00:28:10.799 with that change this code no longer has an evaluation order issue as it now duplicates the keyword splat hash before
00:28:16.960 calling delete on it I'd like to finish up this presentation by discussing the remaining implicit allocations in method
00:28:23.120 calls and whether they could be fixed let me start with caller side implicit allocations first having arguments after
00:28:30.799 a positional splat or having multiple positional splats ruby's internal method calling API does not support this it
00:28:37.360 only supports a single positional splat so in both of these cases the caller needs to allocate an array and combine
00:28:43.279 these arguments into the array and that's not fixable without major internal changes second having literal
00:28:49.279 keywords and a keyword splat or having multiple keyword splats again Ruby's internal method calling API does not
00:28:54.720 support this it supports only literal keywords or only a single keyword splat expression so in both of
00:29:01.440 these cases you must allocate a hash and combine these arguments into a hash that's also not fixable without major
00:29:07.520 internal changes finally a positional splat followed by non-static literal
00:29:13.200 keywords this currently allocates a hash even though it does not really need to because Ruby's internal method calling
00:29:20.000 API does not support both a positional splat and literal keywords only a positional splat and a keyword splat so
00:29:27.440 this internally converts the literal keywords to a keyword splat and I think this case is much easier to fix than the
00:29:33.600 other two cases now let me discuss callee side implicit allocations first
00:29:39.200 having a named positional splat parameter requires array allocation I think this allocation is unavoidable in
00:29:44.960 the general case until we have a generic deoptimization framework because any method call could result in an eval or
00:29:50.880 access to the binding second having a named keyword splat parameter requires hash allocation I think this allocation
00:29:57.840 is unavoidable for the same reason and finally calling a method that only accepts positional arguments with
00:30:03.600 literal keywords or keyword splat the callee automatically converts the keywords to a hash and I think this
00:30:08.880 allocation is unavoidable for the same reason so I hope you had fun learning about the allocation reduction
00:30:14.080 optimizations in Ruby 3.4 if you enjoyed this presentation and want to read more of my thoughts on Ruby programming
00:30:19.200 please consider picking up a copy of Polished Ruby Programming and that concludes my presentation I'd like to
00:30:24.559 thank all of you for listening to me if you have any questions please ask me during the break thank you
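For reference, the call shapes named in the closing part of the talk can be written out roughly as follows. This is a sketch under assumptions: the callee names `rest_and_kw` and `positional_only` and all variable names are hypothetical, and the code only shows what each call passes, not how many objects it allocates. The comments about allocation restate what the talk says rather than anything measured here.

```ruby
# Hypothetical callees used only for illustration; they are not from the talk.
def rest_and_kw(*args, **kw) # named splat parameters: callee side array and hash
  [args, kw]
end

def positional_only(x) # accepts no keywords
  x
end

def remaining_call_shapes
  a = [1, 2]
  b = [3, 4]
  h = { x: 1 }
  g = { y: 2 }
  v = 5

  {
    # Caller side array: arguments after a splat, or multiple splats.
    args_after_splat: rest_and_kw(*a, 3),
    multiple_splats: rest_and_kw(*a, *b),
    # Caller side hash: literal keywords plus a splat, or multiple keyword splats.
    literal_kw_and_splat: rest_and_kw(k: v, **h),
    multiple_kw_splats: rest_and_kw(**h, **g),
    # Caller side hash the talk describes as easier to fix:
    # a positional splat followed by non-static literal keywords.
    splat_then_literal_kw: rest_and_kw(*a, k: v),
    # Callee side hash: keywords passed to a method that only takes positionals.
    kw_to_positional: positional_only(k: v)
  }
end

remaining_call_shapes.each { |name, value| puts "#{name}: #{value.inspect}" }
```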