Deoptimization: How YJIT Speeds Up Ruby by Slowing Down


Takashi Kokubun • April 16, 2025 • Matsuyama, Ehime, Japan • Talk

In this presentation, Takashi Kokubun discusses how YJIT (Yet Another Just-In-Time compiler) enhances Ruby's performance by employing counterintuitive strategies that sometimes involve slowing down the code execution to optimize overall performance. Kokubun explains the evolution of Ruby's performance optimization, emphasizing the significance of deoptimization as a critical technique.

Key Points Discussed:

  • Background and Role: Kokubun works on the YJIT team at Shopify, part of the Ruby infrastructure team, which maintains Ruby-related open source full time. The team is now building a new compiler, ZJIT, which needs the same deoptimization technique.

  • YJIT Overview: YJIT converts Ruby's virtual machine instructions into machine code that can be executed directly, leading to substantial performance improvements. For example, benchmarks indicate that YJIT can make Rails applications twice as fast compared to the traditional interpreter.

  • Performance Metrics: In real-world scenarios, such as high-traffic storefronts, YJIT achieves an average speedup of 18% with peaks reaching 33% improvement.

  • Deoptimization Explained: Deoptimization is the ability to cancel an optimization at any time. Because YJIT can throw compiled code away and fall back to the interpreter, it can optimize speculatively, ignoring Ruby's dynamic features until one of them actually breaks an assumption.

  • Practical Examples: Kokubun shares practical coding examples showcasing how YJIT handles method redefinitions, constant updates, and trace points that can disrupt performance. For instance, he discusses how using line trace points can invalidate optimizations, leading to performance degradation.

  • New Deoptimizations in Ruby 3.4: The talk covers deoptimizations added in Ruby 3.4: invalidation of escaped locals (which lets YJIT keep local variables in registers), invalidation on singleton class creation, and lazy frame push. Each pairs a speculative optimization with a way to discard it when runtime behavior breaks its assumption.

  • c_trace Attribute: Kokubun describes the c_trace attribute, new in Ruby 3.4. It keeps backtraces the same when Array#each is switched from its C implementation to a Ruby implementation under YJIT, so tests that match backtraces do not break just because YJIT is enabled.

Conclusion:

  • The overarching takeaway is that through speculative optimizations and the ability to deoptimize, YJIT improves Ruby's performance without compromising its dynamic features. The work in Ruby 3.4 on local variables and method calls is a further step towards a faster Ruby.

Overall, this presentation underscores the innovative approach Ruby is taking to improve speed and efficiency, with YJIT as a pivotal contributor to these enhancements.


Date: April 16, 2025
Published: May 27, 2025

Have you wondered why Ruby keeps getting faster at every release despite challenges like handling metaprogramming and dynamic typing? In this talk, you'll discover how YJIT "hides" Ruby's sources of slowness by sometimes "slowing down" Ruby, and why this counterintuitive strategy is key to its performance gains.

https://rubykaigi.org/2025/presentations/k0kubun.html

RubyKaigi 2025

00:00:13.280 hi everyone um today I'm going to talk
00:00:16.640 about deoptimization how YJIT speeds up
00:00:20.000 Ruby by slowing
00:00:22.199 down my name is Kokubun uh I work on the
00:00:26.240 YJIT team at Shopify the YJIT team is part of the
00:00:29.519 Ruby infrastructure team we maintain a
00:00:33.440 lot of Ruby related open source full-time
00:00:36.320 and improve the ecosystem um yeah and
00:00:41.320 um news is we're no longer working on
00:00:44.719 YJIT we are like these days working on
00:00:48.320 another like new compiler called ZJIT
00:00:51.039 Maxime is going to give a talk about it
00:00:53.520 tomorrow so if you're interested um talk
00:00:57.360 like yeah go to the talk and but today's
00:01:01.520 topic is still relevant to ZJIT as
00:01:03.840 well um the technique we are going to
00:01:07.119 talk about today is going to be
00:01:09.600 necessary for that compiler too so don't
00:01:12.560 worry about it and another thing is we're
00:01:16.000 also hiring uh it's kind of a rare
00:01:18.400 position if you're interested in working
00:01:20.159 on the open source project full-time the
00:01:23.040 position is open this QR code is
00:01:25.280 specific to the RubyKaigi attendees so
00:01:28.240 scan the QR code and apply for it now
00:01:30.720 because the positions are limited and
00:01:33.360 if it's gone it's gone um if you also
00:01:36.640 want to work on ZJIT now is the best
00:01:38.799 time because it's not finished yet so
00:01:41.520 like you can build the infrastructure
00:01:43.680 and the core implementations of it so
00:01:46.799 yeah let's build ZJIT together with us and
00:01:51.119 I also maintain the latest stable branch
00:01:53.680 of Ruby releases uh I just released 3.4.3
00:01:57.680 this Monday um we also kind of
00:02:00.399 collaborate on backporting things to
00:02:02.960 the Ruby 3.4 branch uh inside the
00:02:05.840 Ruby infrastructure team I still
00:02:08.000 maintain the merge of everything but uh
00:02:10.560 we like share the maintenance of this
00:02:13.120 branch uh this
00:02:14.680 year so today's talk is about YJIT YJIT
00:02:18.400 stands for yet another just-in-time
00:02:21.319 compiler and a just-in-time compiler does
00:02:24.720 uh optimization
00:02:26.920 by switching from the interpreter uh
00:02:30.480 which interprets the virtual machine
00:02:32.640 instructions that are specific to the CRuby
00:02:35.319 implementation to machine code that can be
00:02:38.480 natively executed by Intel or ARM um
00:02:42.360 CPU and um the performance today is like
00:02:47.120 um this is the speed.yjit.org website
00:02:51.120 which shows YJIT's performance
00:02:53.599 compared to the interpreter on various
00:02:55.440 benchmarks and for example on the
00:02:58.080 railsbench which uh performs the Active
00:03:00.879 Record queries and does the HTML rendering
00:03:04.599 um it's twice as fast as the interpreter
00:03:07.519 today um so uh even like with the IO we
00:03:11.840 can do a lot of optimizations on those
00:03:14.120 benchmarks and these are still benchmarks
00:03:17.200 and not the real world um workload but
00:03:20.159 in the actual production workloads this
00:03:23.040 is the performance of uh Storefront
00:03:25.519 Renderer which has the like highest
00:03:27.360 traffic in our company and uh it shows
00:03:30.480 like 18% speed up on average and like
00:03:33.599 33% speed up on like region and like we
00:03:37.840 also deploy YJIT on Shopify's
00:03:41.120 largest monolith as well so it's
00:03:43.120 production-ready YJIT I mean sorry
00:03:44.720 production-ready JIT compiler and also
00:03:47.120 like not only Shopify but
00:03:49.760 also other companies uh enable YJIT in
00:03:52.080 production as well these are just the
00:03:53.760 articles that are written this year but
00:03:56.319 from last year and like years ago like
00:03:58.959 we've seen a lot of other articles that
00:04:01.280 said we enable YJIT in production so please
00:04:04.640 do that if you haven't but also another
00:04:07.920 thing is uh from Rails 7.2 if you're
00:04:12.159 using Ruby newer than 3.3 or equal to 3.3
00:04:16.239 um it's enabled by default by Rails so
00:04:19.519 if you upgraded Rails to 7.2 or newer and
00:04:23.600 Ruby 3.3 or newer and you switch the
00:04:26.479 defaults to 7.2 or newer then you might
00:04:29.199 be already running YJIT in production
00:04:31.040 without
00:04:32.759 noticing so and today's focus is going
00:04:36.320 to be on the something called
00:04:38.320 deoptimization
00:04:40.040 um if you may have attended RubyKaigi
00:04:44.240 before you might remember this RubyKaigi
00:04:47.360 2016 talk called Optimizing Ruby
00:04:50.320 done by Shyouhei and um that was about like
00:04:56.400 optimizing the Ruby interpreter by doing the
00:05:00.720 something called deoptimization
00:05:02.880 and that idea was pretty interesting and
00:05:06.800 it didn't end up being merged to the
00:05:08.720 master branch but today as of Ruby 3.4 or
00:05:13.600 3.5 we have the some like mechanism
00:05:16.479 called deoptimization in the master branch and like
00:05:18.320 it's already released in 3.3 3.1 3.2 yeah
00:05:22.400 every
00:05:23.320 release like has the mechanism called
00:05:26.680 deoptimization so what is it it's uh so what
00:05:30.880 if you can slow down Ruby at any time
00:05:33.919 you are probably not interested in
00:05:35.759 slowing down Ruby's performance but um
00:05:39.039 let's say you want to do some
00:05:40.479 optimization and if you can cancel it at
00:05:43.360 any time the optimization could be
00:05:45.520 anything like you can do whatever you
00:05:48.080 want and just throw it away if it's
00:05:50.600 necessary so that way um we can kind of
00:05:54.240 forget about Ruby's dynamic features
00:05:56.400 that prevents Ruby from being faster so
00:05:59.759 that's the something we call
00:06:02.919 deoptimization and to uh kind of give you
00:06:06.240 the hands-on experience um if you are
00:06:09.360 interested uh you can try building Ruby
00:06:12.319 with uh the configure flag called
00:06:14.720 --enable-yjit=dev and if you build Ruby that way um
00:06:18.479 CRuby is going to support this extra
00:06:21.199 command line flag called --yjit-dump-disasm
00:06:23.919 which shows the uh machine code for
00:06:26.240 every single method you have compiled
00:06:28.319 with YJIT so like for example um
00:06:32.000 if you build Ruby with --enable-yjit=dev
00:06:35.600 the -v with --yjit is going to show
00:06:39.840 +YJIT dev it's usually just
00:06:41.759 +YJIT but if it's +YJIT dev it
00:06:44.080 means it's a development mode of YJIT
00:06:46.000 so you can show the machine code with a
00:06:48.400 uh --yjit-dump-disasm option and the example
00:06:51.840 is like this um it's I'm going to talk
00:06:54.960 about this like similar code in the next
00:06:57.520 slides but uh we just define the method
00:07:01.120 and call it and redefine it and then
00:07:03.840 it shows um the machine code that was
00:07:06.319 generated for the deoptimization purposes
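A script in the spirit of the one described here might look like the sketch below. The method names are placeholders, not taken from the slides. The configure flag and --yjit-dump-disasm are the ones named in the talk; --yjit-call-threshold=1 is an extra assumption, added so that the methods get compiled on their first call.

```ruby
# Hypothetical demo script; method names are placeholders, not from the slides.
# Assumed invocation on a Ruby built with --enable-yjit=dev:
#   ruby --yjit --yjit-call-threshold=1 --yjit-dump-disasm demo.rb

def greeting
  "hello"
end

def caller_method
  greeting
end

# First call: code compiled for caller_method may assume `greeting` is unchanged.
first = caller_method

# Redefining the method breaks that assumption, so the code has to be invalidated.
def greeting
  "redefined"
end

second = caller_method
puts first, second
```

Run without the YJIT flags, the script behaves the same; only the disassembly output is missing.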
00:07:11.560 so I'm going to next talk about um
00:07:15.280 traditional YJIT
00:07:17.199 deoptimizations that have existed since
00:07:19.680 Ruby 3.1 so the major thing we do
00:07:23.919 for deoptimization is called code
00:07:26.039 patching so because we maintain
00:07:30.560 every single bit of machine code we
00:07:32.560 generate we know exactly which address
00:07:36.639 should be invalidated when we need to do
00:07:39.039 so so for example let's say you have
00:07:43.680 Ruby code like this you define a
00:07:45.919 constant called FOO and it's one and just
00:07:49.120 define a method called lowercase foo and
00:07:51.759 just refer to the constant the method
00:07:55.039 foo should return one so uh YJIT just
00:07:59.360 looks at the content of the constant and
00:08:02.479 uh embeds the actual uh content of the
00:08:05.199 constant like um three so as you may
00:08:09.199 know um integer is uh left shifted once
00:08:14.000 so it's like integer one is a three in
00:08:16.560 the machine code and um but if you do
00:08:20.160 this then the machine code will always
00:08:23.280 uh return one integer one from the
00:08:26.240 method foo but it's not necessarily true
00:08:29.800 if the constant is redefined because
00:08:32.560 Ruby's constants are just another kind
00:08:34.320 of global variable it's not actually a
00:08:36.959 constant and like could be redefined at
00:08:39.039 any time and when it happens what
00:08:41.760 YJIT does is just patch the code and uh
00:08:45.519 rewrite it to another instruction called
00:08:47.760 jump and if we do this jump uh that
00:08:51.120 could go to anything for example the
00:08:55.760 thing we actually do is jump to the um
00:08:59.080 trampoline which jumps to the
00:09:01.600 interpreter goes back to the interpreter
00:09:03.120 from the JIT code so by doing this uh we
00:09:06.320 can cancel the optimization of inlining
00:09:08.640 the content of uh FOO constant and then
00:09:11.680 go back to the interpreter when it uh
00:09:14.720 executes this um method
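The constant example can be reproduced at the Ruby level with the minimal sketch below. FOO and foo are the names used in the talk; the remove_const call is only there to redefine the constant without a warning, and the tagging arithmetic is a plain-Ruby illustration of the "one is a three" remark, not YJIT code.

```ruby
# Sketch of the constant example; FOO and foo follow the names used in the talk.
FOO = 1

def foo
  FOO
end

before = foo

# CRuby tags small integers: shift left once and set the lowest bit, so the
# Integer 1 is represented as the machine word 3.
tagged_one = (1 << 1) | 1

# Constants can be replaced at runtime, so any machine code that inlined the
# old value has to be patched to jump back to the interpreter.
Object.send(:remove_const, :FOO)
FOO = 2

after = foo
puts before, tagged_one, after
```

The observable result is the same with or without YJIT, which is the point: the inlined value is an optimization that must never be visible to the program.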
00:09:17.800 again and I also like to introduce uh
00:09:21.440 another uh deoptimization called
00:09:24.320 global invalidation so this is another Ruby
00:09:28.880 code I want you to think about what it
00:09:32.120 returns
00:09:33.640 um raise your hand if you think this is
00:09:36.480 not going to return
00:09:39.399 one you win so of course it's going to
00:09:44.160 return 5,000 trillion because it's Ruby
00:09:48.560 um so
00:09:51.399 there's thousands of features that break
00:09:54.399 this kind of optimization in Ruby and um
00:09:58.640 this example has a trace point uh in
00:10:02.000 particular it's a line trace point so if
00:10:04.959 you define a line trace point event um
00:10:07.600 when you execute another line you hook
00:10:10.480 the execution of the Ruby and do
00:10:12.720 anything there so in this example um
00:10:16.480 when you call the number method uh after
00:10:19.760 executing the one equals sorry one
00:10:22.160 equals one yeah one equals one then it's
00:10:24.560 going to rerun the block of the uh trace
00:10:27.600 point and it sets the uh local variable
00:10:30.959 again to 5,000 trillion so of course
00:10:34.399 it's going to return the number we
00:10:36.560 generated in the trace
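The slide itself is not in the transcript, so the following is a reconstruction under assumptions: the method name number and the local one come from the narration, while the TracePoint body and the constant name are guesses at how the trick is written.

```ruby
# Reconstruction of the line TracePoint example; details are assumed.
def number
  one = 1
  one
end

FIVE_THOUSAND_TRILLION = 5_000_000_000_000_000

trace = TracePoint.new(:line) do |tp|
  # Only touch frames of `number`; other lines have no local named `one`.
  next unless tp.method_id == :number
  # The hook runs before each line, so it overwrites the local after
  # `one = 1` has executed and before `one` is read.
  tp.binding.local_variable_set(:one, FIVE_THOUSAND_TRILLION)
end

result = trace.enable { number }
puts result
```

Because a hook like this can rewrite any local between any two lines, no per-method assumption survives it, which is why everything is thrown away at once.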
00:10:38.839 point and when we do that because trace
00:10:42.720 point messes up everything we just throw
00:10:45.760 away everything so like this example
00:10:47.920 shows like it's not the code we just
00:10:50.800 showed but uh it's just on the other
00:10:53.680 benchmark and like when trace point line
00:10:56.800 event is enabled we just generate
00:10:59.279 thousands of jump instruction to the
00:11:01.440 side um exit code to the interpreter so
00:11:06.079 um yeah this is what happens if you use
00:11:08.079 line trace point so today's takeaway is
00:11:10.399 like don't use line trace point in
00:11:15.240 production so these are the existing
00:11:18.800 traditional uh deoptimizations we had in
00:11:21.519 YJIT for years but uh I'm also going to
00:11:25.360 talk about new deoptimizations uh that
00:11:28.560 we added to Ruby
00:11:30.760 3.4 so the first thing I want to talk
00:11:33.519 about is the invalidation of escaped locals
00:11:37.760 uh so this is another example obviously
00:11:39.760 it's going to return 5,000 trillion but
00:11:42.480 uh I'm not going to use trace point but
00:11:44.800 uh can you guess what kind of Ruby
00:11:46.480 features you could use in
00:11:51.000 do_something so somebody called binding but
00:11:53.839 yeah it is so like um not just the
00:11:57.279 binding of the current frame but any
00:11:59.920 random Ruby method could look at the
00:12:02.480 caller frame arbitrary caller frame
00:12:05.120 using the C like public like official C
00:12:08.880 API we provide for messing up the Ruby
00:12:11.839 um you can look at the caller frame and
00:12:15.440 then mess up the frame basically so
00:12:18.880 whenever you call some arbitrary method
00:12:20.800 that's not inlined into your JIT code uh it
00:12:23.519 could just mess up the Ruby local
00:12:25.839 variables so it's not guaranteed that the
00:12:28.639 Ruby local variables are not going to be
00:12:31.360 uh modified by the callee methods even if
00:12:34.639 you are not passing the block to the
00:12:37.440 caller sorry
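A pure-Ruby way to see a callee change a caller's local is sketched below. It uses a block's binding, which is weaker than the case in the talk: the point there is that the C API allows the same thing even when no block is passed. do_something is the placeholder name from the talk; its body and the names example and number are assumptions.

```ruby
# `do_something` is the placeholder name from the talk; its body is an assumption.
def do_something(&block)
  # The block carries the caller's binding, so the callee can reach its locals.
  block.binding.local_variable_set(:number, 5_000_000_000_000_000)
end

def example
  number = 1
  do_something {}   # an empty block is enough to expose the frame
  number
end

puts example
```

Once a frame's locals have escaped like this, values cached in registers can be stale, so the compiled code that relied on them has to be discarded.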
00:12:39.560 method and with that uh we introduced a
00:12:43.519 new optimization called uh local
00:12:45.360 variable register allocation to Ruby 3.4 we
00:12:49.040 used to just write to memory like in
00:12:51.920 Ruby 3.3 we were writing only to the
00:12:55.279 memory when we needed to deal with the uh
00:12:58.000 Ruby locals but from 3.4 we use registers
00:13:02.000 as if it were a regular compiler and
00:13:05.040 of course we are going to throw away the
00:13:07.519 code when the binding is fetched by
00:13:10.079 any like callee or
00:13:13.560 wherever so another example uh I want
00:13:17.120 to talk about is called uh invalidation on
00:13:19.279 singleton classes so it's another
00:13:21.839 optimization we added to 3.4 so in this
00:13:24.800 example um the example method defines
00:13:28.639 the string local variable and then uh
00:13:34.040 concatenate another string uh returned
00:13:37.279 from the define method that's bar so if
00:13:40.480 you don't um specify the true to the
00:13:43.839 flag then it's going to uh concatenate
00:13:47.360 empty string and bar so it's going to
00:13:48.880 return bar and the next step is going to
00:13:52.880 uh set true to the flag so it
00:13:55.279 redefines the String#+ method on
00:13:58.240 that specific string object so instead
00:14:01.519 of concatenating empty string and
00:14:04.160 bar it's going to just return the foo
00:14:06.600 string so if you run this script with the
00:14:10.160 interpreter uh the correct result is
00:14:12.720 going to be uh print bar first and then
00:14:15.199 foo next but
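The script being described is not in the transcript; a sketch that matches the narration follows. The names example and define_plus are assumptions, and String.new is used instead of a string literal only to keep the sketch independent of frozen or chilled literal settings.

```ruby
# Sketch of the singleton-class example; method names are assumed.
def define_plus(str, flag)
  if flag
    # Redefine + on this one object only, which creates a singleton class for it.
    def str.+(_other)
      "foo"
    end
  end
  "bar"
end

def example(flag)
  str = String.new              # the receiver is evaluated (pushed) first...
  str + define_plus(str, flag)  # ...then the argument call may change its class
end

puts example(false)
puts example(true)
```

The receiver is already on the stack when define_plus runs, so a compiler that remembered "this is a plain String" at push time would dispatch to the wrong method.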
00:14:18.199 um in the if you think about how do you
00:14:21.360 compile this in YJIT um as of evaluating
00:14:25.360 the receiver of the plus instruction the
00:14:28.199 string is pushed to the like virtual
00:14:31.600 stack and then as of that because it's
00:14:34.079 initialized by the string literal of
00:14:36.079 course it's going to be bare string
00:14:37.680 nothing happens to the string yet even
00:14:39.839 if the flag is true so because the
00:14:43.120 string is pushed to the virtual stack
00:14:45.279 before executing the define method the JIT
00:14:48.800 compiler thinks it's going to be the
00:14:50.720 bare string and after pushing that to
00:14:53.760 the stack it's going to call the define
00:14:55.839 method and because the flag is true it
00:14:58.399 redefines the String#+ uh method on
00:15:00.880 that particular object so we swap the uh
00:15:04.480 class field of the object to the
00:15:06.880 singleton class that has this special
00:15:09.760 method definition like definition and
00:15:12.959 after doing so because the JIT thinks it's a
00:15:17.440 string uh it may think uh you don't need
00:15:21.920 to check if it's actually a string so if
00:15:24.480 we don't do anything with it um it could
00:15:27.680 be just like um return bar because we
00:15:30.800 skipped the redefinition and we
00:15:33.040 already compiled assuming that um the
00:15:35.920 String#+ is going to be the actual
00:15:38.560 String#+ so this is a
00:15:40.880 miscompilation that could happen if you
00:15:42.560 don't invalidate it on the singleton classes so
00:15:46.000 what we do today with Ruby 3.4 is that
00:15:49.680 um when singleton classes are created on
00:15:53.199 particular classes we track for example
00:15:55.440 like String uh Array and Hash um for
00:15:59.279 those objects um we check if any
00:16:03.440 singleton class is created for those
00:16:05.279 classes and if there isn't any singleton class
00:16:07.759 for those classes we just skip the uh
00:16:11.839 like elide the type check for those
00:16:14.000 classes and um invalidate it when um the
00:16:18.160 string uh singleton class is
00:16:21.399 defined uh the next thing I want to talk
00:16:24.320 about is lazy frame push
00:16:27.160 so with this example we have the string
00:16:32.160 empty string and setbyte zero uh just
00:16:35.120 letter a and um because the empty string
00:16:40.079 doesn't have any length you can't set
00:16:42.560 the um character to the index zero
00:16:45.279 because the length is zero so it
00:16:48.320 should raise the exception like index
00:16:51.680 zero out of string um but YJIT
00:16:57.120 optimizes this String#setbyte method so
00:17:00.880 we could behave like make it behave like
00:17:04.400 this so like um if you don't have any
00:17:07.640 invalidation YJIT could just skip pushing the
00:17:10.880 frame for the setbyte and just uh
00:17:13.120 inline the optimized instructions for
00:17:15.520 the setbyte implementation and then if
00:17:17.760 you do that um even if the argument is
00:17:21.120 invalid and should raise error because
00:17:24.079 we haven't pushed the method frame for
00:17:26.319 the setbyte the behavior could be
00:17:29.760 the lower one like which doesn't say
00:17:32.559 String#setbyte in the backtrace um so
00:17:36.480 it's the wrong behavior um the thing we
00:17:39.840 introduced to 3.4 is the lazy frame push
00:17:44.400 if you lazily sorry um when the setbyte
00:17:48.960 sees the invalid um argument it
00:17:51.919 internally um calls the CRuby um API that
00:17:57.200 allocates a new
00:18:00.520 um new exception object and when we
00:18:04.160 reach the uh C function that allocates
00:18:06.320 the exception object um it calls back
00:18:09.440 the YJIT hook that uh checks if it has
00:18:13.360 registered the invalidation for that um
00:18:16.720 program location and um if it's already
00:18:20.240 registered then uh we lazily uh push a
00:18:23.039 frame to the stack when
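The behavior that lazy frame push has to preserve can be checked from plain Ruby. This is a small sketch rather than the slide's code; String.new stands in for the empty string literal.

```ruby
# Observe the error and the top backtrace entry for an out-of-range setbyte.
begin
  String.new.setbyte(0, "a".ord)
rescue IndexError => e
  message = e.message
  top_frame = e.backtrace.first
end

puts message
puts top_frame
```

The top backtrace entry names setbyte even though, in optimized code, its frame is only pushed once the exception is about to be created.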
00:18:25.720 necessary and the way it works is um we
00:18:30.480 have uh this is the first time showing
00:18:32.880 the Rust code at RubyKaigi but um it
00:18:36.520 looks at the um hash table that has
00:18:40.080 the uh program counters as keys program
00:18:43.440 counter is like one to one mapping to
00:18:45.520 the program specific location and the
00:18:48.640 values are the something that we need
00:18:50.960 for pushing the frame basically um it
00:18:53.760 has the location of the receiver on the
00:18:56.000 stack and also has the method entry so
00:18:58.640 that we can materialize the method uh
00:19:01.520 content sorry the frame content and the
00:19:05.440 trick is like because method any method
00:19:09.280 could be executed from a same location
00:19:12.240 like for example even if you call the
00:19:15.039 setbyte method depending on the receiver
00:19:18.160 of the object the setbyte could be
00:19:20.160 something different for example you can
00:19:21.840 redefine the setbyte for string or any
00:19:24.559 other receiver could be used so uh the
00:19:28.480 program counter is not necessarily tied
00:19:30.720 to a specific method entry so um because
00:19:35.919 we kind of assumed that the setbyte is
00:19:39.600 not going to be uh a polymorphic call um we
00:19:43.919 assume that it's a one to one mapping
00:19:45.840 and if it's not we can just side like
00:19:48.320 exit to the interpreter when that
00:19:49.919 happens so that way we can uh calculate
00:19:53.360 the um the frame content we need to push
00:19:57.760 lazily based on the program counter um
00:20:01.280 which we have to set for other reasons
00:20:04.240 like uh tracing um and the side exits
00:20:07.520 and so we set the program counter anyway
00:20:10.080 so we use that for materializing the
00:20:12.720 frame
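The table described here lives in YJIT's Rust code, which the transcript does not show. The toy model below only illustrates the shape of the idea in Ruby: entries keyed by program counter, each holding where the receiver sits on the stack and which method entry to use. Every name in it is hypothetical.

```ruby
# Toy model only: names and layout are hypothetical, not YJIT's actual code.
LazyFrame = Struct.new(:receiver_stack_index, :method_entry)

# Keyed by program counter, one entry per call site whose frame push was skipped.
LAZY_FRAMES = {}

def register_lazy_frame(pc, receiver_stack_index, method_entry)
  LAZY_FRAMES[pc] = LazyFrame.new(receiver_stack_index, method_entry)
end

# Called when an exception object is about to be allocated: if the current
# pc has an entry, build the frame that was never pushed.
def materialize_frame(pc, stack)
  entry = LAZY_FRAMES[pc]
  return nil unless entry
  { receiver: stack[entry.receiver_stack_index], method: entry.method_entry }
end

register_lazy_frame(0x1000, 0, "String#setbyte")
frame = materialize_frame(0x1000, ["", 0, 97])
p frame
```

The one-entry-per-pc layout mirrors the assumption stated in the talk that such a call site is not polymorphic; a site that turns out to be polymorphic would side-exit instead.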
00:20:14.280 later um the next thing I'm going to
00:20:16.799 talk about is the YJIT-only Ruby
00:20:18.720 methods so as you may know Ruby is
00:20:22.080 faster than C so
00:20:24.520 we have method definitions like this for
00:20:28.320 Array#each so this is the like
00:20:31.679 historical uh C implementation for Array#each
00:20:34.320 we attempted to just replace that
00:20:36.720 with Ruby-based Array#each um which we
00:20:41.200 didn't end up doing but um in the
00:20:44.840 microbenchmark the Ruby version of the
00:20:47.280 Array#each was actually faster even on
00:20:50.159 the interpreter um but if you run the
00:20:53.919 benchmark like larger large enough
00:20:55.760 benchmark then uh it actually performed
00:20:58.400 worse which is why um we are now uh
00:21:02.240 thinking of like doing the switching
00:21:04.400 between the C and Ruby version so uh
00:21:08.159 with YJIT because going from Ruby to C and
00:21:13.600 then going back to uh JIT from the C
00:21:16.480 world is actually slow um so we don't
00:21:19.600 want to cross the do the boundary
00:21:22.159 crossing especially the C to Ruby is a
00:21:25.039 like bad bad thing so to avoid that um
00:21:29.039 like if you define Array#each in C then
00:21:32.320 you must uh cross the boundary like from
00:21:35.120 C to Ruby because the block is often
00:21:38.000 just defined in Ruby so we should
00:21:41.520 define Array#each in Ruby if you want to
00:21:44.320 make that not happen
00:21:47.240 so uh this is the complicated code we
00:21:52.960 wrote that's actually shipped in
00:21:54.640 Ruby 3.4 um so it's actually not really
00:21:59.280 deoptimization but like achieves the
00:22:01.520 same kind of purpose so what it does
00:22:04.880 is it checks if the core Array#each C
00:22:08.880 implementation has been redefined and if
00:22:11.520 it's not redefined then go like move on
00:22:14.640 to execute those like content and just
00:22:17.520 undefine the C-based implementation
00:22:21.240 and define the Ruby-based Array#each so and
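The shipped code relies on CRuby-internal primitives that cannot be used outside the core library, so it is not reproduced here. As a user-level sketch of what a Ruby-based each looks like, assuming a hypothetical helper named ruby_each:

```ruby
# User-level sketch; `ruby_each` is a hypothetical helper, not the core method.
def ruby_each(array)
  return array.to_enum(:each) unless block_given?
  i = 0
  while i < array.length
    yield array[i]   # the block is called from Ruby code, with no C frame in between
    i += 1
  end
  array
end

seen = []
ruby_each([1, 2, 3]) { |x| seen << x * 2 }
p seen
```

Since both the loop and the block are Ruby code, a JIT can compile the whole path without crossing from C back into Ruby for every element.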
00:22:26.159 the thing I want to talk about is like
00:22:29.240 uh this one so this is new in Ruby 3.4
00:22:34.240 and it's called like c_trace attribute
00:22:37.520 and um this Primitive.attr! is
00:22:41.919 the uh like special syntax that's
00:22:45.039 specific to the CRuby internal core
00:22:48.080 classes that was developed by Koichi
00:22:51.280 before and this is not a
00:22:54.000 method call but it's just uh parsed by
00:22:57.360 Prism and then like uh compiled into
00:23:00.159 special binary that's only possible
00:23:02.159 inside the CRuby core classes and if you
00:23:04.880 have the c_trace attribute this method
00:23:07.600 entry is going to have the flag called
00:23:10.320 c_trace and if it does have the c_trace
00:23:13.039 flag what is that is um so this is the
00:23:18.480 um original behavior of the exception
00:23:20.960 raised inside the block given
00:23:23.600 to Array#each so if you raise something
00:23:26.320 inside the block given to Array#each uh
00:23:29.440 the backtrace of course should um have the
00:23:32.320 Array#each entry and the interesting
00:23:35.200 part is
00:23:37.000 um because Array#each is defined in C
00:23:41.520 in this example um it shows the file
00:23:45.039 name called like -e which is the
00:23:47.679 -e given to the ruby command however
00:23:51.679 um what happens if we uh redefine that
00:23:55.679 in Ruby is usually like this so the
00:23:58.159 lower one is the version that defines
00:24:01.120 Array#each in Ruby so um when we
00:24:04.960 reimplement something in Ruby um the
00:24:08.360 backtrace has the uh Ruby file name I mean
00:24:12.159 it's actually not the file name like
00:24:14.240 it's just <internal:array> for core
00:24:16.320 classes defined in Ruby but uh the file
00:24:20.080 name is going to be different like for
00:24:22.240 some reason C methods just look at the
00:24:25.039 caller frame's location and show -e
00:24:28.240 which is not actually the location that
00:24:30.240 defines the Array#each written in C but um anyway
00:24:34.080 if it's rewritten in Ruby uh this is
00:24:37.039 going to be different but this is
00:24:38.960 something we don't want to happen
00:24:41.600 because
00:24:43.480 um
00:24:45.000 the if this happens like we redefine so
00:24:49.520 if you enable YJIT um because of the
00:24:54.400 uh with_yjit helper um we swap
00:24:58.640 the uh implementation of the Array#each
00:25:00.960 only if YJIT is enabled so uh what
00:25:05.600 happens in real world is if you
00:25:08.559 enable YJIT suddenly your test cases that
00:25:12.880 match the backtrace are going to break like
00:25:15.679 for some reason people like to test back
00:25:17.520 traces so these tests are going to break
00:25:20.080 just because you enable YJIT so we
00:25:22.720 don't want that to happen so like um to
00:25:25.919 prevent that we added the c_trace
00:25:27.760 attribute and that's what we do for
00:25:30.240 fixing or dealing with this problem so
00:25:32.799 like um the switch from the C like -e
00:25:36.799 to the <internal:array> happens all the
00:25:38.799 time when you upgrade the Ruby minor
00:25:41.360 versions because minor versions may
00:25:43.200 have backward incompatibilities but
00:25:45.760 switching in like in between the
00:25:47.520 interpreter and YJIT should be as smooth as
00:25:49.679 possible so uh the c_trace uh attribute
00:25:53.039 is what we have for uh fixing that
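The kind of backtrace a test might match can be inspected with a few lines of Ruby. This is a sketch, not the slide's code, and the exact label format of the entry differs between Ruby versions, so only the presence of the each frame is shown.

```ruby
# Raise inside a block given to Array#each and look for the `each` frame.
begin
  [1].each { raise "boom" }
rescue => e
  frames = e.backtrace
end

each_frame = frames.find { |line| line.include?("each") }
puts each_frame
```

A test that compares this line verbatim is exactly what would break if the reported file changed from the caller's file to <internal:array> depending on whether YJIT is enabled.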
00:25:56.919 problem uh I talked about it a bit fast
00:26:01.200 but that now comes to the conclusion so
00:26:04.080 deoptimization enables
00:26:06.039 speculative optimizations with like lazily
00:26:09.120 invalidating throwing away the code later
00:26:11.679 and um Ruby 3.4 uh YJIT optimizes
00:26:15.600 the local variables and method calls
00:26:17.279 using the technique called
00:26:18.480 deoptimization which is the
00:26:20.400 conclusion of today thank you for coming
00:26:22.240 to the talk