Summarized using AI

Toward Ractor local GC

Koichi Sasada • April 18, 2025 • Matsuyama, Ehime, Japan • Talk

The talk presented by Koichi Sasada at RubyKaigi 2025 focuses on the challenges and solutions regarding garbage collection (GC) in Ruby, particularly in a multi-Ractor environment. Sasada highlights the need for better performance and concurrency in GC processes, as the current method of shared global heaps requires stopping all Ractors during GC, severely limiting performance.

Key Points Discussed:

  • Introduction to Ractor Model: Ractors, introduced in Ruby 3.0, allow for parallel computation and isolated object spaces, improving concurrency and safety in Ruby applications.
  • Challenges of Current Garbage Collection: The global garbage collection approach halts all Ractors, hurting performance. The collector gathers roots and marks live objects accurately, but forgoes the benefits of parallel execution.
  • Proposal for Ractor-local GC:
    • Local GC Idea: Each Ractor could run its own GC independently, targeting non-shareable objects in parallel.
    • Complexities with Shareable Objects: Tracking references to shared objects (like classes and modules) is crucial due to their potential accessibility from multiple Ractors, which complicates independent GC operations.
    • Conservative Local GC Implementation: Sasada proposes treating shareable objects as always alive during local GC runs, marking them as roots so they are never inadvertently freed. This delays reclamation of their memory but enables significant performance gains.
  • Performance Testing and Results: The implementation showed substantial improvements in microbenchmarks, with speedups reaching nearly five times in specific cases, especially when many Ractors handle short-lived objects more efficiently than the current global collection method.
  • Future Goals: The aim is to finalize the implementation for Ruby 3.5 and expand parallelization into global GC as necessary, enhancing overall performance while gradually exploring more complex solutions in the future.

Conclusion:

Sasada concludes that while the proposed Ractor-local GC is not the perfect solution, it promises significant performance improvements for Ruby's garbage collection by leveraging the isolation provided by Ractors, allowing them to operate concurrently without the bottlenecks seen in the current global GC approach.

Toward Ractor local GC
Koichi Sasada • Matsuyama, Ehime, Japan • Talk

Date: April 18, 2025
Published: May 27, 2025

Garbage collection (GC) is a big challenge in multi-Ractor systems. In the current implementation, all Ractors share a global heap, and GC requires stopping all Ractors to perform GC. This approach negatively impacts object allocation and GC performance.

Ideally, GC should be executed in parallel for each Ractor's local heap. However, shareable objects make it difficult to track references and implement fully parallel GC.

To address this, we propose Ractor-local GC, which focuses on collecting non-shareable, Ractor-local objects in parallel, while continuing to perform global GC for shareable objects by pausing all Ractors, as in the current system. This hybrid approach allows most GC operations to run in parallel, improving overall performance. Furthermore, since shareable objects are expected to be relatively few compared to non-shareable objects, the frequency of global GC should remain low.

This talk will present the design of Ractor-local GC and the progress of its implementation.

https://rubykaigi.org/2025/presentations/ko1.html

RubyKaigi 2025

00:00:36.399 So most Japanese people make first contact in English, so I tried to say hello in Japanese, but I will give this presentation in English, so please enjoy.

00:00:55.719 Today I'd like to talk about garbage collection in Ruby, more specifically how we can move toward Ractor-local garbage collection for better concurrency and performance. So let's get started.

00:01:15.400 Let me start with the conclusion of this talk.
00:01:22.680 Introducing Ractor-local GC is a promising direction, but it's also a very challenging one. The biggest issue is cross-references: as soon as objects start being shared, things get complicated. So we propose a conservative approach: we track all shareable objects and treat them as GC roots. As long as the number of such objects is small, this is a feasible approach; in the worst case it behaves the same as the current garbage collector, which always does a global collection. We've observed significant performance improvements in microbenchmarks with this approach, and I'll explain how we got to this point in the following slides.
00:02:23.720 Before going into the technical details, I want to acknowledge that this work is a collaboration with Rohit Menon. He was a Google Summer of Code developer in 2022, and I had the pleasure of mentoring him on a project titled "Optimizing garbage collection for Ruby Ractors". That was the first step, the first trial, of Ractor-local garbage collection. I initially proposed the core ideas and we discussed them together; Rohit then continued developing his own implementation, extending my idea, over the next three years. I learned from his experiments and developed my own garbage collector. So today I present my own work, but this presentation is a collaboration with Rohit.
00:03:35.080 Let me introduce myself. I'm Koichi Sasada. Over the years I have worked on several parts of the Ruby implementation, and I have been a director of the Ruby Association since 2012. I'm also the author of the book on sale at the second floor; thank you for coming and buying many copies.
00:04:01.760 Before diving into garbage collection, let me mention my proposal to change the Ractor API in this ticket. The idea is to introduce a new communication primitive called Ractor::Port and to deprecate the current Ractor#send and Ractor#take pair for communicating with each other. Conceptually, a port is a small enhancement of the mailbox concept from the actor model: it provides a cleaner, clearer, and more explicit interface for communication between Ractors.

00:04:44.440 At today's developers' meeting, in this morning's session, there was a lot of discussion about compatibility and about the warnings Ractor shows. This feature is experimental, so we can change the API easily.
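For context, the current message-passing primitives that the port proposal would supersede look like this. A minimal sketch, not from the slides; the `ractor_result` helper is my own, to cover Rubies where `Ractor#take` is replaced by `#value`:

```ruby
# Current Ractor messaging: #send pushes into a Ractor's mailbox,
# Ractor.receive pulls from it, and the block's final value is
# retrieved with #take (or #value in newer Rubies).
def ractor_result(r)
  r.respond_to?(:value) ? r.value : r.take
end

worker = Ractor.new do
  msg = Ractor.receive   # wait for a message in this Ractor's mailbox
  msg * 2                # block result becomes the Ractor's value
end

worker.send(21)
ractor_result(worker)    # => 42
```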
00:05:12.240 OK, let me give you some background on garbage collection for Ractors. The Ractor was introduced in Ruby 3.0 as a new concurrency model. It was designed to enable two things. First, parallel computing in Ruby: taking advantage of parallel hardware for faster Ruby applications. Second, robust concurrent programming: as you know, shared-everything thread programming has many issues, and the isolated object spaces that Ractors provide let us avoid common concurrency bugs, especially those caused by shared mutable state. Overall, Ractors give us a safer and faster way to write concurrent programs in Ruby.
00:06:11.280 Threads inside a single Ractor still need to acquire the GVL, which means the threads in one Ractor don't run in parallel: only one thread per Ractor can run Ruby code at a time. However, each Ractor has its own GVL, and threads in different Ractors can run in parallel. So on parallel hardware, such as a multi-core CPU, threads in Ractor 1 and Ractor 2 can be scheduled independently and actually run at the same time on different cores.

00:06:54.280 Each Ractor in Ruby has its own logically isolated object space. That means every object belongs to exactly one Ractor and basically cannot be accessed directly from other Ractors.
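As a small illustration (my own example, not from the slides), CPU-bound work in separate Ractors can run simultaneously on different cores; the `ractor_result` helper covers both the `#take` and newer `#value` APIs:

```ruby
# Four Ractors computing in parallel; each has its own GVL and its own
# logically isolated object space. Arguments are copied (or moved) in.
def ractor_result(r)
  r.respond_to?(:value) ? r.value : r.take
end

ractors = 4.times.map do |i|
  Ractor.new(i) do |n|
    (1..1_000).sum * (n + 1)   # CPU work, runnable truly in parallel
  end
end

results = ractors.map { |r| ractor_result(r) }
# results == [500500, 1001000, 1501500, 2002000]
```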
00:07:14.360 Basically, here is a very important rule about Ractors: you cannot access objects in other Ractors. This means that Ractor 1 can't read or write objects belonging to Ractor 2. This strict isolation guarantees thread safety and avoids bugs from shared mutable state. It is the foundation of robustness and safety.
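This rule is enforced at the language level. For example (my own snippet), a block given to `Ractor.new` is rejected outright if it tries to capture an outer local variable:

```ruby
# Ractor isolation in action: a block given to Ractor.new may not touch
# outer local variables, so this fails at Ractor creation time.
secret = +"mutable state"

begin
  Ractor.new { secret.upcase }
rescue ArgumentError => e
  puts e.message   # e.g. "can not isolate a Proc because it accesses
                   # outer variables (secret)"
end
```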
00:07:49.520 Now, if each Ractor has its own isolated object space, it seems natural to run garbage collection independently, or in parallel, in each Ractor. In this slide, each Ractor has its own roots, so we can mark all the objects in each Ractor and collect garbage independently and in parallel. This idea is what I call Ractor-local garbage collection, and it is what I want to propose today. On the surface it seems like a very natural optimization, not a strange one.
00:08:39.800 However, there is a problem: Ractors can share certain objects, called shareable objects. These include classes and modules, frozen objects that reference only other shareable objects, and some special classes such as Ractor objects themselves. In this example, both Ractor 1 and Ractor 2 hold references to the same shareable object, and that is allowed; it is totally fine. So while most objects are isolated to one Ractor, shareable objects act as exceptions and can be referenced safely from multiple Ractors.
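In code, shareability can be checked and conferred with `Ractor.shareable?` and `Ractor.make_shareable`; a quick illustration of the rules above:

```ruby
# Which objects can cross Ractor boundaries by reference?
Ractor.shareable?(42)              # => true  (immutable value)
Ractor.shareable?(:sym)            # => true
Ractor.shareable?(String)          # => true  (classes and modules)
Ractor.shareable?(+"text")         # => false (mutable String)
Ractor.shareable?("text".freeze)   # => true  (frozen, no mutable refs)
Ractor.shareable?([1, 2])          # => false (mutable Array)

# Deep-freeze an object graph to make it shareable:
config = Ractor.make_shareable({ retries: 3, hosts: ["a", "b"] })
Ractor.shareable?(config)          # => true
```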
00:09:38.519 Now here is the key question: if we allow each Ractor to run garbage collection independently, how should we deal with shareable objects? Let's walk through a simple example. Step one: Ractor r1 owns a shareable object, sh1. Step two: Ractor r2 holds a reference to sh1. Now, when r1 runs its local garbage collection, it doesn't see any reference to sh1 from within r1 (there are no references to sh1 inside r1), so the garbage collector would mistakenly free sh1. But we shouldn't free sh1, because r2 still needs it. This is the problem.
00:10:48.399 With that background in mind, let's move on to the main challenges of implementing Ractor-local GC. The first, ideal idea is to track references to shareable objects across Ractors. If r1 knows that sh1 is referenced from other Ractors, then we can take that reference into account and keep sh1 from being freed. In this example, sh1 is only reachable from r2, so the challenge is how to maintain the reference information recording that sh1 is referenced from other Ractors.
00:11:41.480 In theory, tracking references between Ractors sounds reasonable, but in practice it is very difficult. References can be updated at any time, and Ractors run in parallel, which makes it hard to keep the reference state accurate across all of them. And if the tracking information becomes outdated or incorrect, the garbage collector might miss a reference and accidentally free a live object. Also, from a performance perspective, tracking these inter-Ractor references can introduce speed and memory overheads.
00:12:29.040 Let's walk through a tricky scenario. First, imagine that sh1, a shareable object, is sent from Ractor 1 to Ractor 2, so r2 holds a reference to sh1. When we send the object, the runtime detects that the reference is passed to Ractor 2, so we can record that sh1 is referenced from elsewhere, and that information is used to avoid freeing sh1 during local GC in r1. No problem so far.

00:13:26.600 Now r2 obtains a reference to sh3, which it can reach through sh1. But here is the problem: the reference-tracking information still only says that sh1 is referenced; it hasn't been updated to include this new reference to sh3, so sh3 will not be marked correctly on its own. Just now there is no problem, because sh1 is marked as referenced, and the marking traversal marks sh1, sh2, and sh3 as live objects. The problem comes if sh2 removes its link to sh3: then there is no information saying that sh3 should not be freed.

00:14:48.920 That means that in Ractor r1's local GC, sh3 can be freed, and if Ractor 2 later tries to use sh3, it has already been freed, and that causes a crash. Keeping the tracking up to date would take many, many efforts and overheads: we could record a reference every time one is obtained from a shareable object, but for many reasons we cannot introduce such a mechanism. So this is the challenge, and let's not try to solve this particular challenge.
00:15:42.680 Because solution one seems difficult, recall what we are using now: a global object space. The current implementation, from Ruby 3.0 to Ruby 3.4 and the master branch, uses a global object space, which means there is no per-Ractor heap; all objects, whether created in r1 or r2, are managed together. When we need to run the garbage collector, Ruby has to stop all Ractors and then traverse from all roots, marking live objects. This is accurate (it never misses a reference), but it is slow, because we need to stop all of the Ractors and traverse all objects across Ractors, and we can't utilize parallelism.
00:16:49.199 This compares the timelines of the two GC strategies: solution one, local GC, and solution two, the current global garbage collector. With local GC, each Ractor runs its garbage collector independently, which allows GC to run in parallel; but as we've seen, it is difficult to implement. In contrast, with the current global garbage collection, solution two, if one Ractor requests a pause, the runtime waits until all Ractors are stopped, then runs the garbage collector on the shared heap. This is simple, and it is space-efficient because free slots can be shared between Ractors, but unfortunately it is slow, and it blocks parallel execution.
00:18:02.840 Our goal is to introduce local GC to improve performance, running garbage collection in parallel and independently for each Ractor, like the Erlang and Elixir runtimes do. But the big challenge is shareable objects: it is difficult to track references that cross Ractors, and that is exactly why Ractor-local GC has been missing for years. It is very hard to get right. So the core challenge is: how do we design a safe and practical algorithm that enables Ractor-local GC without breaking anything?
00:19:00.000 Now that we've seen the challenge, how do we safely implement Ractor-local GC, especially with shareable objects? Let's dive in. Our proposal is a simple and conservative one: we do not try to detect the exact liveness of shareable objects. Instead, we assume that all objects marked as shareable are alive during local garbage collection, because it is really hard to track cross-Ractor references safely. This algorithm works because the number of shareable objects is small, and we treat them as root objects during Ractor-local GC. Then, to actually collect the shareable objects when needed, we fall back to global GC; we already have a global GC, so we can reuse it.
00:20:16.400 Global GC pauses all Ractors and checks every reference, as I said, so it is slow, but we can run it much less frequently than the current implementation does. This is why I propose this design: local GC will be enough, I think.
00:20:40.960 This figure shows how we track shareable objects. For each Ractor we maintain a set of shareable objects; during local garbage collection we treat these as root objects and mark them as if they were always alive. This is safe, in that we never miss a live reference, but there is a trade-off: a shareable object that is no longer reachable won't be collected until a global GC happens. That means the memory for such objects may stay around longer. The benefit is that we can safely run local GC in parallel, per Ractor.

00:21:33.080 In practice, most shareable objects are long-lived. For example, classes, modules, methods, and bytecode are of course long-lived objects, and Ractor objects themselves also tend to live longer. So naturally, most shareable objects will be long-lived.
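To make the conservative rule concrete, here is a toy, pure-Ruby simulation of the mark phase (my own model, not CRuby's collector): local marking starts from the Ractor's roots plus the recorded shareable set, so unreachable local objects are reclaimed while every shareable object survives until a global GC.

```ruby
# Toy model of conservative Ractor-local marking (illustration only).
Node = Struct.new(:name, :refs)

def local_mark(roots, shareable_set)
  marked = {}
  stack  = roots + shareable_set   # conservative rule: shareables are roots
  until stack.empty?
    obj = stack.pop
    next if marked[obj]
    marked[obj] = true
    stack.concat(obj.refs)
  end
  marked.keys
end

sh1   = Node.new(:sh1, [])     # shareable, no local references to it
live1 = Node.new(:live1, [])   # reachable Ractor-local object
dead1 = Node.new(:dead1, [])   # unreachable local garbage

live = local_mark([live1], [sh1])
live.map(&:name).sort   # => [:live1, :sh1]  (dead1 is collected;
                        #     sh1 survives despite no local references)
```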
00:22:03.960 So we decided to employ this idea. This is the proposed timeline with our conservative approach: in most cases we run local GC independently, in parallel, and only infrequently do we need to run a global GC.

00:22:30.919 From this work we learned a key insight: done is better than perfect. We couldn't introduce Ractor-local GC for years because solution one could never be completed, and we discussed it for several years. Today's proposal is not the best, but it improves performance, so we should introduce it first; after that, we can discuss the precise solution.
00:23:09.400 This slide is a summary of what I learned from this work. I implemented local GC with these steps, but let's skip the details, as there is no time to cover them fully. I did many, many things to get there, and we have many techniques for improving performance, but I will skip those slides as well. I will publish the slides, so if you are interested, please check them or ask me later.
00:23:47.480 From here, I'd like to talk about the evaluation. We measured the performance of our proposal with microbenchmarks, comparing it against the current Ruby implementation. Here is the benchmark environment: an Intel CPU with 14 cores, six performance cores and eight efficiency cores. For the microbenchmarks we repeat a simple task N times across Ractors: each Ractor performs TN tasks, where TN is defined as N divided by the number of Ractors. The benchmark script is available on this GitHub page.
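The actual script is on the linked page; as a hypothetical sketch of the methodology described above (method names and the task body are my own), each of the N Ractors performs TN = total / N iterations of a tiny allocating task:

```ruby
# Hypothetical microbenchmark shape: N Ractors, each doing TN iterations.
def ractor_result(r)
  r.respond_to?(:value) ? r.value : r.take   # cover both take/value APIs
end

def run_bench(n_ractors, total: 100_000)
  tn = total / n_ractors   # TN: total tasks divided across the Ractors
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ractors = n_ractors.times.map do
    Ractor.new(tn) do |count|
      count.times { String.new }   # allocate short-lived objects
      :done
    end
  end
  ractors.each { |r| ractor_result(r) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

# Compare e.g. run_bench(1) against run_bench(8) on a multicore machine.
```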
00:24:32.520 First of all, we show the simplest benchmark: each iteration allocates a short-lived String object, as shown in the code snippet here. It just creates and discards an empty string TN times in each Ractor. On the x-axis is the number of Ractors, and on the y-axis is the execution time in seconds, so lower is better. The blue line shows the current Ruby master, and the orange line is our local GC version. You can clearly see that with more Ractors, our version scales better.

00:25:22.240 This is the same benchmark as before, but shown as the speed-up ratio compared with single-Ractor execution time, so higher is better. As you can see, the orange line for our modified version scales quite well, achieving nearly a five-times speedup with 16 Ractors. In contrast, the master version shows almost no parallel benefit, because all Ractors are stopped for global GC, eliminating the advantage of parallel execution. This result shows that local GC works effectively for short-lived Ractor-local objects.
00:26:09.559 The next example makes long-lived objects: it creates many arrays but never reads them. The master version shows a clear bottleneck: as we increase the number of Ractors, execution gets worse even though it runs in parallel. In contrast, our local GC version handles this very efficiently.

00:26:48.919 This is the speed-up ratio for the same benchmark, so higher is better. Again, the master branch doesn't scale; performance actually gets worse with more Ractors, because every Ractor contributes to triggering global GC more frequently. In contrast, the local GC version continues to scale, achieving over a four-times speedup with 16 Ractors. This result shows how local GC helps with long-lived Ractor-local objects.
00:27:32.039 Here is a benchmark running simple regular-expression matching, and it also shows good performance compared with master. This is the speed-up ratio, where the orange line is the modified version and higher is better; our local GC performs better than the current one. The measured time includes the regular-expression execution itself, so the improvement is not as large as in the last two examples, but it still performs well compared with the current implementation.
00:28:16.399 Let's wrap up with today's remaining topics. We need to complete the implementation for Ruby 3.5 to introduce Ractor-local GC, and we have many tasks left. We also want to add parallel marking to the global GC: we have several cores, so we can utilize them. And after that, we want to introduce the more difficult solution one for even more performance. Rohit said he might have a good idea, and that he has a good implementation for it, so we can look forward to his great news after that.
00:29:08.000 This is the summary of the presentation: I explained why Ractor-local GC is difficult, proposed a conservative approach, and showed that it improves performance significantly on microbenchmarks. Once again, this work is a collaboration with Rohit Menon. Thank you for listening; questions are very welcome, so please ask me later or at the STORES booth. Thank you so much.