Summarized using AI

Modular Garbage Collectors in Ruby

Peter Zhu • April 18, 2025 • Matsuyama, Ehime, Japan • Talk

Overview of Modular Garbage Collectors in Ruby 3.4

In this talk, Peter Zhu from the Ruby core team presents an overview of the new Modular Garbage Collector (GC) feature introduced in Ruby 3.4. This experimental API allows developers to implement various garbage collection strategies, significantly enhancing Ruby's memory management capabilities.

Key Points Discussed:

  • Current Garbage Collection Mechanism: The existing garbage collector in Ruby is a mark-sweep-compact collector, effective but lacking modern optimizations such as parallelism and alternative algorithms.
  • Need for New Implementations: There are many existing algorithms (like reference counting and semi-space copying) that can improve performance, but they are not supported by Ruby's current implementation due to design constraints.
  • Introducing Modular GC: The Modular GC introduced in Ruby 3.4 allows the integration of alternative garbage collectors. This modularity helps avoid the drawbacks of a single, monolithic garbage collector by enabling developers to select the GC that best fits their application's needs.
  • Implementation Details: Peter explains how the modular GC API is structured, including boot-up functions, memory allocation, and the lifecycle of objects managed by the GC. This facilitates communication between Ruby and the GC implementations, which can now be loaded as shared libraries.
  • Memory Management Toolkit (MMTk): The talk highlights MMTk, a robust framework for garbage collection that ruby developers can integrate into their applications. MMTk offers various garbage collection algorithms that are adaptable depending on application needs.
  • Future Roadmap: Peter discusses the ongoing development of both the Modular GC and MMTk integration, noting the aim to support features such as generational collection and parallelism in future releases. This includes developing plans that would allow Ruby to benefit from multi-core processors.

Concluding Remarks:

Peter concludes that while Ruby's traditional garbage collector has served well, the introduction of modularity allows for flexibility and performance improvements in memory management. Developers are encouraged to explore and create their own implementations using the experimental API, keeping in mind that it is still subject to change.

This talk is a valuable resource for Ruby developers interested in optimizing memory management through new techniques and tools, aligning Ruby with modern software demands.

Modular Garbage Collectors in Ruby
Peter Zhu • Matsuyama, Ehime, Japan • Talk

Date: April 18, 2025
Published: May 27, 2025
Announced: unknown

Introduced in Feature #20470, Ruby 3.4 ships with an experimental API for implementing garbage collectors.

In addition to the built-in garbage collector, Ruby 3.4 also ships with an experimental garbage collector implemented using the Memory Management Toolkit (MMTk) framework. MMTk provides a wide variety of advanced garbage collector implementations such as Immix and LXR.

In this talk, we will introduce the Modular GC API, look at how MMTk is implemented using this API, discuss our current progress and future roadmap, and how you can implement your own garbage collector using this API.

https://rubykaigi.org/2025/presentations/peterzhu2118.html

RubyKaigi 2025

00:00:04.400 hello Ruby Kai my name is Peter and I'm
00:00:08.480 on the Ruby core team and I'm a senior
00:00:10.719 developer at Shopify today I'll be
00:00:13.360 talking about modular garbage collectors
00:00:15.679 which is a experimental feature in Ruby
00:00:19.000 3.4 you can get a copy of these slides
00:00:22.000 by scanning this QR code or by following
00:00:24.400 this URL don't worry if you end up
00:00:26.640 missing this QR code because I'll also
00:00:28.480 show this QR code at the end of this
00:00:30.920 presentation so a little bit about me
00:00:33.280 first my name is Peter i'm currently
00:00:35.280 based in Toronto Canada i'm on the Ruby
00:00:38.239 core team and I'm a senior developer at
00:00:40.320 Shopify on the Ruby infrastructure team
00:00:43.040 where I work on performance and memory
00:00:44.800 management in Ruby i'm the co-author of
00:00:47.600 the variable with allocation in Ruby
00:00:49.840 which improves performance and memory
00:00:51.520 efficiency of the Ruby garbage collector
00:00:54.640 i'm also the co-author of the Ruby free
00:00:57.440 at exit feature which uh frees memory at
00:01:00.320 shutdown to allow the use of memory leak
00:01:02.559 checkers such as valgrind or Mac OS
00:01:04.640 leaks tool i'm also um the author of the
00:01:08.880 Ruby mem and autotuner gems and in my
00:01:11.360 free time I like to travel and take
00:01:13.280 photos and I post them on Instagram at
00:01:16.840 peterzoo.phos so I'm going to give a
00:01:19.280 brief overview of this talk first so
00:01:21.759 first I'll be talking a little bit about
00:01:23.600 Ruby's mark sweep compact garbage
00:01:25.759 collector then I'll be talking about the
00:01:27.920 missing features from this garbage
00:01:29.759 collector and the motivation of
00:01:31.360 introducing alternative garbage
00:01:33.119 collector
00:01:34.119 implementations next we'll be taking a
00:01:36.640 look at the modular garbage collector
00:01:38.320 feature in Ruby 3.4 which allows us to
00:01:40.799 use alternate garbage collector
00:01:42.479 implementations inside of Ruby we'll be
00:01:45.520 taking a look at how the garbage
00:01:47.520 collector communicates with the Ruby
00:01:49.040 virtual machine using the modular GC uh
00:01:51.520 API and briefly look at how to build
00:01:54.240 Ruby with modular GC enabled and then uh
00:01:57.439 we'll take a look at how to build the
00:01:59.040 two garbage collector implementations
00:02:00.960 that ship with
00:02:02.680 Ruby and finally we'll be taking a look
00:02:05.200 at MMTK which stands for the memory
00:02:07.520 management toolkit this is the first
00:02:09.759 user of the modular GC that is not the
00:02:12.640 built-in GC in Ruby we'll look at the uh
00:02:16.800 the features that MMTK supports and the
00:02:19.520 implementation of MMTK through the
00:02:21.920 modular GC API and then we'll take a
00:02:24.959 look at the future road map of MMTK and
00:02:27.200 modular GC let's first take a a quick
00:02:31.040 look at what a garbage collector is so
00:02:33.840 at a high level the garbage collector is
00:02:35.760 responsible for three things it first
00:02:38.000 performs object allocation when you
00:02:40.000 create a new object for example a hash
00:02:42.480 or an array or a class uh Ruby requests
00:02:45.519 a piece of memory from the garbage
00:02:47.280 collector and then during the allocation
00:02:49.840 of the object the garbage collector can
00:02:52.000 choose to run a garbage collection cycle
00:02:54.160 to free up some of that memory so it is
00:02:56.959 now responsible for determining which
00:02:58.800 objects are alive and which objects are
00:03:00.720 dead there are many techniques for doing
00:03:03.040 this and Ruby does this using tracing uh
00:03:05.760 by tracing object references from root
00:03:08.440 objects root objects are things such as
00:03:11.200 global variables and constants inside of
00:03:13.360 Ruby after determining the liveless of
00:03:16.319 objects uh the garbage collector
00:03:18.480 reclaims objects that have died and this
00:03:21.120 includes reclaiming resources of objects
00:03:24.360 uh resources of the object such as
00:03:27.440 memory allocated from the system closing
00:03:29.760 file descriptors and running finalizers
00:03:32.400 and once uh the the object has been
00:03:34.879 reclaimed by the garbage collector this
00:03:36.560 memory can be reused by the garbage
00:03:38.239 collector for a new
00:03:39.959 object so to allocate object Ruby uses a
00:03:43.480 freelocator and uh how the free list
00:03:46.080 allocator works is very simple it's just
00:03:48.000 a link list of empty memory that we can
00:03:50.400 allocate into for example here we have a
00:03:53.280 Ruby heap with space for six objects we
00:03:55.599 have three objects that are allocated
00:03:57.200 here which are of type string array and
00:04:00.000 regular expression and we have three
00:04:01.920 empty slots here that we can allocate
00:04:03.519 objects into for the free list allocator
00:04:06.000 we build a linked list of slots just
00:04:08.319 like this and then when we need to
00:04:10.319 allocate an object and uh it it
00:04:13.200 allocates into the head of the free list
00:04:14.959 and we move the free list to the next
00:04:16.720 element and then we can repeat this
00:04:18.880 until all of the slots have been filled
00:04:22.199 up and then once all the slots have been
00:04:24.639 filled up what do we do we run a garbage
00:04:26.880 collection cycle to reclaim any dead
00:04:29.440 objects ruby uses the mark sweep compact
00:04:32.320 garbage collector which as the name
00:04:33.919 implies has three phases first we have
00:04:36.240 the mark phase which determines which
00:04:37.919 objects are alive the sweep phase which
00:04:40.160 determines of which reclaims all of the
00:04:41.840 dead objects and the optional compact
00:04:43.759 phase which moves objects so we don't
00:04:45.840 have holes or fragmentation in the
00:04:48.120 memory so now I'll show the marking
00:04:50.479 phase using an example so here uh we
00:04:53.280 have an example heap uh from the
00:04:55.360 previous example and let's add object
00:04:57.759 references here where arrows indicate
00:04:59.840 that a particular object refers to
00:05:01.520 another object and we have the root
00:05:03.759 objects here which include things like
00:05:05.360 constants and Ruby objects on the stack
00:05:07.840 so we start marking from the roots and
00:05:09.759 the objects that have been marked but
00:05:11.520 not yet traversed are marked gray so
00:05:14.000 let's pick the gray object here and so
00:05:16.560 here let's pick the array uh and let's
00:05:18.720 traverse the children of the array and
00:05:20.400 we mark this array as black and we mark
00:05:23.120 the children of the array as gray which
00:05:25.520 is the object in this case the class
00:05:27.919 here is already marked gray so there's
00:05:29.840 nothing we need to do and then let's
00:05:31.680 pick the next gray object which is the
00:05:33.360 class and since this class refers to no
00:05:35.600 objects there's nothing we need to do
00:05:37.680 and now let's mark the gray object black
00:05:39.680 and mark its children which is the
00:05:41.680 string and then we mark the last gray
00:05:43.680 object black and since this object has
00:05:46.400 no uh children there's nothing we need
00:05:48.160 to mark and since we no longer have any
00:05:50.560 more gray objects uh this marking phase
00:05:53.120 has is now complete um and so after the
00:05:56.400 marking phase uh all of the black
00:05:58.160 objects are alive and all of the white
00:06:00.080 objects are dead and can be reclaimed by
00:06:02.000 the garbage collector so let's go back
00:06:04.319 to this example uh so for the marking
00:06:06.720 phase we scan the heap and we reclaim
00:06:09.600 any white objects and we have two here
00:06:12.000 so the garbage collector will also
00:06:13.759 reclaim any additional resources of the
00:06:15.680 object such as memory allocated from the
00:06:17.360 system or open file descriptors
00:06:19.600 compaction is an optional phase of
00:06:21.759 garbage collection that can be manually
00:06:23.600 enabled it moves objects around the heap
00:06:26.400 so that all of the objects are towards
00:06:28.000 the one end of the heap and this can
00:06:29.840 help us save memory because it prevents
00:06:31.680 issues known as internal fragmentation
00:06:34.560 which is where chunks of unused me uh
00:06:36.800 unused memory is interled with uh used
00:06:39.919 memory and prevents memory from being
00:06:41.840 freed which increases memory usage ruby
00:06:44.880 uses the two-finger algorithm for
00:06:46.720 compaction and let's see how that works
00:06:48.800 using the example heap and as the name
00:06:50.960 implies there are two fingers or cursors
00:06:53.680 so from the left we have the free cursor
00:06:56.560 this cursor moves from the left to the
00:06:58.400 right until the first empty slot so here
00:07:02.080 it moves to the right till the first
00:07:03.280 empty slot and then from the right we
00:07:04.800 have the scan cursor which points to the
00:07:07.120 first live object in the heap then we
00:07:10.479 swap the objects at these two locations
00:07:12.960 and we leave a temporary object at the
00:07:14.960 original location known as a forwarding
00:07:16.880 reference and we'll see how this is used
00:07:18.800 in the next phase of compaction we then
00:07:21.440 move the free cursor to the right onto
00:07:23.199 the next free slot and then the scan
00:07:25.039 cursor to the left to the next live
00:07:26.800 object but since the two cursors have
00:07:28.880 now met uh this phase of compaction is
00:07:31.039 complete we then enter the reference
00:07:33.199 updating phase uh and here we modify
00:07:36.319 references from objects that refer to a
00:07:38.319 t-moved object uh so let's add the
00:07:41.039 object references back we now linearly
00:07:44.080 scan every object and find objects that
00:07:46.319 refer to moved objects we can see that
00:07:48.880 this array here points to a troved
00:07:50.960 object and we can update this reference
00:07:53.120 to point to the new location for the
00:07:55.199 object and then we can continue scanning
00:07:57.440 the remaining objects and then uh once
00:08:00.160 we finish that we can reclaim all of the
00:08:02.560 t-moved
00:08:03.800 objects so that was a brief and
00:08:06.479 simplified introduction to how the mark
00:08:08.800 sweep compact garbage collector works in
00:08:11.120 Ruby there are many optimizations and
00:08:13.840 implementation details that were omitted
00:08:15.919 from this introduction so I have a blog
00:08:18.639 post here linked that describes how the
00:08:21.039 garbage collector works in much greater
00:08:23.120 detail and please check it out after the
00:08:25.280 talk if you're
00:08:26.520 interested so we already have a garbage
00:08:29.120 collector in Ruby that's been
00:08:30.319 continuously improved upon for the past
00:08:32.640 30 years it works and provides decent
00:08:35.440 performance so now let's take a look at
00:08:37.599 some of the features that Ruby that the
00:08:39.760 Ruby garbage collector is
00:08:41.640 missing so Ruby's garbage collector uses
00:08:44.320 the mark sweep compact algorithm which
00:08:46.160 is a fairly simple algorithm to
00:08:48.080 implement and understand but there are
00:08:50.399 many other garbage collection algorithms
00:08:52.320 that can provide higher performance so
00:08:55.440 one garbage collection algorithm that
00:08:57.279 are used that is used by some languages
00:08:58.959 such as Python is called reference
00:09:01.120 counting reference counting keeps track
00:09:03.440 of the references from one object to
00:09:05.600 another and so when it detects that an
00:09:07.760 object no longer has references to it it
00:09:09.920 is dead and can be reclaimed by the
00:09:11.440 garbage
00:09:12.360 collector we can almost use this in Ruby
00:09:15.519 however this requires that we can
00:09:17.600 accurately keep track of when references
00:09:20.000 from an object is added and removed this
00:09:23.120 is implemented in Ruby using a concept
00:09:25.120 called write barriers these are
00:09:27.120 callbacks to the garbage collector when
00:09:28.959 an object adds a a reference to another
00:09:32.040 object however the implementation of
00:09:34.720 rightbearers is incomplete in Ruby
00:09:37.120 because it was not designed with
00:09:38.560 reference counting in mind for example
00:09:41.200 there are many cases where removing a
00:09:43.519 reference from an object does not fire
00:09:45.519 the right barrier which would make
00:09:46.880 reference counting not work properly you
00:09:49.680 will notice this mo many times
00:09:51.440 throughout this talk by having the same
00:09:54.080 garbage collector algorithm for the past
00:09:56.480 30 years means that there are there are
00:09:58.880 ingrained assumptions about how the
00:10:00.640 garbage collector works in the Ruby VM
00:10:03.360 and this prevents flexibility in the
00:10:05.360 garbage collection algorithm that we can
00:10:07.519 use so right now we cannot use reference
00:10:10.240 counting in
00:10:11.959 Ruby semiece copying garbage collector
00:10:15.040 has two regions of memory which are
00:10:16.640 called the from space and the two space
00:10:18.640 we allocate in the from space until we
00:10:21.040 ran out of memory and then we run a
00:10:22.720 garbage collection cycle by moving all
00:10:24.959 of the live objects from the from space
00:10:26.720 to the two space and let's see a quick
00:10:28.880 example of how this works so we allocate
00:10:32.160 uh objects in the from space using a
00:10:33.760 bump pointer allocator which means that
00:10:36.079 we allocate an allocate objects from the
00:10:38.560 top and move the pointer forward and
00:10:40.720 then when we fill up the from space it
00:10:43.120 is time to run a garbage collection
00:10:44.560 cycle to free up uh space and uh we
00:10:48.079 let's mark the live objects as black
00:10:50.240 here we then move the live objects to
00:10:52.880 the twospace and then we can reclaim all
00:10:55.360 of the memory from the from space since
00:10:58.079 we know that all of the objects in the
00:10:59.760 from space are dead we then swap the
00:11:02.480 from space and the two space which
00:11:04.000 guarantees that we always allocate
00:11:05.519 objects into the from space and so it
00:11:08.160 should be clear what the advantages of
00:11:10.399 this algorithm is so every garbage
00:11:13.120 collection cycle defragments the heap uh
00:11:15.920 so it has compaction built in however
00:11:19.040 this ends up using twice the amount of
00:11:21.200 memory and this is definitely a case
00:11:23.200 where we trade performance for higher
00:11:24.959 memory usage however again we cannot use
00:11:28.000 this algorithm in Ruby because Ruby
00:11:30.480 requires that the GC algorithm supports
00:11:32.560 pinning which is where uh certain
00:11:34.720 objects cannot be moved but semi-pace
00:11:37.279 copying requires that all live objects
00:11:39.519 must be able to move so we cannot use
00:11:41.839 ref uh semi-pace copying right now
00:11:45.000 imix is a garbage collection algorithm
00:11:47.519 that combines ideas from mark and sweep
00:11:49.519 and semi-space copying imix splits the
00:11:52.000 heap into many blocks and using
00:11:54.160 heristics it determines which blocks are
00:11:56.240 most likely to be fragmented then during
00:11:58.480 marking it moves live objects in blocks
00:12:00.720 that it determines to be the most
00:12:02.040 fragmented this gives us the benefits of
00:12:04.399 semispace copying because it moves
00:12:06.320 objects during marking and allows for a
00:12:08.160 pinning support because not uh because
00:12:10.240 not all objects are required to move so
00:12:12.480 we can use this algorithm in Ruby
00:12:15.440 another feature missing from Ruby's
00:12:17.200 garbage collector is parallelism as we
00:12:20.240 probably know uh with the exception of
00:12:22.480 reactors Ruby does not support
00:12:24.600 parallelism so multiple threads in Ruby
00:12:27.200 don't run at the same time this
00:12:29.040 simplifies the implementation and can
00:12:30.800 prevent race conditions however the CPUs
00:12:33.600 in modern machines have uh multiple CPU
00:12:36.320 cores and we are not taking advantage of
00:12:38.399 that also if you were in Kohici Sasada's
00:12:41.279 talk earlier today about Ractor local GC
00:12:44.560 you might know that the GC is currently
00:12:46.639 shared between ractors and um since and
00:12:50.880 and because of that uh rack uh when we
00:12:53.519 run a garbage collection cycle we need
00:12:55.200 to pause all of the ractors so it ends
00:12:57.040 up becoming a bottleneck so by making
00:12:59.600 the garbage collector parallel we can
00:13:01.600 improve performance for Ruby
00:13:03.120 applications and make it less of a
00:13:04.959 bottleneck for ractors
00:13:07.279 for the longest time every object in
00:13:09.279 Ruby was exactly 40 bytes in size any
00:13:11.920 additional memory was allocated through
00:13:14.320 the system using Maloc this was done for
00:13:16.800 simplicity because it avoided internal
00:13:18.880 fragmentation however it was inefficient
00:13:21.360 for memory usage uh because uh there was
00:13:24.000 additional space required to store
00:13:25.600 pointers to other regions of memory and
00:13:27.680 it was also bad for performance because
00:13:29.839 it required more memory reads to alloc
00:13:31.839 to uh to to uh read to access the data
00:13:34.959 for the particular object
00:13:37.360 so in Ruby 3.2 I co-authored variable
00:13:39.839 with allocation which allowed the
00:13:41.519 garbage collector to allocate dynamic
00:13:43.760 slot sizes however due to technical
00:13:46.240 limitations we are limited to five slot
00:13:48.560 sizes which are powers of two multiples
00:13:50.959 of 40 bytes which this means that the
00:13:53.360 slot sizes are 40 bytes 80 bytes 160
00:13:56.399 bytes 320 bytes and 640 bytes any
00:13:59.360 objects that require more than 640 bytes
00:14:01.360 in space would have that additional uh
00:14:03.760 memory uh allocated from the system via
00:14:06.040 maloc uh through this feature we saw 2
00:14:08.639 to 10 perfor% performance improvements
00:14:10.800 in various
00:14:12.519 benchmarks other garbage collectors are
00:14:14.959 able to implement truly dynamic slot
00:14:17.040 sizes without the slot size limitation
00:14:18.880 that variable width allocation has so
00:14:21.440 then this will allow us to be able to
00:14:23.360 further improve per performance and
00:14:25.279 decrease memory usage
00:14:28.000 so now we've seen some of the reasons
00:14:30.720 why using a different garbage collector
00:14:32.880 implementation could improve performance
00:14:34.720 for Ruby so you might be asking why we
00:14:37.120 don't just rewrite Ruby's garbage
00:14:39.000 collector so we chose to not replace
00:14:41.839 Ruby's current garbage collector for
00:14:43.600 several reasons all of the garbage
00:14:45.920 collectors are a trade-off between
00:14:47.600 performance memory usage and complexity
00:14:50.560 ruby's mark and sweep garbage collector
00:14:52.720 is great at reducing memory usage and
00:14:55.040 has relatively low complexity ruby
00:14:57.920 programs range from small scripts that
00:14:59.680 take milliseconds to run to web
00:15:01.600 application that serve a huge amount of
00:15:03.600 requests so there's really no garbage
00:15:05.760 collector that is one
00:15:07.639 sizefits-all ruby's current garbage
00:15:09.600 collector is also very battle tested and
00:15:11.760 provides a good balance of performance
00:15:13.839 memory usage and stability so we don't
00:15:16.399 want to replace it and also if we did
00:15:19.040 replace the Ruby garbage collector and
00:15:21.199 then in the future we wanted to switch
00:15:23.120 to a even higher performing garbage
00:15:25.199 collector we would need to repeat this
00:15:27.120 process all over
00:15:28.839 again so to solve this problem we are
00:15:31.519 working on modular garbage collectors
00:15:33.680 this allows the user to switch garbage
00:15:35.920 collector implementations depending on
00:15:37.760 their requirements and workload this was
00:15:40.320 released as an experimental feature in
00:15:42.639 Ruby 3.4 for and implements an API to
00:15:45.680 implement alternate garbage collectors
00:15:47.839 and a mechanism to load them into Ruby
00:15:50.880 this feature was introduced in ticket
00:15:53.000 20470 and you can read it if you're
00:15:55.120 interested in more details the first
00:15:57.839 part of this feature involved splitting
00:15:59.680 Ruby's garbage collector into two parts
00:16:02.160 previously Ruby's garbage collector
00:16:03.839 lived in a file called GC c which was
00:16:06.000 compiled into the Ruby binary so we
00:16:08.959 first split the this file into two the
00:16:11.279 existing GC.c C file no longer contains
00:16:13.839 any garbage collector implementation
00:16:16.079 rather it only interfaces the Ruby VM
00:16:18.959 for the GC and so this does things like
00:16:22.240 start and stop reactors for running
00:16:24.320 garbage collection cycles this new file
00:16:26.959 which is under the garb under the GC
00:16:28.959 directory called default C contains the
00:16:31.600 actual implementation of the mark and
00:16:34.079 sweep GC that is in Ruby this allows us
00:16:37.519 to build the default GC separately from
00:16:40.240 the Ruby binary and then the GC uh
00:16:43.120 implementation is loaded into Ruby as a
00:16:45.519 shared library similar to how native
00:16:47.839 extensions are loaded and then the
00:16:49.920 communication between the default GC and
00:16:52.000 Ruby is done through the modular GC
00:16:54.680 API the GC implementation communicates
00:16:57.440 with Ruby through the GC API this API
00:17:00.240 contains two parts the API that the GC
00:17:02.560 implementation is required to implement
00:17:04.880 and the API that the GC implementation
00:17:06.959 is allowed to use in
00:17:08.679 Ruby this header files contains the
00:17:11.120 function prototypes for all of the
00:17:12.720 functions that the GC implementation is
00:17:14.559 required to implement this file lives
00:17:16.799 under
00:17:18.280 GC/GCIPOLE.h and there are about four
00:17:20.480 about 60 functions split across 14
00:17:22.559 sections and let's take a look at some
00:17:24.400 of these in greater detail so the first
00:17:26.559 section are the bootup functions these
00:17:28.400 are ran a boot to allocate data
00:17:30.000 structures for the garbage collector and
00:17:31.600 do and perform setup there are many
00:17:33.840 functions here because there are various
00:17:35.440 bootup phases in Ruby so we have object
00:17:38.000 space allocation which allocates the
00:17:39.919 data structures for the garbage
00:17:41.200 collector and this is ran before the VM
00:17:43.600 has booted because we need a functional
00:17:45.679 garbage collector for the VM to boot uh
00:17:48.400 we also have functions to allocate data
00:17:50.320 structures for ractors and another
00:17:52.320 initializer function that is ran after
00:17:54.480 the Ruby VM has booted which allows the
00:17:56.799 GC to define methods and do other
00:17:59.120 operations that require a functional VM
00:18:01.280 to work we also have shutdown functions
00:18:03.919 which uh performs cleanup and freeze
00:18:06.080 data structures in the GC we also have
00:18:08.640 functions involving uh running a GC
00:18:11.200 cycle such as manually running a GC uh
00:18:14.000 disabling and enabling the GC and
00:18:16.320 setting configuration we have object
00:18:18.400 allocation which asks asks the GC for an
00:18:21.280 object of a specific size and then the
00:18:23.360 GC returns a pointer to the object that
00:18:25.360 was allocated and we have functions for
00:18:27.440 the GC to mark an object as as alive and
00:18:30.080 to pin an object to ensure that it does
00:18:31.760 not move and we also have write barriers
00:18:33.919 which are callbacks so that the GC know
00:18:35.760 when an object has a new reference to
00:18:37.919 another
00:18:38.919 object so that was the API that the GC
00:18:41.679 implementation is required to implement
00:18:44.080 so to ensure that the GC implementation
00:18:46.160 has abstraction from Ruby internals the
00:18:49.039 GC implementation is only required is
00:18:51.679 only allowed to use the public uh Ruby C
00:18:54.799 API and this list of functions this is
00:18:58.080 defined in
00:18:59.799 GC/GC.h this has functions that are not
00:19:02.320 exposed through the C API uh that the GC
00:19:05.120 implementation may want to use and let's
00:19:07.280 take a look at some of these in greater
00:19:08.640 detail
00:19:10.160 so we have functions for locking and
00:19:12.160 unlocking the Ruby VM as well as
00:19:14.080 functions to stop reactors this is
00:19:16.080 needed to run garbage collection cycles
00:19:18.000 because right now reactors cannot run
00:19:20.160 while a garbage collector is being run
00:19:23.039 uh we have functions to mark the
00:19:24.880 children the the child objects of a
00:19:26.880 particular object remember the GC
00:19:29.280 implementation is responsible for doing
00:19:31.200 the marking but the GC implementation is
00:19:33.760 not aware of how Ruby objects work
00:19:36.400 internally and so it is the
00:19:38.400 responsibility of the Ruby VM to walk
00:19:41.120 the references of an object so this
00:19:43.600 allows us to change uh how the data
00:19:45.919 structures of Ruby objects work without
00:19:48.559 needing to change the GC implementation
00:19:51.240 itself we also have a function that is
00:19:53.760 called after the GC has determined that
00:19:55.919 an object is dead and this is called
00:19:57.840 before the object is recycled by the GC
00:20:00.559 so then Ruby can perform cleanup such as
00:20:02.559 freeing memory or closing file
00:20:03.919 descriptors so we've now seen a brief
00:20:06.720 overview of how the modular GC API works
00:20:09.440 to build it you can follow the guide in
00:20:11.440 the linked readme here's what it looks
00:20:13.840 like it explains the steps to build Ruby
00:20:15.760 with modular GC enabled how to use the
00:20:18.320 two GC implementations that how to build
00:20:20.640 the two implementations that ship with
00:20:22.400 Ruby and how to use them uh the two
00:20:25.360 implementations that ship with Ruby is
00:20:27.760 the default GC and MMTK the default GC
00:20:31.679 is the GC that comes with Ruby but you
00:20:34.159 can modify it build it separately and
00:20:35.919 then load it into Ruby mmtk stands for
00:20:38.880 the uh memory management toolkit and
00:20:40.960 I'll talk about that in the next
00:20:42.840 section so this is part of the modular
00:20:45.440 GC work we are working on the first user
00:20:48.080 of the modular GC API that is not the
00:20:50.720 default GC the first user is MMTK which
00:20:54.320 stands for memory management toolkit
00:20:56.640 it's a language agnostic framework of
00:20:59.200 garbage collectors it is written in Rust
00:21:01.520 and is developed by the Australian
00:21:03.280 National University mmtk contains a
00:21:06.480 collection of garbage collection
00:21:08.000 algorithms and once a language plugs
00:21:10.159 into MMTK we can use a wide variety of
00:21:13.280 GC algorithms
00:21:15.440 mmtk is currently integrated with many
00:21:17.760 languages including uh Java on the Jes
00:21:20.400 VM and open JDK Julia V8 JavaScript
00:21:24.000 runtime and of course Ruby so let's take
00:21:27.520 a look at some of the features that MMTK
00:21:29.960 provides so as I mentioned before MMTK
00:21:33.360 provides a wide variety of GC algorithms
00:21:35.600 which it calls plans depending on how
00:21:38.400 much of the MMTK binding that you've
00:21:40.240 implemented you may only be able to use
00:21:42.159 some of these plans so first there's no
00:21:44.960 GC which as the name implies uh only
00:21:47.120 allocates objects and never reclaims
00:21:49.039 them this is useful for testing that
00:21:51.039 object allocation works without needing
00:21:52.880 to implement any garbage
00:21:55.000 collection mmtk also has the uh mark and
00:21:58.159 sweep garbage collector which is similar
00:21:59.760 to how the one in Ruby works and this is
00:22:02.000 the simplest garbage collector that
00:22:03.679 collects garbage but doesn't move any
00:22:05.600 objects we have mark and compact which
00:22:07.919 is similar to Ruby's garbage collector
00:22:09.679 with uh compaction enabled and moves
00:22:12.320 objects around to reduce fragmentation
00:22:14.880 we've also briefly discussed semi-space
00:22:16.880 copying before and MMTK also has this
00:22:19.440 but but it also doesn't support pinning
00:22:21.600 because uh all of the objects must be
00:22:23.360 able to move uh immix was also briefly
00:22:26.799 discussed earlier this is a higher
00:22:28.400 performing garbage collector that
00:22:29.840 supports pinning and there is a work
00:22:32.000 inprogress implementation of LXR in MMTK
00:22:35.360 this is currently only integrated with
00:22:36.960 the open JDK and LXR stands for latency
00:22:40.400 critical image with reference counting
00:22:42.480 it uses reference counting for
00:22:44.240 performance on reclaiming data objects
00:22:46.720 and IMIX for object movement to reduce
00:22:49.480 fragmentation the team saw significant
00:22:52.320 uh performance improvements when using
00:22:54.080 LXR over the default garbage first uh
00:22:56.960 garbage collector in the open JDK
00:23:00.240 mmtk natively supports uh parallelism by
00:23:03.440 using a concept called work packets each
00:23:06.400 work packet contains isolated work that
00:23:08.960 can be done on a different thread and so
00:23:11.280 then multiple threads can pick up
00:23:12.720 different work packets and then perform
00:23:14.559 tasks in parallel so there are currently
00:23:17.280 two implementations of MMTk for Ruby so
00:23:20.960 the original implementation of MMTk in
00:23:23.280 Ruby was was implemented by the MMTK
00:23:25.919 team this work started in 2020 and
00:23:28.400 continues to this day they use a fork of
00:23:31.039 Ruby and has implemented novel ideas
00:23:33.360 such as sidecar objects to reduce the
00:23:35.520 amount of memory allocated from the
00:23:37.120 system the concept of potentially
00:23:39.200 pinning parents to increase performance
00:23:41.440 for object movement and combining
00:23:43.440 marking an object movement into into a
00:23:45.760 single phase
00:23:47.679 with the introduction of modular GC
00:23:49.679 we're porting over work from the MMTK
00:23:52.000 team's implementation and implementing
00:23:53.919 it on top of modular GC we've so far
00:23:56.960 we've implemented no GC mark and sweep
00:23:59.360 and non-moving imix on modular GC so the
00:24:03.200 source code for all of this lives in the
00:24:05.320 Ruby/MMTK repository and uh please check
00:24:08.159 it out if you're interested so let's see
00:24:10.640 at a high level how MMTk is plugged into
00:24:13.360 Ruby using the modular GC API so inside
00:24:17.279 of the MMTK bindings we have a layer
00:24:20.080 that is written in C uh that interacts
00:24:22.720 with the modular GC API in Ruby uh the C
00:24:26.000 bindings here uh provide an extension of
00:24:28.720 Ruby for MMTK and and provides
00:24:31.200 implementation for things such as uh
00:24:33.679 stopping the Ruby VM marking an object
00:24:36.000 or freeing a dead
00:24:37.640 object the C bindings in MMTK then
00:24:40.480 interact with bindings written in Rust
00:24:43.039 and then the rust bindings interact with
00:24:44.799 MMTK core mmtk core here is installed as
00:24:48.000 a cargo dependency and uh MMTK core is
00:24:51.600 written in Rust mmtk core contains the
00:24:54.960 actual underlying implementation of the
00:24:57.200 garbage collector but the Rust bindings
00:24:59.760 in the Ruby/MMTK repository provides
00:25:02.799 extensions to MMTK core for Ruby and
00:25:06.080 implements things like defining work
00:25:07.919 packets for marking and freeing objects
00:25:11.760 so so so we've only worked on the
00:25:13.440 modular GC version of MMTK for less than
00:25:16.240 a year we are still currently far behind
00:25:19.200 the MMTK team's implementation and
00:25:21.360 because of this we are much slower than
00:25:23.120 their implementation and we are also
00:25:25.200 quite a bit slower than the Ruby's
00:25:27.039 default GC as well and um here are some
00:25:31.200 of the features that we are working on
00:25:33.200 that will significantly improve
00:25:34.799 performance so right now we only support
00:25:37.840 non-moving plans such as mark and sweep
00:25:39.840 and non-moving imix so we're adding
00:25:42.159 support for moving plans that will allow
00:25:44.159 imix to move objects which will improve
00:25:46.320 performance and reduce memory usage so
00:25:49.440 one detail I talked I did not talk about
00:25:52.159 the Ruby garbage collector uh is that it
00:25:54.480 is generational which means that objects
00:25:56.880 are young when they are first allocated
00:25:58.480 and become old when they survive for a
00:26:00.480 long time this speeds up the garbage
00:26:02.480 collector because we can run a garbage
00:26:04.080 collector that only scans young objects
00:26:06.000 since they are most likely to be dead
00:26:08.400 mmtk supports generational garbage
00:26:10.559 collector through plants such as sticky
00:26:12.640 imix which implements a sticky mark bit
00:26:15.120 to to only scan newly allocated objects
00:26:17.600 in a garbage collection
00:26:19.400 cycle mmtk operates on units of work
00:26:22.640 packets so the more that we can split
00:26:24.799 the work into smaller packets the more
00:26:26.880 mmtk can take advantage of parallelism
00:26:29.120 and thus improve performance
00:26:31.440 since Ruby's garbage collector has been
00:26:33.360 single threaded up until now there has
00:26:35.679 not been any work on making various
00:26:37.440 phases of the Ruby uh of the garbage
00:26:40.159 collection uh thread safes for example
00:26:43.440 you might expect that we can free two
00:26:45.440 different dead objects on two different
00:26:47.200 threads but uh the two different objects
00:26:50.720 may be accessing common data structures
00:26:53.120 and thus have race conditions
00:26:56.480 so in conclusion uh today we took a look
00:26:59.200 at how the Ruby garbage collector works
00:27:01.600 the motivation and implementation of
00:27:03.600 modular garbage collectors and the MMTK
00:27:06.640 integration with Ruby if you're
00:27:09.200 interested in building your own garbage
00:27:11.039 collector you can do that today with
00:27:12.880 modular GC uh note that this API is
00:27:16.400 experimental and subject to change so
00:27:19.360 don't use it in production just yet i'm
00:27:22.000 looking forward to seeing what you can
00:27:23.919 build you can find a copy of these
00:27:26.480 slides at this QR code if you have any
00:27:28.960 questions feel free to ask me after this
00:27:31.200 talk or through email or through social
00:27:33.120 media thank you for coming to my talk
Explore all talks recorded at RubyKaigi 2025
+66