00:00:04.400
hello Ruby Kai my name is Peter and I'm
00:00:08.480
on the Ruby core team and I'm a senior
00:00:10.719
developer at Shopify today I'll be
00:00:13.360
talking about modular garbage collectors
00:00:15.679
which is a experimental feature in Ruby
00:00:19.000
3.4 you can get a copy of these slides
00:00:22.000
by scanning this QR code or by following
00:00:24.400
this URL don't worry if you end up
00:00:26.640
missing this QR code because I'll also
00:00:28.480
show this QR code at the end of this
00:00:30.920
presentation so a little bit about me
00:00:33.280
first my name is Peter i'm currently
00:00:35.280
based in Toronto Canada i'm on the Ruby
00:00:38.239
core team and I'm a senior developer at
00:00:40.320
Shopify on the Ruby infrastructure team
00:00:43.040
where I work on performance and memory
00:00:44.800
management in Ruby i'm the co-author of
00:00:47.600
the variable with allocation in Ruby
00:00:49.840
which improves performance and memory
00:00:51.520
efficiency of the Ruby garbage collector
00:00:54.640
i'm also the co-author of the Ruby free
00:00:57.440
at exit feature which uh frees memory at
00:01:00.320
shutdown to allow the use of memory leak
00:01:02.559
checkers such as valgrind or Mac OS
00:01:04.640
leaks tool i'm also um the author of the
00:01:08.880
Ruby mem and autotuner gems and in my
00:01:11.360
free time I like to travel and take
00:01:13.280
photos and I post them on Instagram at
00:01:16.840
peterzoo.phos so I'm going to give a
00:01:19.280
brief overview of this talk first so
00:01:21.759
first I'll be talking a little bit about
00:01:23.600
Ruby's mark sweep compact garbage
00:01:25.759
collector then I'll be talking about the
00:01:27.920
missing features from this garbage
00:01:29.759
collector and the motivation of
00:01:31.360
introducing alternative garbage
00:01:33.119
collector
00:01:34.119
implementations next we'll be taking a
00:01:36.640
look at the modular garbage collector
00:01:38.320
feature in Ruby 3.4 which allows us to
00:01:40.799
use alternate garbage collector
00:01:42.479
implementations inside of Ruby we'll be
00:01:45.520
taking a look at how the garbage
00:01:47.520
collector communicates with the Ruby
00:01:49.040
virtual machine using the modular GC uh
00:01:51.520
API and briefly look at how to build
00:01:54.240
Ruby with modular GC enabled and then uh
00:01:57.439
we'll take a look at how to build the
00:01:59.040
two garbage collector implementations
00:02:00.960
that ship with
00:02:02.680
Ruby and finally we'll be taking a look
00:02:05.200
at MMTK which stands for the memory
00:02:07.520
management toolkit this is the first
00:02:09.759
user of the modular GC that is not the
00:02:12.640
built-in GC in Ruby we'll look at the uh
00:02:16.800
the features that MMTK supports and the
00:02:19.520
implementation of MMTK through the
00:02:21.920
modular GC API and then we'll take a
00:02:24.959
look at the future road map of MMTK and
00:02:27.200
modular GC let's first take a a quick
00:02:31.040
look at what a garbage collector is so
00:02:33.840
at a high level the garbage collector is
00:02:35.760
responsible for three things it first
00:02:38.000
performs object allocation when you
00:02:40.000
create a new object for example a hash
00:02:42.480
or an array or a class uh Ruby requests
00:02:45.519
a piece of memory from the garbage
00:02:47.280
collector and then during the allocation
00:02:49.840
of the object the garbage collector can
00:02:52.000
choose to run a garbage collection cycle
00:02:54.160
to free up some of that memory so it is
00:02:56.959
now responsible for determining which
00:02:58.800
objects are alive and which objects are
00:03:00.720
dead there are many techniques for doing
00:03:03.040
this and Ruby does this using tracing uh
00:03:05.760
by tracing object references from root
00:03:08.440
objects root objects are things such as
00:03:11.200
global variables and constants inside of
00:03:13.360
Ruby after determining the liveless of
00:03:16.319
objects uh the garbage collector
00:03:18.480
reclaims objects that have died and this
00:03:21.120
includes reclaiming resources of objects
00:03:24.360
uh resources of the object such as
00:03:27.440
memory allocated from the system closing
00:03:29.760
file descriptors and running finalizers
00:03:32.400
and once uh the the object has been
00:03:34.879
reclaimed by the garbage collector this
00:03:36.560
memory can be reused by the garbage
00:03:38.239
collector for a new
00:03:39.959
object so to allocate object Ruby uses a
00:03:43.480
freelocator and uh how the free list
00:03:46.080
allocator works is very simple it's just
00:03:48.000
a link list of empty memory that we can
00:03:50.400
allocate into for example here we have a
00:03:53.280
Ruby heap with space for six objects we
00:03:55.599
have three objects that are allocated
00:03:57.200
here which are of type string array and
00:04:00.000
regular expression and we have three
00:04:01.920
empty slots here that we can allocate
00:04:03.519
objects into for the free list allocator
00:04:06.000
we build a linked list of slots just
00:04:08.319
like this and then when we need to
00:04:10.319
allocate an object and uh it it
00:04:13.200
allocates into the head of the free list
00:04:14.959
and we move the free list to the next
00:04:16.720
element and then we can repeat this
00:04:18.880
until all of the slots have been filled
00:04:22.199
up and then once all the slots have been
00:04:24.639
filled up what do we do we run a garbage
00:04:26.880
collection cycle to reclaim any dead
00:04:29.440
objects ruby uses the mark sweep compact
00:04:32.320
garbage collector which as the name
00:04:33.919
implies has three phases first we have
00:04:36.240
the mark phase which determines which
00:04:37.919
objects are alive the sweep phase which
00:04:40.160
determines of which reclaims all of the
00:04:41.840
dead objects and the optional compact
00:04:43.759
phase which moves objects so we don't
00:04:45.840
have holes or fragmentation in the
00:04:48.120
memory so now I'll show the marking
00:04:50.479
phase using an example so here uh we
00:04:53.280
have an example heap uh from the
00:04:55.360
previous example and let's add object
00:04:57.759
references here where arrows indicate
00:04:59.840
that a particular object refers to
00:05:01.520
another object and we have the root
00:05:03.759
objects here which include things like
00:05:05.360
constants and Ruby objects on the stack
00:05:07.840
so we start marking from the roots and
00:05:09.759
the objects that have been marked but
00:05:11.520
not yet traversed are marked gray so
00:05:14.000
let's pick the gray object here and so
00:05:16.560
here let's pick the array uh and let's
00:05:18.720
traverse the children of the array and
00:05:20.400
we mark this array as black and we mark
00:05:23.120
the children of the array as gray which
00:05:25.520
is the object in this case the class
00:05:27.919
here is already marked gray so there's
00:05:29.840
nothing we need to do and then let's
00:05:31.680
pick the next gray object which is the
00:05:33.360
class and since this class refers to no
00:05:35.600
objects there's nothing we need to do
00:05:37.680
and now let's mark the gray object black
00:05:39.680
and mark its children which is the
00:05:41.680
string and then we mark the last gray
00:05:43.680
object black and since this object has
00:05:46.400
no uh children there's nothing we need
00:05:48.160
to mark and since we no longer have any
00:05:50.560
more gray objects uh this marking phase
00:05:53.120
has is now complete um and so after the
00:05:56.400
marking phase uh all of the black
00:05:58.160
objects are alive and all of the white
00:06:00.080
objects are dead and can be reclaimed by
00:06:02.000
the garbage collector so let's go back
00:06:04.319
to this example uh so for the marking
00:06:06.720
phase we scan the heap and we reclaim
00:06:09.600
any white objects and we have two here
00:06:12.000
so the garbage collector will also
00:06:13.759
reclaim any additional resources of the
00:06:15.680
object such as memory allocated from the
00:06:17.360
system or open file descriptors
00:06:19.600
compaction is an optional phase of
00:06:21.759
garbage collection that can be manually
00:06:23.600
enabled it moves objects around the heap
00:06:26.400
so that all of the objects are towards
00:06:28.000
the one end of the heap and this can
00:06:29.840
help us save memory because it prevents
00:06:31.680
issues known as internal fragmentation
00:06:34.560
which is where chunks of unused me uh
00:06:36.800
unused memory is interled with uh used
00:06:39.919
memory and prevents memory from being
00:06:41.840
freed which increases memory usage ruby
00:06:44.880
uses the two-finger algorithm for
00:06:46.720
compaction and let's see how that works
00:06:48.800
using the example heap and as the name
00:06:50.960
implies there are two fingers or cursors
00:06:53.680
so from the left we have the free cursor
00:06:56.560
this cursor moves from the left to the
00:06:58.400
right until the first empty slot so here
00:07:02.080
it moves to the right till the first
00:07:03.280
empty slot and then from the right we
00:07:04.800
have the scan cursor which points to the
00:07:07.120
first live object in the heap then we
00:07:10.479
swap the objects at these two locations
00:07:12.960
and we leave a temporary object at the
00:07:14.960
original location known as a forwarding
00:07:16.880
reference and we'll see how this is used
00:07:18.800
in the next phase of compaction we then
00:07:21.440
move the free cursor to the right onto
00:07:23.199
the next free slot and then the scan
00:07:25.039
cursor to the left to the next live
00:07:26.800
object but since the two cursors have
00:07:28.880
now met uh this phase of compaction is
00:07:31.039
complete we then enter the reference
00:07:33.199
updating phase uh and here we modify
00:07:36.319
references from objects that refer to a
00:07:38.319
t-moved object uh so let's add the
00:07:41.039
object references back we now linearly
00:07:44.080
scan every object and find objects that
00:07:46.319
refer to moved objects we can see that
00:07:48.880
this array here points to a troved
00:07:50.960
object and we can update this reference
00:07:53.120
to point to the new location for the
00:07:55.199
object and then we can continue scanning
00:07:57.440
the remaining objects and then uh once
00:08:00.160
we finish that we can reclaim all of the
00:08:02.560
t-moved
00:08:03.800
objects so that was a brief and
00:08:06.479
simplified introduction to how the mark
00:08:08.800
sweep compact garbage collector works in
00:08:11.120
Ruby there are many optimizations and
00:08:13.840
implementation details that were omitted
00:08:15.919
from this introduction so I have a blog
00:08:18.639
post here linked that describes how the
00:08:21.039
garbage collector works in much greater
00:08:23.120
detail and please check it out after the
00:08:25.280
talk if you're
00:08:26.520
interested so we already have a garbage
00:08:29.120
collector in Ruby that's been
00:08:30.319
continuously improved upon for the past
00:08:32.640
30 years it works and provides decent
00:08:35.440
performance so now let's take a look at
00:08:37.599
some of the features that Ruby that the
00:08:39.760
Ruby garbage collector is
00:08:41.640
missing so Ruby's garbage collector uses
00:08:44.320
the mark sweep compact algorithm which
00:08:46.160
is a fairly simple algorithm to
00:08:48.080
implement and understand but there are
00:08:50.399
many other garbage collection algorithms
00:08:52.320
that can provide higher performance so
00:08:55.440
one garbage collection algorithm that
00:08:57.279
are used that is used by some languages
00:08:58.959
such as Python is called reference
00:09:01.120
counting reference counting keeps track
00:09:03.440
of the references from one object to
00:09:05.600
another and so when it detects that an
00:09:07.760
object no longer has references to it it
00:09:09.920
is dead and can be reclaimed by the
00:09:11.440
garbage
00:09:12.360
collector we can almost use this in Ruby
00:09:15.519
however this requires that we can
00:09:17.600
accurately keep track of when references
00:09:20.000
from an object is added and removed this
00:09:23.120
is implemented in Ruby using a concept
00:09:25.120
called write barriers these are
00:09:27.120
callbacks to the garbage collector when
00:09:28.959
an object adds a a reference to another
00:09:32.040
object however the implementation of
00:09:34.720
rightbearers is incomplete in Ruby
00:09:37.120
because it was not designed with
00:09:38.560
reference counting in mind for example
00:09:41.200
there are many cases where removing a
00:09:43.519
reference from an object does not fire
00:09:45.519
the right barrier which would make
00:09:46.880
reference counting not work properly you
00:09:49.680
will notice this mo many times
00:09:51.440
throughout this talk by having the same
00:09:54.080
garbage collector algorithm for the past
00:09:56.480
30 years means that there are there are
00:09:58.880
ingrained assumptions about how the
00:10:00.640
garbage collector works in the Ruby VM
00:10:03.360
and this prevents flexibility in the
00:10:05.360
garbage collection algorithm that we can
00:10:07.519
use so right now we cannot use reference
00:10:10.240
counting in
00:10:11.959
Ruby semiece copying garbage collector
00:10:15.040
has two regions of memory which are
00:10:16.640
called the from space and the two space
00:10:18.640
we allocate in the from space until we
00:10:21.040
ran out of memory and then we run a
00:10:22.720
garbage collection cycle by moving all
00:10:24.959
of the live objects from the from space
00:10:26.720
to the two space and let's see a quick
00:10:28.880
example of how this works so we allocate
00:10:32.160
uh objects in the from space using a
00:10:33.760
bump pointer allocator which means that
00:10:36.079
we allocate an allocate objects from the
00:10:38.560
top and move the pointer forward and
00:10:40.720
then when we fill up the from space it
00:10:43.120
is time to run a garbage collection
00:10:44.560
cycle to free up uh space and uh we
00:10:48.079
let's mark the live objects as black
00:10:50.240
here we then move the live objects to
00:10:52.880
the twospace and then we can reclaim all
00:10:55.360
of the memory from the from space since
00:10:58.079
we know that all of the objects in the
00:10:59.760
from space are dead we then swap the
00:11:02.480
from space and the two space which
00:11:04.000
guarantees that we always allocate
00:11:05.519
objects into the from space and so it
00:11:08.160
should be clear what the advantages of
00:11:10.399
this algorithm is so every garbage
00:11:13.120
collection cycle defragments the heap uh
00:11:15.920
so it has compaction built in however
00:11:19.040
this ends up using twice the amount of
00:11:21.200
memory and this is definitely a case
00:11:23.200
where we trade performance for higher
00:11:24.959
memory usage however again we cannot use
00:11:28.000
this algorithm in Ruby because Ruby
00:11:30.480
requires that the GC algorithm supports
00:11:32.560
pinning which is where uh certain
00:11:34.720
objects cannot be moved but semi-pace
00:11:37.279
copying requires that all live objects
00:11:39.519
must be able to move so we cannot use
00:11:41.839
ref uh semi-pace copying right now
00:11:45.000
imix is a garbage collection algorithm
00:11:47.519
that combines ideas from mark and sweep
00:11:49.519
and semi-space copying imix splits the
00:11:52.000
heap into many blocks and using
00:11:54.160
heristics it determines which blocks are
00:11:56.240
most likely to be fragmented then during
00:11:58.480
marking it moves live objects in blocks
00:12:00.720
that it determines to be the most
00:12:02.040
fragmented this gives us the benefits of
00:12:04.399
semispace copying because it moves
00:12:06.320
objects during marking and allows for a
00:12:08.160
pinning support because not uh because
00:12:10.240
not all objects are required to move so
00:12:12.480
we can use this algorithm in Ruby
00:12:15.440
another feature missing from Ruby's
00:12:17.200
garbage collector is parallelism as we
00:12:20.240
probably know uh with the exception of
00:12:22.480
reactors Ruby does not support
00:12:24.600
parallelism so multiple threads in Ruby
00:12:27.200
don't run at the same time this
00:12:29.040
simplifies the implementation and can
00:12:30.800
prevent race conditions however the CPUs
00:12:33.600
in modern machines have uh multiple CPU
00:12:36.320
cores and we are not taking advantage of
00:12:38.399
that also if you were in Kohici Sasada's
00:12:41.279
talk earlier today about Ractor local GC
00:12:44.560
you might know that the GC is currently
00:12:46.639
shared between ractors and um since and
00:12:50.880
and because of that uh rack uh when we
00:12:53.519
run a garbage collection cycle we need
00:12:55.200
to pause all of the ractors so it ends
00:12:57.040
up becoming a bottleneck so by making
00:12:59.600
the garbage collector parallel we can
00:13:01.600
improve performance for Ruby
00:13:03.120
applications and make it less of a
00:13:04.959
bottleneck for ractors
00:13:07.279
for the longest time every object in
00:13:09.279
Ruby was exactly 40 bytes in size any
00:13:11.920
additional memory was allocated through
00:13:14.320
the system using Maloc this was done for
00:13:16.800
simplicity because it avoided internal
00:13:18.880
fragmentation however it was inefficient
00:13:21.360
for memory usage uh because uh there was
00:13:24.000
additional space required to store
00:13:25.600
pointers to other regions of memory and
00:13:27.680
it was also bad for performance because
00:13:29.839
it required more memory reads to alloc
00:13:31.839
to uh to to uh read to access the data
00:13:34.959
for the particular object
00:13:37.360
so in Ruby 3.2 I co-authored variable
00:13:39.839
with allocation which allowed the
00:13:41.519
garbage collector to allocate dynamic
00:13:43.760
slot sizes however due to technical
00:13:46.240
limitations we are limited to five slot
00:13:48.560
sizes which are powers of two multiples
00:13:50.959
of 40 bytes which this means that the
00:13:53.360
slot sizes are 40 bytes 80 bytes 160
00:13:56.399
bytes 320 bytes and 640 bytes any
00:13:59.360
objects that require more than 640 bytes
00:14:01.360
in space would have that additional uh
00:14:03.760
memory uh allocated from the system via
00:14:06.040
maloc uh through this feature we saw 2
00:14:08.639
to 10 perfor% performance improvements
00:14:10.800
in various
00:14:12.519
benchmarks other garbage collectors are
00:14:14.959
able to implement truly dynamic slot
00:14:17.040
sizes without the slot size limitation
00:14:18.880
that variable width allocation has so
00:14:21.440
then this will allow us to be able to
00:14:23.360
further improve per performance and
00:14:25.279
decrease memory usage
00:14:28.000
so now we've seen some of the reasons
00:14:30.720
why using a different garbage collector
00:14:32.880
implementation could improve performance
00:14:34.720
for Ruby so you might be asking why we
00:14:37.120
don't just rewrite Ruby's garbage
00:14:39.000
collector so we chose to not replace
00:14:41.839
Ruby's current garbage collector for
00:14:43.600
several reasons all of the garbage
00:14:45.920
collectors are a trade-off between
00:14:47.600
performance memory usage and complexity
00:14:50.560
ruby's mark and sweep garbage collector
00:14:52.720
is great at reducing memory usage and
00:14:55.040
has relatively low complexity ruby
00:14:57.920
programs range from small scripts that
00:14:59.680
take milliseconds to run to web
00:15:01.600
application that serve a huge amount of
00:15:03.600
requests so there's really no garbage
00:15:05.760
collector that is one
00:15:07.639
sizefits-all ruby's current garbage
00:15:09.600
collector is also very battle tested and
00:15:11.760
provides a good balance of performance
00:15:13.839
memory usage and stability so we don't
00:15:16.399
want to replace it and also if we did
00:15:19.040
replace the Ruby garbage collector and
00:15:21.199
then in the future we wanted to switch
00:15:23.120
to a even higher performing garbage
00:15:25.199
collector we would need to repeat this
00:15:27.120
process all over
00:15:28.839
again so to solve this problem we are
00:15:31.519
working on modular garbage collectors
00:15:33.680
this allows the user to switch garbage
00:15:35.920
collector implementations depending on
00:15:37.760
their requirements and workload this was
00:15:40.320
released as an experimental feature in
00:15:42.639
Ruby 3.4 for and implements an API to
00:15:45.680
implement alternate garbage collectors
00:15:47.839
and a mechanism to load them into Ruby
00:15:50.880
this feature was introduced in ticket
00:15:53.000
20470 and you can read it if you're
00:15:55.120
interested in more details the first
00:15:57.839
part of this feature involved splitting
00:15:59.680
Ruby's garbage collector into two parts
00:16:02.160
previously Ruby's garbage collector
00:16:03.839
lived in a file called GC c which was
00:16:06.000
compiled into the Ruby binary so we
00:16:08.959
first split the this file into two the
00:16:11.279
existing GC.c C file no longer contains
00:16:13.839
any garbage collector implementation
00:16:16.079
rather it only interfaces the Ruby VM
00:16:18.959
for the GC and so this does things like
00:16:22.240
start and stop reactors for running
00:16:24.320
garbage collection cycles this new file
00:16:26.959
which is under the garb under the GC
00:16:28.959
directory called default C contains the
00:16:31.600
actual implementation of the mark and
00:16:34.079
sweep GC that is in Ruby this allows us
00:16:37.519
to build the default GC separately from
00:16:40.240
the Ruby binary and then the GC uh
00:16:43.120
implementation is loaded into Ruby as a
00:16:45.519
shared library similar to how native
00:16:47.839
extensions are loaded and then the
00:16:49.920
communication between the default GC and
00:16:52.000
Ruby is done through the modular GC
00:16:54.680
API the GC implementation communicates
00:16:57.440
with Ruby through the GC API this API
00:17:00.240
contains two parts the API that the GC
00:17:02.560
implementation is required to implement
00:17:04.880
and the API that the GC implementation
00:17:06.959
is allowed to use in
00:17:08.679
Ruby this header files contains the
00:17:11.120
function prototypes for all of the
00:17:12.720
functions that the GC implementation is
00:17:14.559
required to implement this file lives
00:17:16.799
under
00:17:18.280
GC/GCIPOLE.h and there are about four
00:17:20.480
about 60 functions split across 14
00:17:22.559
sections and let's take a look at some
00:17:24.400
of these in greater detail so the first
00:17:26.559
section are the bootup functions these
00:17:28.400
are ran a boot to allocate data
00:17:30.000
structures for the garbage collector and
00:17:31.600
do and perform setup there are many
00:17:33.840
functions here because there are various
00:17:35.440
bootup phases in Ruby so we have object
00:17:38.000
space allocation which allocates the
00:17:39.919
data structures for the garbage
00:17:41.200
collector and this is ran before the VM
00:17:43.600
has booted because we need a functional
00:17:45.679
garbage collector for the VM to boot uh
00:17:48.400
we also have functions to allocate data
00:17:50.320
structures for ractors and another
00:17:52.320
initializer function that is ran after
00:17:54.480
the Ruby VM has booted which allows the
00:17:56.799
GC to define methods and do other
00:17:59.120
operations that require a functional VM
00:18:01.280
to work we also have shutdown functions
00:18:03.919
which uh performs cleanup and freeze
00:18:06.080
data structures in the GC we also have
00:18:08.640
functions involving uh running a GC
00:18:11.200
cycle such as manually running a GC uh
00:18:14.000
disabling and enabling the GC and
00:18:16.320
setting configuration we have object
00:18:18.400
allocation which asks asks the GC for an
00:18:21.280
object of a specific size and then the
00:18:23.360
GC returns a pointer to the object that
00:18:25.360
was allocated and we have functions for
00:18:27.440
the GC to mark an object as as alive and
00:18:30.080
to pin an object to ensure that it does
00:18:31.760
not move and we also have write barriers
00:18:33.919
which are callbacks so that the GC know
00:18:35.760
when an object has a new reference to
00:18:37.919
another
00:18:38.919
object so that was the API that the GC
00:18:41.679
implementation is required to implement
00:18:44.080
so to ensure that the GC implementation
00:18:46.160
has abstraction from Ruby internals the
00:18:49.039
GC implementation is only required is
00:18:51.679
only allowed to use the public uh Ruby C
00:18:54.799
API and this list of functions this is
00:18:58.080
defined in
00:18:59.799
GC/GC.h this has functions that are not
00:19:02.320
exposed through the C API uh that the GC
00:19:05.120
implementation may want to use and let's
00:19:07.280
take a look at some of these in greater
00:19:08.640
detail
00:19:10.160
so we have functions for locking and
00:19:12.160
unlocking the Ruby VM as well as
00:19:14.080
functions to stop reactors this is
00:19:16.080
needed to run garbage collection cycles
00:19:18.000
because right now reactors cannot run
00:19:20.160
while a garbage collector is being run
00:19:23.039
uh we have functions to mark the
00:19:24.880
children the the child objects of a
00:19:26.880
particular object remember the GC
00:19:29.280
implementation is responsible for doing
00:19:31.200
the marking but the GC implementation is
00:19:33.760
not aware of how Ruby objects work
00:19:36.400
internally and so it is the
00:19:38.400
responsibility of the Ruby VM to walk
00:19:41.120
the references of an object so this
00:19:43.600
allows us to change uh how the data
00:19:45.919
structures of Ruby objects work without
00:19:48.559
needing to change the GC implementation
00:19:51.240
itself we also have a function that is
00:19:53.760
called after the GC has determined that
00:19:55.919
an object is dead and this is called
00:19:57.840
before the object is recycled by the GC
00:20:00.559
so then Ruby can perform cleanup such as
00:20:02.559
freeing memory or closing file
00:20:03.919
descriptors so we've now seen a brief
00:20:06.720
overview of how the modular GC API works
00:20:09.440
to build it you can follow the guide in
00:20:11.440
the linked readme here's what it looks
00:20:13.840
like it explains the steps to build Ruby
00:20:15.760
with modular GC enabled how to use the
00:20:18.320
two GC implementations that how to build
00:20:20.640
the two implementations that ship with
00:20:22.400
Ruby and how to use them uh the two
00:20:25.360
implementations that ship with Ruby is
00:20:27.760
the default GC and MMTK the default GC
00:20:31.679
is the GC that comes with Ruby but you
00:20:34.159
can modify it build it separately and
00:20:35.919
then load it into Ruby mmtk stands for
00:20:38.880
the uh memory management toolkit and
00:20:40.960
I'll talk about that in the next
00:20:42.840
section so this is part of the modular
00:20:45.440
GC work we are working on the first user
00:20:48.080
of the modular GC API that is not the
00:20:50.720
default GC the first user is MMTK which
00:20:54.320
stands for memory management toolkit
00:20:56.640
it's a language agnostic framework of
00:20:59.200
garbage collectors it is written in Rust
00:21:01.520
and is developed by the Australian
00:21:03.280
National University mmtk contains a
00:21:06.480
collection of garbage collection
00:21:08.000
algorithms and once a language plugs
00:21:10.159
into MMTK we can use a wide variety of
00:21:13.280
GC algorithms
00:21:15.440
mmtk is currently integrated with many
00:21:17.760
languages including uh Java on the Jes
00:21:20.400
VM and open JDK Julia V8 JavaScript
00:21:24.000
runtime and of course Ruby so let's take
00:21:27.520
a look at some of the features that MMTK
00:21:29.960
provides so as I mentioned before MMTK
00:21:33.360
provides a wide variety of GC algorithms
00:21:35.600
which it calls plans depending on how
00:21:38.400
much of the MMTK binding that you've
00:21:40.240
implemented you may only be able to use
00:21:42.159
some of these plans so first there's no
00:21:44.960
GC which as the name implies uh only
00:21:47.120
allocates objects and never reclaims
00:21:49.039
them this is useful for testing that
00:21:51.039
object allocation works without needing
00:21:52.880
to implement any garbage
00:21:55.000
collection mmtk also has the uh mark and
00:21:58.159
sweep garbage collector which is similar
00:21:59.760
to how the one in Ruby works and this is
00:22:02.000
the simplest garbage collector that
00:22:03.679
collects garbage but doesn't move any
00:22:05.600
objects we have mark and compact which
00:22:07.919
is similar to Ruby's garbage collector
00:22:09.679
with uh compaction enabled and moves
00:22:12.320
objects around to reduce fragmentation
00:22:14.880
we've also briefly discussed semi-space
00:22:16.880
copying before and MMTK also has this
00:22:19.440
but but it also doesn't support pinning
00:22:21.600
because uh all of the objects must be
00:22:23.360
able to move uh immix was also briefly
00:22:26.799
discussed earlier this is a higher
00:22:28.400
performing garbage collector that
00:22:29.840
supports pinning and there is a work
00:22:32.000
inprogress implementation of LXR in MMTK
00:22:35.360
this is currently only integrated with
00:22:36.960
the open JDK and LXR stands for latency
00:22:40.400
critical image with reference counting
00:22:42.480
it uses reference counting for
00:22:44.240
performance on reclaiming data objects
00:22:46.720
and IMIX for object movement to reduce
00:22:49.480
fragmentation the team saw significant
00:22:52.320
uh performance improvements when using
00:22:54.080
LXR over the default garbage first uh
00:22:56.960
garbage collector in the open JDK
00:23:00.240
mmtk natively supports uh parallelism by
00:23:03.440
using a concept called work packets each
00:23:06.400
work packet contains isolated work that
00:23:08.960
can be done on a different thread and so
00:23:11.280
then multiple threads can pick up
00:23:12.720
different work packets and then perform
00:23:14.559
tasks in parallel so there are currently
00:23:17.280
two implementations of MMTk for Ruby so
00:23:20.960
the original implementation of MMTk in
00:23:23.280
Ruby was was implemented by the MMTK
00:23:25.919
team this work started in 2020 and
00:23:28.400
continues to this day they use a fork of
00:23:31.039
Ruby and has implemented novel ideas
00:23:33.360
such as sidecar objects to reduce the
00:23:35.520
amount of memory allocated from the
00:23:37.120
system the concept of potentially
00:23:39.200
pinning parents to increase performance
00:23:41.440
for object movement and combining
00:23:43.440
marking an object movement into into a
00:23:45.760
single phase
00:23:47.679
with the introduction of modular GC
00:23:49.679
we're porting over work from the MMTK
00:23:52.000
team's implementation and implementing
00:23:53.919
it on top of modular GC we've so far
00:23:56.960
we've implemented no GC mark and sweep
00:23:59.360
and non-moving imix on modular GC so the
00:24:03.200
source code for all of this lives in the
00:24:05.320
Ruby/MMTK repository and uh please check
00:24:08.159
it out if you're interested so let's see
00:24:10.640
at a high level how MMTk is plugged into
00:24:13.360
Ruby using the modular GC API so inside
00:24:17.279
of the MMTK bindings we have a layer
00:24:20.080
that is written in C uh that interacts
00:24:22.720
with the modular GC API in Ruby uh the C
00:24:26.000
bindings here uh provide an extension of
00:24:28.720
Ruby for MMTK and and provides
00:24:31.200
implementation for things such as uh
00:24:33.679
stopping the Ruby VM marking an object
00:24:36.000
or freeing a dead
00:24:37.640
object the C bindings in MMTK then
00:24:40.480
interact with bindings written in Rust
00:24:43.039
and then the rust bindings interact with
00:24:44.799
MMTK core mmtk core here is installed as
00:24:48.000
a cargo dependency and uh MMTK core is
00:24:51.600
written in Rust mmtk core contains the
00:24:54.960
actual underlying implementation of the
00:24:57.200
garbage collector but the Rust bindings
00:24:59.760
in the Ruby/MMTK repository provides
00:25:02.799
extensions to MMTK core for Ruby and
00:25:06.080
implements things like defining work
00:25:07.919
packets for marking and freeing objects
00:25:11.760
so so so we've only worked on the
00:25:13.440
modular GC version of MMTK for less than
00:25:16.240
a year we are still currently far behind
00:25:19.200
the MMTK team's implementation and
00:25:21.360
because of this we are much slower than
00:25:23.120
their implementation and we are also
00:25:25.200
quite a bit slower than the Ruby's
00:25:27.039
default GC as well and um here are some
00:25:31.200
of the features that we are working on
00:25:33.200
that will significantly improve
00:25:34.799
performance so right now we only support
00:25:37.840
non-moving plans such as mark and sweep
00:25:39.840
and non-moving imix so we're adding
00:25:42.159
support for moving plans that will allow
00:25:44.159
imix to move objects which will improve
00:25:46.320
performance and reduce memory usage so
00:25:49.440
one detail I talked I did not talk about
00:25:52.159
the Ruby garbage collector uh is that it
00:25:54.480
is generational which means that objects
00:25:56.880
are young when they are first allocated
00:25:58.480
and become old when they survive for a
00:26:00.480
long time this speeds up the garbage
00:26:02.480
collector because we can run a garbage
00:26:04.080
collector that only scans young objects
00:26:06.000
since they are most likely to be dead
00:26:08.400
mmtk supports generational garbage
00:26:10.559
collector through plants such as sticky
00:26:12.640
imix which implements a sticky mark bit
00:26:15.120
to to only scan newly allocated objects
00:26:17.600
in a garbage collection
00:26:19.400
cycle mmtk operates on units of work
00:26:22.640
packets so the more that we can split
00:26:24.799
the work into smaller packets the more
00:26:26.880
mmtk can take advantage of parallelism
00:26:29.120
and thus improve performance
00:26:31.440
since Ruby's garbage collector has been
00:26:33.360
single threaded up until now there has
00:26:35.679
not been any work on making various
00:26:37.440
phases of the Ruby uh of the garbage
00:26:40.159
collection uh thread safes for example
00:26:43.440
you might expect that we can free two
00:26:45.440
different dead objects on two different
00:26:47.200
threads but uh the two different objects
00:26:50.720
may be accessing common data structures
00:26:53.120
and thus have race conditions
00:26:56.480
so in conclusion uh today we took a look
00:26:59.200
at how the Ruby garbage collector works
00:27:01.600
the motivation and implementation of
00:27:03.600
modular garbage collectors and the MMTK
00:27:06.640
integration with Ruby if you're
00:27:09.200
interested in building your own garbage
00:27:11.039
collector you can do that today with
00:27:12.880
modular GC uh note that this API is
00:27:16.400
experimental and subject to change so
00:27:19.360
don't use it in production just yet i'm
00:27:22.000
looking forward to seeing what you can
00:27:23.919
build you can find a copy of these
00:27:26.480
slides at this QR code if you have any
00:27:28.960
questions feel free to ask me after this
00:27:31.200
talk or through email or through social
00:27:33.120
media thank you for coming to my talk