Summarized using AI

From C extension to pure C: Migrating RBS

Alexander Momchilov • April 17, 2025 • Matsuyama, Ehime, Japan • Talk

In his talk at RubyKaigi 2025, Alexander Momchilov discusses the migration of RBS (Ruby Signature) from a Ruby gem with a C extension to a pure C library. This transition aims to enhance performance, memory efficiency, and portability, making RBS usable with various Ruby tools such as Prism, Sorbet, JRuby, and TruffleRuby.

Key Points:

- Introduction to RBS: RBS is a standard notation for type annotations in Ruby, used for static type checking without being executed by the Ruby VM.

- Why Migrate to Pure C: The original RBS depended on Ruby VM features which limited portability and multi-threading capabilities, particularly due to the Global VM Lock (GVL). Moving to pure C allows multiple threads to work in parallel.

- Maintaining Developer Experience: Inline RBS, where type annotations are written as comments within Ruby files, was introduced to improve developer experience over separate RBS files. Because the signatures sit next to the code they describe, they are easier to keep in sync as that code changes.

- Performance Benefits: The pure C implementation is faster and more memory-efficient. Because the parser no longer needs the Ruby VM, type checkers such as Sorbet (written in C++) can parse multiple RBS files in parallel without being constrained by Ruby's GVL.

- Handling C Programming Challenges: Transitioning to pure C necessitates addressing error handling, memory management, and the use of C libraries and data structures instead of Ruby objects or exceptions.

Specific Examples Discussed:

- Error Handling: C does not have built-in exception handling like Ruby. The solution involves returning boolean values to indicate success or failure, with error propagation handled manually.

- Memory Management: Switching from Ruby's garbage collection to manual memory management required implementing custom allocators that allow for simpler object lifecycle management and bulk freeing of objects.

- Conclusion and Takeaways: The presentation highlights the benefits of using a pure C model, including improved portability across different Ruby environments and greater control over performance. Developers are encouraged to adopt careful practices around error handling and memory management to maximize the performance benefits of using C libraries in Ruby.

Date: April 17, 2025
Published: May 27, 2025

Learn how we migrated RBS to remove its dependency on the Ruby VM and expose a new C API. In addition to being faster and more memory-efficient, it's now more portable: tools like Prism, Sorbet, JRuby and TruffleRuby will be able to use RBS directly. Type checkers like Steep and Sorbet will now be able to parse multiple RBS files in parallel, unconstrained by the GVL.

The Ruby VM offers many luxuries that can help ease C extension development, such as garbage collection, exceptions, and the many built-in data structures like `Array` and `Hash`. Unfortunately, to be maximally portable and multi-threaded, some C extensions like RBS and Prism will need to forego these conveniences. We'll show techniques for replicating them in pure C.

Join us to explore advanced techniques in writing C extensions and see how this universal RBS parser paves the way for improved tooling and collaboration in the Ruby ecosystem.

https://rubykaigi.org/2025/presentations/amomchilov.html

RubyKaigi 2025

00:00:04.480 all right good afternoon everyone
00:00:06.160 welcome to my first ever
00:00:08.320 conference talk it's called from C
00:00:10.240 extension to pure C it's a story of how
00:00:12.639 my team and I migrated RBS from being a
00:00:15.280 Ruby gem with a C extension to a Pure C
00:00:19.000 library let's jump into it first I'll
00:00:21.520 introduce myself my name is Alexander
00:00:23.039 Momchilov I'm a staff developer at the
00:00:25.119 Ruby developer experience team at
00:00:26.599 Shopify I flew in from Toronto Canada
00:00:30.080 and I'm actually one of three different
00:00:31.920 presentations that my team is giving
00:00:33.360 this year about type checking and static
00:00:35.800 analysis yesterday my coworker Vinicius
00:00:38.399 Stock presented embracing Ruby magic
00:00:40.559 statically analyzing uh DSLs and that
00:00:43.760 was yesterday so you can either jump in
00:00:45.200 your time machine go back and see it
00:00:47.120 then or you can just wait about a month
00:00:48.719 to see the recording there's this talk
00:00:51.440 now and tomorrow my coworker Alexandre
00:00:53.920 is presenting inline RBS comments for
00:00:56.320 seamless type checking with Sorbet and
00:00:58.160 that's in the same hall tomorrow at
00:01:00.120 14:00 and so yes there's actually two
00:01:02.559 different Alexes presenting RBS related
00:01:04.960 talks just from my team alone and if your
00:01:07.680 name is also Alex we're hiring you can
00:01:09.760 join and make an RBS talk of your own no
00:01:12.880 just kidding any name can apply Shopify
00:01:15.040 is always looking for exceptional talent
00:01:16.640 so if you're interested please scan the
00:01:18.000 QR code I'll leave it up for a moment
00:01:19.439 longer
00:01:20.720 but let me get into what is RBS and what
00:01:22.720 are we going to talk about today so in
00:01:24.799 short RBS stands for Ruby
00:01:27.000 signature and it's a standard notation
00:01:29.119 for describing the types of various
00:01:30.960 constructs in your code so you can
00:01:32.799 annotate what you expect the types of
00:01:34.479 your local variables instance variables
00:01:36.640 method return values parameters and so
00:01:38.799 on to be at
00:01:40.439 runtime originally RBS was written in
00:01:43.200 separate RBS files these are files that
00:01:45.840 aren't executed by the Ruby virtual
00:01:47.439 machine they're not even seen by the
00:01:48.880 interpreter at all instead they sort
00:01:51.520 of exist in parallel in your same repo
00:01:53.920 for static analysis tooling to read not
00:01:56.000 for actual Ruby code to
00:01:57.799 execute inline RBS was a sort of
00:02:00.399 development that let you write the
00:02:02.640 same RBS signatures that you had in your
00:02:04.399 RBS files but now inline in your RB
00:02:07.119 files in comments I'll show you
00:02:09.759 examples of both of these when your
00:02:11.680 programs have RBS signatures in them you
00:02:13.520 can use a type checker to automatically
00:02:15.040 validate the correctness of your program
00:02:16.720 and to make sure that all the types line
00:02:18.480 up if you're passing a string to some
00:02:20.319 function that expects an integer it will
00:02:21.920 tell you that something went wrong so
00:02:24.160 let me give an example of some RBS files
00:02:26.239 and how inline RBS improves upon that
00:02:28.879 so this is an example RBS file that
00:02:30.560 you'll note is called message.rbs like I
00:02:33.280 mentioned earlier the Ruby interpreter
00:02:34.720 will never see this code even though it
00:02:36.400 looks a lot like Ruby it's not actually
00:02:38.080 Ruby it just sort of follows the same
00:02:39.760 flavor you'll notice these methods don't
00:02:41.599 even have bodies they don't even have an
00:02:42.800 end keyword at the end instead it uses
00:02:45.599 this colon to denote the start of a type
00:02:48.239 annotation so we have some attribute
00:02:50.160 readers here for some attributes of the
00:02:51.519 Message class that have the types User
00:02:53.440 String and optional Message and then
00:02:55.840 functions are annotated with this arrow
00:02:57.519 notation to say what arguments they take
00:02:59.200 and what return values they return now
00:03:02.159 you can imagine why this would be pretty
00:03:03.519 cumbersome to sort of keep in sync when
00:03:05.200 I rename my Message class or perhaps I
00:03:07.120 rename reply to reply_to I would have to
00:03:09.760 rename it in both my real Ruby code and
00:03:11.840 then update the RBS as well and that's
00:03:13.920 just not a good developer
00:03:16.280 experience inline RBS improves upon this
00:03:18.879 by taking those same signatures and
00:03:20.560 stuffing them into comments the Ruby
00:03:22.800 virtual machine just thinks that these
00:03:24.080 are regular old comments but static
00:03:25.760 analysis tooling can look for the
00:03:27.200 special colon that comes after the
00:03:28.640 hashtag to know that this isn't any old
00:03:30.480 comment in English this is a comment
00:03:32.159 that's meant to be in RBS format and so
00:03:34.799 now this would be the one file that you
00:03:36.159 would have in your repository you don't
00:03:37.440 need a separate RBS file and you can
00:03:40.080 have your real method implementations
00:03:41.280 you see now there's uh method bodies
00:03:43.680 with end keywords and so on and you can
00:03:46.159 see it's the same syntax as
00:03:48.200 before so why were we interested in RBS
00:03:51.599 we don't actually use the main type
00:03:53.599 checker for RBS Steep at Shopify instead
00:03:56.959 we use Sorbet which has this really cute
00:03:59.439 cone mascot uh which is a static type
00:04:02.400 checker that's written in C++ so it's
00:04:04.239 blazingly fast it's also really highly
00:04:06.560 memory optimized so it scales to the
00:04:09.040 needs of our really huge
00:04:10.840 monolith Sorbet is able to type check
00:04:13.439 about 9 million lines of our main
00:04:14.959 monolith in under a minute it's actually
00:04:16.880 really impressive on the performance
00:04:18.600 front so our team liked the Sorbet type
00:04:21.199 checker but we wanted to replace the
00:04:22.479 Sorbet SIG syntax the SIG syntax is a
00:04:25.759 Ruby DSL that is how Sorbet code
00:04:28.240 expresses the types instead of these RBS
00:04:30.240 comments and that's all you could use up
00:04:32.320 until today
00:04:34.080 because it's a Ruby DSL it involves
00:04:36.639 actually calling methods that need to
00:04:38.080 exist at runtime and they're provided by
00:04:39.759 the sorbet-runtime gem this was a
00:04:42.639 big limitation if you were a library
00:04:44.320 author and you wanted to use Sorbet
00:04:46.000 signatures to improve the resilience and
00:04:47.600 robustness of your code you would not
00:04:49.840 only have to use sorbet-runtime in your
00:04:51.759 own development of the gem but you would
00:04:53.600 also force all the consumers of the gem
00:04:55.199 to also depend on sorbet-runtime
00:04:56.720 indirectly and that's just too tall of
00:04:58.960 an order it's something we wanted to fix
00:05:02.240 also we found that signatures in the
00:05:04.639 Sorbet syntax are really verbose and
00:05:06.639 clunky and let me show you an example of
00:05:10.120 that on the left here we have a point
00:05:12.639 class that is expressed uh that has its
00:05:14.960 types expressed using RBS comments and
00:05:17.039 on the right we have the exact same
00:05:18.639 material expressed using Sorbet sigs
00:05:21.199 you'll notice that you have all these
00:05:22.320 method calls to sig which are passed a
00:05:24.080 block and within that block various DSL
00:05:26.320 methods are called like params and
00:05:27.759 returns given various type objects as
00:05:29.680 parameters
00:05:31.039 the sig method has to actually be called
00:05:32.960 at runtime and that's where you see this
00:05:34.479 extend T::Sig module being um extended and
00:05:38.400 this has a runtime representation that
00:05:39.840 we would like to get rid of ideally so
00:05:42.720 what did we want to do well we wanted
00:05:44.880 the brilliant like Dragon Ball Z fusion
00:05:47.840 of RBS and Sorbet together we want to
00:05:50.720 have the sort of elegance and terseness
00:05:52.560 of the RBS syntax but we wanted to keep
00:05:54.560 Sorbet as the fast and performant
00:05:56.639 backend type checker that's going to
00:05:58.080 actually consume these uh comments
00:06:00.400 and do something with
00:06:02.759 them so that's exactly what we've done
00:06:06.319 here's a quick little demo where we have
00:06:07.680 a method called should return a string
00:06:09.840 it takes an integer it returns a string
00:06:11.840 as the Sorbet sig very verbosely
00:06:14.800 explains and you see we just call i.to_s
00:06:17.600 there's no complaints now if I change
00:06:20.000 this and I replace it just to be a
00:06:21.280 number you'll see immediately that the
00:06:22.479 Sorbet LSP jumps in and says like "Hey
00:06:24.080 we expected a string but you gave me a
00:06:25.759 number." I'll change it back to a string
00:06:27.840 and we're okay again change it to a
00:06:29.520 number oh my god we can't have a number
00:06:31.039 here it needs to be a
00:06:32.759 string now if I remove this sorbet
00:06:35.600 signature and I replace it with an RBS
00:06:38.600 comment you'll see that when I change
00:06:40.479 the code
00:06:41.560 below we get the exact same type errors
00:06:43.919 as before
00:06:45.639 woohoo change it to a string it works
00:06:48.080 again the really great part is that we
00:06:50.319 don't need to extend T::Sig and we don't
00:06:52.400 even need to use the sorbet-runtime gem
00:06:54.240 we can remove the requires and we can
00:06:55.680 remove it from our Gemfile because
00:06:57.120 we're no longer calling methods in our
00:06:58.560 Ruby code anymore and there's an open
00:07:00.880 pull request to merge our work which
00:07:02.960 we're hoping to have landed really soon
00:07:04.880 we're already sort of experimenting with it
00:07:06.000 internally at
00:07:07.400 Shopify so what was so hard about this
00:07:09.919 what warrants making a conference talk
00:07:11.599 about the work that we've done here to
00:07:13.680 explain that I'm going to have to give a
00:07:15.360 really brief overview of the history of
00:07:17.199 RBS and how it was sort of
00:07:20.120 developed before 2022 Sorbet was a sorry
00:07:24.720 not Sorbet RBS RBS was a pure Ruby gem
00:07:27.680 that had an internal parser that was
00:07:29.120 written in Ruby that was accessed via a
00:07:32.160 regular old Ruby API you would call the
00:07:33.759 parse uh method on RBS
00:07:36.199 parser in
00:07:38.280 2022 Soutaro rewrote the internal
00:07:41.199 implementation of this parser to use a C
00:07:43.599 extension so now you still have the same
00:07:45.360 Ruby API but now implemented with a C
00:07:47.520 extension in the background
00:07:49.759 uh of critical importance is that this
00:07:52.080 C code still makes very heavy use of
00:07:54.560 Ruby VM features from the C extension
00:07:56.560 API and that's something that we're
00:07:58.240 going to need to address and I'll get
00:07:59.360 into
00:08:00.120 why our work was to take this a step
00:08:02.800 further and make it so that this is a
00:08:04.879 pure C parser that does not use the Ruby
00:08:06.960 extension APIs at all so while the Ruby
00:08:10.639 VM still exists like it sorry the Ruby
00:08:12.720 API exists like it was before and Ruby
00:08:14.800 apps can call it we now have this new C
00:08:19.160 API so let's look at what this achieves
00:08:23.199 existing users of RBS such as Ruby LSP
00:08:26.000 which is the tool that powers the rich
00:08:27.520 editor integrations into your IDE
00:08:30.160 and type checkers like Steep can
00:08:32.080 continue to use RBS as if it were a
00:08:34.240 regular Ruby gem and they'll just call
00:08:35.839 regular method uh Ruby methods but what
00:08:38.880 we've unlocked is for tools that don't
00:08:40.560 want to necessarily call Ruby code to be
00:08:42.640 able to make use of RBS as a library and
00:08:44.959 so for example we'll have a new type
00:08:46.959 checker or a different type checker like
00:08:48.399 Sorbet which is written in C++ and even
00:08:51.760 alternative Ruby runtimes like JRuby
00:08:53.680 can make use of this and they can just
00:08:56.240 make calls to the C code and the JRuby
00:08:59.600 uh example is a particularly good one
00:09:01.680 where JRuby can handle Ruby gems that
00:09:04.640 just have Ruby code it can also make
00:09:06.320 foreign function interface calls into C
00:09:08.160 libraries but Ruby extensions uh sorry
00:09:11.120 Ruby uh gems with C extensions are this
00:09:13.680 middle ground that it can't support and
00:09:15.440 we fixed that for this context
00:09:19.360 so why was the Ruby dependency a problem
00:09:22.320 well the first problem was that using
00:09:24.160 any of these Ruby features of the Ruby
00:09:25.600 VM requires holding the global VM lock
00:09:28.240 when you do that you limit your CPU
00:09:30.080 parallelism to just a single thread
00:09:31.600 unless you start to incorporate ractors
00:09:34.000 and this was going to be a really big
00:09:35.440 issue for us because one of the ways
00:09:37.120 that Sorbet is able to be so fast is
00:09:38.480 that it can parse all your files in
00:09:40.000 parallel on multiple threads within the
00:09:42.240 same process if we're using RBS as a
00:09:45.040 Ruby gem the GVL would just limit the
00:09:47.440 parallelism of that down to just a
00:09:49.120 single core doing real work at a
00:09:51.560 time another thing to note is that like
00:09:53.760 I mentioned earlier MRI's C extension
00:09:55.519 API is not portable and would preclude
00:09:57.920 alternative Ruby runtimes from being able
00:09:59.519 to use the API and lastly there's also a
00:10:02.480 peripheral benefit around performance
00:10:04.480 where with C code we have a bit more
00:10:05.920 control over the memory layout of things
00:10:07.680 and we can tune things to use less
00:10:09.440 memory and less CPU
00:10:12.320 so if there's all these benefits one
00:10:14.079 might ask you know why was RBS using
00:10:17.040 Ruby in the first place inside of its
00:10:19.120 rewritten C parser why didn't it just
00:10:22.079 use pure C code from the beginning well
00:10:24.320 the reason is that the Ruby VM is really
00:10:25.839 convenient it gives us a lot of really
00:10:27.360 great features that we're going to have
00:10:29.200 to give up and
00:10:30.440 replace the first of these is error
00:10:32.560 handling via exceptions C doesn't have
00:10:34.720 exceptions so we're just going to need
00:10:36.399 to do something else the next is memory
00:10:39.440 management so the Ruby VM provides a
00:10:41.920 dynamic memory allocator and a garbage
00:10:43.760 collector that will take care of freeing
00:10:45.680 memory for you when you're done using it
00:10:47.920 and so we're going to need to do
00:10:48.880 something about that and there's two
00:10:50.720 more points that I'll mention that I
00:10:51.760 won't elaborate as much the first that
00:10:54.079 you might be familiar with is that the
00:10:55.600 Ruby standard library provides us with a
00:10:57.040 lot of really convenient classes and
00:10:58.399 data structures you can think of array
00:11:00.000 hash there's queues and so on so we're
00:11:02.480 going to need to find replacements for
00:11:04.079 those one suggestion that I'll make here
00:11:06.240 is that we should resist the temptation
00:11:08.240 to just hand roll all our own data
00:11:09.760 structures i would really recommend
00:11:11.360 trying to use well-tested high
00:11:13.200 performance container libraries um that
00:11:15.519 exist for C code and lastly I'll
00:11:18.160 mention that Ruby as an object-oriented
00:11:20.399 programming language gives us
00:11:21.839 polymorphism we can have methods that
00:11:23.519 are dynamically dispatched so that when
00:11:25.360 we send a message to an object the Ruby
00:11:27.440 VM will automatically figure out for us
00:11:29.200 what's the class of this object and
00:11:30.560 what's the correct implementation of
00:11:32.560 that message for that
00:11:34.360 object so let's dive into the first two
00:11:36.720 of those in detail the first Ruby
00:11:38.800 convenience is error
00:11:41.240 handling so idiomatic Ruby code handles
00:11:44.000 errors via exceptions and this is really
00:11:46.880 great when we raise an exception it's
00:11:48.560 like a kind of deep return like a return
00:11:50.880 statement we're going to stop the
00:11:52.240 execution of a current function but
00:11:54.240 unlike a return statement it has this
00:11:55.600 deep quality where we can actually jump
00:11:56.959 multiple layers of the call stack
00:11:59.600 we will interrupt the parent function
00:12:01.519 and the parent's parent function and so
00:12:03.200 on all the way up until we reach the
00:12:04.959 most recent rescue block it's like a
00:12:06.720 kind of teleportation that we can do the
00:12:09.200 Ruby C API exposes these with APIs like
00:12:11.680 rb_raise and rb_rescue so let's see an
00:12:14.079 example of how that might have been used
00:12:15.279 and what we'll do instead here we have a
00:12:18.800 fictional function called parse
00:12:20.720 optional tuple a tuple in RBS is just a
00:12:23.360 fixed length array of potentially
00:12:25.120 different types so for example you might
00:12:27.040 have a tuple of a string an integer and
00:12:28.639 a bool that could be packed together an
00:12:30.720 optional tuple is just a type that
00:12:32.320 describes either those things or nil so
00:12:34.480 you could have either value uh be passed
00:12:37.040 to a parameter of this type so we'll see
00:12:39.519 that to parse an optional tuple well
00:12:41.120 first we're going to parse a tuple and
00:12:42.800 then we're going to see is it followed
00:12:44.000 by a question mark if it is then we have
00:12:45.760 an optional tuple if it's not we have
00:12:48.240 just a regular tuple looking at the
00:12:50.720 implementation of parse tuple we'll see
00:12:52.480 that one of the first things we're going
00:12:53.519 to want to do is assert that we have an
00:12:55.920 opening square bracket as the first
00:12:57.279 character if we don't then this isn't a
00:12:59.279 tuple and there's just some kind of
00:13:00.560 syntactic error and so this rb_raise over
00:13:03.279 here that raises a syntax
00:13:04.800 error well that's a Ruby feature and we
00:13:06.720 won't be able to use that so how do we
00:13:09.920 throw exceptions without Ruby that's the
00:13:13.360 cool part uh you don't we're just going
00:13:15.360 to need to figure out something
00:13:16.079 completely different here so we know
00:13:18.399 that rb_raise has got to go we can't keep
00:13:20.800 that what we'll do instead is we'll
00:13:22.399 return false like I mentioned you could
00:13:24.560 think of raising exceptions as a kind of
00:13:26.399 deep return so we could just return a
00:13:28.880 bunch ourselves so we're going to return
00:13:31.600 false in this case and false means that
00:13:33.440 an operation failed likewise if we get
00:13:36.000 to the end of this function and we
00:13:37.360 succeed we're going to want to return
00:13:38.880 true and true says that the operation
00:13:41.079 succeeded but to do this to be able to
00:13:43.279 return booleans where previously we
00:13:44.639 would be returning these RBS nodes we're
00:13:47.040 going to have to change the signatures
00:13:49.279 which is going to be a little
00:13:50.000 inconvenient but we can go through
00:13:51.880 it so we'll take the RBS node pointer
00:13:54.880 return types and we're going to actually
00:13:56.560 move them into the parameter list as out
00:13:59.040 parameters and the actual return types
00:14:01.360 are going to be bool what this means is
00:14:03.199 that these functions are going to return
00:14:04.160 a value not by actually returning it but
00:14:06.240 by storing it into a location of your
00:14:07.839 choosing when we do this we'll no longer
00:14:10.560 be able to call parse tuple in the way
00:14:11.920 that we do near the top here because its
00:14:13.279 return value is just going to be a
00:14:14.160 boolean not the node that we're
00:14:15.839 looking for so we're going to move it
00:14:17.279 onto its own line and we're going to
00:14:19.279 pass to it the local variable into which
00:14:21.360 we want parse tuple to store its result
00:14:24.160 but now we have to ask the question did
00:14:25.680 this succeed and to know that we have to
00:14:28.160 check the boolean that was returned uh
00:14:30.000 when this function was invoked so we'll
00:14:32.560 wrap this in an if statement and we'll
00:14:34.720 say if not parse tuple so if the result
00:14:36.880 is false then we know that we have an
00:14:39.279 error condition and what do we want to do
00:14:40.720 in this case well we're going to return
00:14:41.920 false which is sort of a form of
00:14:44.000 manually bubbling up an error so a key
00:14:47.519 note here is that error propagation is
00:14:49.760 manual and it's kind of inconvenient but
00:14:51.839 that's just what we're going to have to
00:14:52.800 do
00:14:53.480 here now to fill in a few more details
00:14:55.839 you'll notice that well we can only
00:14:57.760 return booleans from these functions so
00:14:59.360 this long return value here is going to
00:15:01.440 be an issue so what are we going to do
00:15:03.600 we're going to assign it to our
00:15:05.440 out parameter we're going to move over that
00:15:08.160 expression and we're going to return
00:15:09.680 true for success at the end
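
The change described here, replacing rb_raise with a bool return value and moving the parsed node into an out parameter, might look roughly like the following C sketch. The `node` and `parser` types, `new_node`, `free_node` and the exact function names are assumptions made for illustration and are not taken from the RBS source; parsing of the tuple's elements is elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified node type; the real RBS AST is much richer. */
typedef enum { NODE_TUPLE, NODE_OPTIONAL } node_kind;

typedef struct node {
    node_kind kind;
    struct node *inner; /* wrapped type, used only by NODE_OPTIONAL */
} node;

/* Hypothetical parser state: just a cursor into the input text. */
typedef struct { const char *cursor; } parser;

static node *new_node(node_kind kind, node *inner) {
    node *n = malloc(sizeof *n);
    if (n != NULL) { n->kind = kind; n->inner = inner; }
    return n;
}

static void free_node(node *n) {
    if (n == NULL) return;
    free_node(n->inner);
    free(n);
}

/* Returns false on a syntax error instead of raising. On success the
 * result is stored through the `out` parameter and true is returned. */
static bool parse_tuple(parser *p, node **out) {
    assert(p != NULL && out != NULL);
    if (*p->cursor != '[') return false; /* previously an rb_raise call */
    p->cursor++;
    /* Element parsing is elided: skip ahead to the closing bracket. */
    while (*p->cursor != '\0' && *p->cursor != ']') p->cursor++;
    if (*p->cursor != ']') return false;
    p->cursor++;
    *out = new_node(NODE_TUPLE, NULL);
    return *out != NULL;
}

static bool parse_optional_tuple(parser *p, node **out) {
    node *tuple = NULL;
    if (!parse_tuple(p, &tuple)) return false; /* bubble the error up by hand */
    if (*p->cursor == '?') {
        p->cursor++;
        node *optional = new_node(NODE_OPTIONAL, tuple);
        if (optional == NULL) { free_node(tuple); return false; }
        *out = optional;
    } else {
        *out = tuple;
    }
    return true; /* success: *out now holds the parsed node */
}
```

A caller would declare `node *result = NULL;`, check the boolean returned by `parse_optional_tuple(&p, &result)`, and only read `result` when that call returned true.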
00:15:11.480 here and so that's an example of how the
00:15:13.839 error management in one of these
00:15:15.279 libraries might look like next I'm going
00:15:18.320 to talk about the second convenience
00:15:19.440 that Ruby provides us which is memory
00:15:20.880 management and I'll talk about it
00:15:22.720 through an example so suppose we have a
00:15:24.880 foo which is some kind of syntactic
00:15:26.560 construct and
00:15:27.800 to parse a foo involves
00:15:30.240 parsing A then parsing B then parsing C
00:15:32.880 these are just all examples now if we
00:15:35.839 get midway through the process and we
00:15:37.279 encounter a syntax error there's no reason
00:15:39.519 for us to keep trying to parse the whole
00:15:41.120 rest of it we want to bail early so
00:15:43.360 let's add some early guard clauses that
00:15:45.279 will return if any one of these steps
00:15:47.880 fails and let's briefly walk through the
00:15:50.320 happy path of what would happen when we
00:15:52.160 call this code to do so I'll bring up a
00:15:54.639 little representation of a runtime stack
00:15:56.240 on the right here and we're going to
00:15:57.440 step through it like a debugger so when
00:15:59.600 we enter our function we have an
00:16:01.120 execution pointer that's going to look
00:16:02.240 at the start of our first line and we
00:16:04.720 have a call stack or a stack frame
00:16:06.079 allocated on our call stack which has
00:16:07.680 uninitialized
00:16:10.240 local variables called A B and
00:16:13.079 C when we call parse A internally it's
00:16:16.079 going to like parse our text it's going
00:16:17.759 to allocate some result object A it's
00:16:19.680 going to return the pointer to it and
00:16:21.519 we're going to assign that assign that
00:16:22.800 pointer into our local variable A
00:16:25.040 similarly we'll call some function parse
00:16:26.959 B it's going to parse some stuff capture
00:16:29.120 that data in a B object and we'll point
00:16:30.800 to it lastly we'll do the same for an
00:16:32.800 object
00:16:33.720 C so all this parsing succeeded and what
00:16:36.639 we want to do is package up this data
00:16:38.600 together so at this point ownership over
00:16:41.600 these objects belongs to the stack
00:16:43.600 frame of our current function call but
00:16:46.000 when we create a new foo object
00:16:48.560 it's going to have pointers to those
00:16:49.920 three objects and when we
00:16:52.240 return the foo object out of our current
00:16:54.920 um function call our stack frame is
00:16:57.680 going to disappear and with it the
00:16:59.519 pointers that we had to those objects
00:17:02.079 this all works great whoever is going to
00:17:04.079 own the foo object after this function
00:17:05.600 call is going to be responsible for not
00:17:07.199 just freeing the foo but also freeing
00:17:09.600 its constituent A B and
00:17:11.400 C but now let's see how this can go
00:17:14.240 wrong if you have an error again we're
00:17:16.240 going to enter our function allocate a
00:17:17.919 stack frame and we're going to step
00:17:19.360 through it suppose parsing the A
00:17:22.000 succeeds we're going to get some A
00:17:23.360 object on the heap and we're going to
00:17:25.039 point to it with our A local variable
00:17:27.039 but now let's say that there
00:17:28.480 is a character that's out of place for
00:17:30.160 our B value so we don't even allocate
00:17:33.200 it because we just can't we don't know
00:17:35.840 what the syntax is that we're trying to
00:17:37.200 parse there it's going to return null
00:17:39.600 we're going to store that in a local
00:17:40.640 variable and on the next line on the
00:17:42.880 if statement we're going to decide to
00:17:44.080 return early at this point our stack
00:17:46.720 frame is destroyed the local variable A
00:17:49.200 stops existing and our object A is still
00:17:52.559 there we haven't freed it so this object
00:17:55.840 exists on the heap but there's no way
00:17:57.440 for us to access it anymore either
00:17:59.679 to use it or to get rid of it it's
00:18:01.440 almost like a helium balloon if you hold
00:18:03.039 onto the string of a helium balloon it's
00:18:04.640 yours you can use it you can have fun
00:18:06.480 and you can throw it out but if you let
00:18:07.919 go of the handle of your helium balloon
00:18:09.200 and you have no other strings attached
00:18:10.559 to it it's going to fly away and you
00:18:12.400 lose control over it this is what we
00:18:15.280 would call a memory leak and so A has
00:18:17.720 leaked now if you just leak one object
00:18:20.160 occasionally it's not great but it's not
00:18:22.400 the worst but it's interesting to note
00:18:24.320 how this works in the case of RBS so you
00:18:27.760 might just start writing some partial
00:18:29.679 RBS code because uh you know your code
00:18:32.640 isn't always syntactically valid as
00:18:34.160 you're typing it you're you're starting
00:18:35.520 on the left and you're typing character
00:18:36.799 by character and almost every time that
00:18:38.640 your IDE is going to parse this code it's
00:18:40.240 not going to be valid so we might start
00:18:41.840 with an empty comment then we might open
00:18:43.760 a brace but that just leaked
00:18:46.080 some objects and then we're going to say
00:18:47.280 well there's a parameter x and it's an
00:18:48.799 integer and we're leaking objects and
00:18:50.640 there's also y and oh god there's a lot
00:18:52.559 of objects and it's going to return void
00:18:54.960 and there's more objects and eventually
00:18:57.440 your IDE will get uh killed by the
00:18:59.520 operating system because you use too
00:19:00.720 much memory if your leaks are too bad we
00:19:03.600 don't want this so how do we fix it well
00:19:06.559 the manual way to do it is going to need
00:19:08.400 a bit more room so I'll move this code
00:19:09.919 over and I'll add some more lines for us
00:19:11.840 to use
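
A minimal sketch of the manual cleanup this part of the talk walks through, assuming hypothetical `part` and `foo` types and a toy `parse_part` helper that "parses" a single expected character. None of these names come from the RBS source. The `live_objects` counter exists only so that a leak would be visible as a non-zero count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the A, B and C syntax nodes and for foo. */
typedef struct { char tag; } part;
typedef struct { part *a, *b, *c; } foo;

/* Number of objects allocated but not yet freed (demonstration only). */
static int live_objects = 0;

static void *alloc_object(size_t size) {
    void *p = malloc(size);
    if (p != NULL) live_objects++;
    return p;
}

static void free_object(void *p) {
    if (p == NULL) return;
    live_objects--;
    free(p);
}

/* Toy parser: consumes one expected character, or returns NULL on a
 * "syntax error" without allocating anything. */
static part *parse_part(const char **cursor, char expected) {
    assert(cursor != NULL && *cursor != NULL);
    if (**cursor != expected) return NULL;
    part *p = alloc_object(sizeof *p);
    if (p == NULL) return NULL;
    p->tag = expected;
    (*cursor)++;
    return p;
}

static foo *parse_foo(const char *input) {
    const char *cursor = input;

    part *a = parse_part(&cursor, 'a');
    if (a == NULL) return NULL;   /* nothing has been allocated yet */

    part *b = parse_part(&cursor, 'b');
    if (b == NULL) {
        free_object(a);           /* A exists, B does not */
        return NULL;
    }

    part *c = parse_part(&cursor, 'c');
    if (c == NULL) {
        free_object(a);           /* A and B exist, C does not */
        free_object(b);
        return NULL;
    }

    foo *result = alloc_object(sizeof *result);
    if (result == NULL) {
        free_object(a);
        free_object(b);
        free_object(c);
        return NULL;
    }
    result->a = a;
    result->b = b;
    result->c = c;
    return result;                /* the caller now owns foo and its parts */
}

/* Whoever owns the foo frees it together with its constituent parts. */
static void free_foo(foo *f) {
    if (f == NULL) return;
    free_object(f->a);
    free_object(f->b);
    free_object(f->c);
    free_object(f);
}
```

Deleting any one of the `free_object` calls on an early-return path reproduces the leak: the function still returns NULL, but `live_objects` stays above zero because nothing points at the orphaned object anymore.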
00:19:13.280 and we are going to need to add these
00:19:14.880 free calls so let's see how they help if
00:19:17.679 we just look at the first segment of
00:19:18.799 this where we parse an A we don't have
00:19:21.919 anything to clean up if parsing A fails
00:19:24.640 then there's no other objects we need to
00:19:26.160 free so we could just return null
00:19:27.440 there's nothing to do here if parsing A
00:19:30.320 succeeded and we got to parsing B and
00:19:32.640 the B failed well when the B fails we've
00:19:35.280 already successfully allocated the A and
00:19:37.520 so we're going to need to free just the
00:19:39.039 A but not the B so now there's the
00:19:42.320 free A now
00:19:44.320 supposing parse B succeeded and we moved
00:19:46.080 on to parsing the C well at this point
00:19:48.559 the A and the B were allocated but not
00:19:50.400 the C so we're going to need two free
00:19:51.799 statements and so what you can see here is
00:19:53.840 that the number of different free calls
00:19:55.360 that you have to add scales pretty
00:19:57.360 unfortunately with the number of exit
00:19:58.960 points and local variables that you have
00:20:00.799 in your function and because we use
00:20:02.880 early returns as a way of modeling
00:20:04.640 errors that introduces even more areas
00:20:06.880 where we early return and need to do
00:20:09.520 manual cleanup and so not only is this
00:20:11.440 quite unpleasant to program but it's
00:20:12.880 really error-prone and you risk either
00:20:14.720 forgetting to free a resource which
00:20:16.080 means you have memory leaks and
00:20:17.280 eventually out-of-memory
00:20:19.080 crashes or potentially you double free
00:20:21.760 the same object you free it twice and
00:20:24.160 you have undefined behavior at best you
00:20:26.880 might have a crash immediately at worst
00:20:28.559 you might have subtly wrong
00:20:30.000 behavior in your library and it's going
00:20:31.440 to be really hard to
00:20:33.080 debug so to fix this let's talk about
00:20:35.360 memory allocators now the C standard
00:20:38.000 library provides a memory allocator that
00:20:39.600 you're probably quite familiar
00:20:41.039 with it's exposed through malloc and free
00:20:44.559 malloc has a deceptively appealing and
00:20:47.039 simple looking interface you give it a
00:20:49.679 number of bytes of memory that you want
00:20:51.360 and it returns back either null if
00:20:53.360 there's no memory or a pointer to a
00:20:55.600 chunk of memory with at least that many
00:20:57.280 bytes available for you to use you use
00:20:59.760 it for as long as you'd like and when
00:21:01.200 you're finished with that memory you
00:21:02.400 pass that same pointer to free and
00:21:04.080 you're done with it
00:21:06.320 now the problem here is that object
00:21:07.840 lifetimes are just too granular like you
00:21:09.679 saw in the previous example we have all
00:21:11.440 these sort of related allocated objects
00:21:13.520 that are sort of spawned around the same
00:21:14.960 time but we have to deal with them on
00:21:16.480 this really fine grained way which is
00:21:18.159 error-prone and it also has a performance
00:21:20.320 cost because we're making so many
00:21:21.760 separate calls to free and if there's
00:21:24.000 one takeaway that I can drive home here it's
00:21:25.520 that even though malloc and free are
00:21:27.280 universal across the C standard library
00:21:29.760 and it's so approachable it actually
00:21:31.919 makes it really difficult to write
00:21:34.400 leak-free and robust code so
00:21:38.559 the core idea that we want instead is to
00:21:40.240 think what if we group objects of
00:21:42.240 similar lifetimes together we want
00:21:44.640 something that lets us allocate objects
00:21:46.320 at similar but different times but then
00:21:49.440 we can free them all at once so a really
00:21:51.679 simple example of this is uh if you have
00:21:53.520 a Super Mario World as you start world
00:21:56.559 one you'll go through some objects will
00:21:58.880 get allocated for the Goombas and the
00:22:00.559 shells and the level and the Mario
00:22:02.559 himself but when you go down the exit
00:22:04.320 pipe everything for that level can be
00:22:05.919 deleted you don't have to worry about is
00:22:07.360 it the first Goomba or the third one
00:22:09.280 everything can go at the same time
00:22:11.760 similarly in our RBS case we allocate a
00:22:14.080 new RBS parser for every single
00:22:16.159 signature that we parse and whether we
00:22:18.400 succeed or fail we could just clear the
00:22:20.320 whole thing at the very end and there's
00:22:21.919 nothing to worry about so that's an
00:22:24.480 example of how these two sort of
00:22:25.840 standard and custom allocators could
00:22:27.360 look
00:22:28.120 like on the left here you'll see these
00:22:30.320 calls to malloc where we ask it to
00:22:32.000 allocate enough memory for the size of
00:22:33.520 an integer a character and some foo_t
00:22:36.679 structure it will return back the
00:22:38.640 pointers that we wanted and we can call
00:22:40.240 some do_thing function you know a
00:22:41.919 proxy for the work that we want to do
00:22:43.280 with those objects at the end we'll free
00:22:45.440 each of them individually and so you see
00:22:47.440 those are quite a few function calls
00:22:48.799 internally they could do a fair bit
00:22:50.320 of work and so there's a performance
00:22:51.600 cost to that and it's easy to miss one
00:22:53.440 or get it wrong on the right we have
00:22:56.240 similar code that does almost the same
00:22:58.640 thing but it uses this custom allocator
00:23:00.159 that we call an RBS allocator we
00:23:02.480 initialize an allocator with a
00:23:04.240 predefined amount of memory that we know
00:23:06.000 is going to be enough in this case 4
00:23:07.880 kilobytes and when we call RBS alloc we
00:23:11.280 pass in the instance of the allocator
00:23:12.960 with which we want to allocate the
00:23:14.159 memory and again it will return pointers
00:23:16.880 to us that we use almost in the same way
00:23:18.400 as malloc we do something with them but
00:23:20.960 crucially when we finish with them we
00:23:22.640 don't free those pointers individually
00:23:24.400 instead we could just wipe away the
00:23:25.919 whole allocator in one shot and so you
00:23:28.159 can imagine you have an array of
00:23:29.200 thousands of objects and rather than
00:23:30.880 looping through the array and freeing
00:23:32.000 each one independently just drop the
00:23:33.840 whole thing so let's visualize what
00:23:36.320 happens inside one of these allocators
00:23:37.840 you'll hear the term arena allocation or
00:23:39.600 also perhaps slab allocation it's a
00:23:41.520 similar idea but it starts with getting
00:23:44.080 this big piece of memory and having a
00:23:46.080 pointer that just keeps track of where
00:23:47.440 your most recent free slot is every
00:23:50.559 time an allocation is requested
00:23:52.640 you're just going to put it where the
00:23:54.159 free spot was and move the free spot
00:23:55.760 along and so we might allocate some
00:23:57.600 objects A X B and they're all going to
00:24:00.960 live in here now for our allocator we
00:24:04.400 chose to make it really simple because
00:24:05.760 we know that RBS parsers are very
00:24:07.360 short-lived they're allocated and dropped
00:24:10.159 within like a millisecond so we don't
00:24:12.640 have to worry about doing any
00:24:13.679 complicated work about freeing objects
00:24:15.840 and finding holes to allocate them in so
00:24:17.919 if somebody stopped
00:24:19.760 using X we could just leave
00:24:23.039 it as it is that will be unused memory
00:24:24.880 it's going to be wasted for the lifetime
00:24:26.159 of the allocator but because this one
00:24:27.760 isn't going to live so long we don't
00:24:28.960 have to worry about it now when we
00:24:30.640 allocate C you see that we don't
00:24:32.320 actually fill C where X was we'll just
00:24:34.000 keep allocating at the end it's simpler
00:24:35.600 that way so we can keep on allocating
00:24:38.240 here we have some object Y maybe we
00:24:40.000 finish with Y here's our foo maybe we'll
00:24:43.120 have some other objects to allocate and
00:24:44.559 so
00:24:45.400 on now a cool thing here is that
00:24:47.840 this allocator lets us have objects that
00:24:50.480 point amongst each other in any which
00:24:52.159 way so like before our foo can own an A a
00:24:54.880 B and a C and these pointers can have
00:24:57.600 cycles they can be transitive like we
00:24:59.760 don't have to care the really cool part
00:25:01.600 is that when we're done with it we could
00:25:03.840 just drop the whole thing we don't have
00:25:06.000 to chase the graph make sure we don't
00:25:07.919 get caught in loops we just wipe away
00:25:09.799 everything there's a risk to it though
00:25:11.840 if I bring back that
00:25:14.520 memory I showed you here pointers that
00:25:17.039 are between objects within the
00:25:19.760 same arena but if you have objects from
00:25:22.159 one arena pointing to objects in another
00:25:23.760 arena or for example if you have local
00:25:25.760 variables or global variables pointing
00:25:27.200 into an arena you'll have these external
00:25:29.279 references from outside that point into
00:25:30.880 it and if you drop your arena at the
00:25:33.919 wrong time without cleaning those
00:25:37.240 up you get dangling pointers any attempt
00:25:40.240 to read or write from these memory
00:25:41.679 locations is going to blow up again in
00:25:44.240 the best case it's going to crash your
00:25:45.440 program immediately in the worst case
00:25:47.039 you're going to have really subtly
00:25:48.080 incorrect behavior it's going to be hard
00:25:49.440 to
00:25:50.360 debug now let's look at how this lets us
00:25:52.640 write the code that we wanted we have a
00:25:54.640 similar example to before where we have
00:25:56.960 all these early returns but within the
00:25:59.360 parse A B and C functions we'll note
00:26:01.600 that everything is allocated and owned by a
00:26:03.919 particular allocator for this
00:26:05.440 current parser and because that parser
00:26:08.240 owns all those objects and they'll all
00:26:10.400 be freed at the end whether we succeed
00:26:11.919 or we fail we don't have to worry about
00:26:13.760 any manual cleanups on the early return
00:26:15.520 paths here and so this is much easier to
00:26:18.799 get right there's nothing to do
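To make the arena idea concrete, here is a minimal sketch of a fixed-size bump allocator and the same three-step parse written against it. The names arena_t, arena_init, arena_alloc, arena_free, node_t, foo_t, parse_node and parse_foo are illustrative assumptions and are not the actual RBS allocator API. This version simply returns NULL when the block is full, where a production allocator would more likely grow.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical fixed-size bump arena; not the real RBS allocator.
typedef struct {
    unsigned char *base;  // start of the single backing block
    size_t capacity;      // total bytes in the block
    size_t used;          // offset of the next free byte
} arena_t;

bool arena_init(arena_t *arena, size_t capacity) {
    assert(arena != NULL && capacity > 0);
    arena->base = malloc(capacity);
    arena->capacity = arena->base != NULL ? capacity : 0;
    arena->used = 0;
    return arena->base != NULL;
}

// Hands out the next free bytes and moves the free offset along.
// Individual objects are never freed, so there are no holes to track.
void *arena_alloc(arena_t *arena, size_t size) {
    size_t align = alignof(max_align_t);
    size_t start = (arena->used + align - 1) / align * align;
    if (start > arena->capacity || size > arena->capacity - start) return NULL;
    arena->used = start + size;
    return arena->base + start;
}

// Releases every object in the arena with a single call.
void arena_free(arena_t *arena) {
    free(arena->base);
    arena->base = NULL;
    arena->capacity = 0;
    arena->used = 0;
}

typedef struct { int value; } node_t;
typedef struct { node_t *a, *b, *c; } foo_t;

// Stand-in for parse_a/parse_b/parse_c, allocating from the arena.
static node_t *parse_node(arena_t *arena, bool succeed, int value) {
    if (!succeed) return NULL;
    node_t *node = arena_alloc(arena, sizeof *node);
    if (node != NULL) node->value = value;
    return node;
}

// Early returns need no cleanup: the arena owns everything allocated so far.
foo_t *parse_foo(arena_t *arena, bool a_ok, bool b_ok, bool c_ok) {
    node_t *a = parse_node(arena, a_ok, 1);
    if (a == NULL) return NULL;
    node_t *b = parse_node(arena, b_ok, 2);
    if (b == NULL) return NULL;
    node_t *c = parse_node(arena, c_ok, 3);
    if (c == NULL) return NULL;
    foo_t *foo = arena_alloc(arena, sizeof *foo);
    if (foo == NULL) return NULL;
    foo->a = a; foo->b = b; foo->c = c;
    return foo;
}
```

A caller would call arena_init once per parse, run parse_foo, use the result, and then call arena_free whether parsing succeeded or failed. Any pointer into the arena that is kept after arena_free is dangling.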
00:26:21.440 there's another downside that I'll
00:26:22.760 mention which is that when you have
00:26:25.679 functions that need to allocate they
00:26:27.279 need an instance of the allocator
00:26:28.880 to allocate with and so here's some
00:26:30.720 example functions we have RBS hash new
00:26:32.960 RBS list new some function to unquote a
00:26:35.440 string and one to strip whitespace off a string
00:26:37.840 well for them to be able to allocate
00:26:39.400 internally we're going to have to add a
00:26:41.200 new parameter and plumbing these around
00:26:42.960 can be kind of annoying basically almost
00:26:44.640 every function in your system is going
00:26:45.840 to take one of these and almost every
00:26:47.520 function call is going to involve
00:26:48.559 passing it it's a little tedious but
00:26:51.200 there's a kind of a cool upside which is
00:26:52.720 that at a glance I can tell you that RBS
00:26:54.799 string strip whitespace doesn't
00:26:56.960 allocate I can tell you that it's not
00:26:58.960 going to return me a copied string with
00:27:01.039 the whitespace trimmed off it's
00:27:02.880 going to give me a string that points to
00:27:04.559 the same memory with the start and
00:27:07.360 the end chopped off without doing any
00:27:09.440 memory allocation or copying that's
00:27:11.440 pretty
00:27:13.159 awesome so in summary what's the big
00:27:15.600 deal about extracting a C library from a
00:27:17.600 Ruby gem why would I want to do it well
00:27:19.600 pure C libraries are just more portable
00:27:22.000 than Ruby gems that contain C
00:27:24.919 extensions because we're not calling
00:27:26.720 Ruby anymore we don't have to worry
00:27:28.080 about the global VM lock and we could
00:27:30.000 just use any kind of threading that we
00:27:31.440 want without being limited by
00:27:33.320 it I would recommend that you establish
00:27:35.440 a consistent pattern around error handling
00:27:37.360 early on in the process like I showed
00:27:39.360 it's going to mean that for example your
00:27:40.559 functions return booleans and that's going
00:27:42.400 to impact your API design and it's
00:27:44.080 pretty hard to retrofit that once you've
00:27:45.760 already built up a big system
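As a small illustration of how returning booleans shapes an API, the sketch below reports success through the return value and hands the result back through an out-parameter. Both parse_digit and parse_two_digits are toy functions invented for this example and have nothing to do with the RBS grammar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: parses one decimal digit, writing the value to *out.
// Returns false instead of raising when the input is not a digit.
static bool parse_digit(char ch, int *out) {
    if (ch < '0' || ch > '9') return false;
    *out = ch - '0';
    return true;
}

// Parses a two-digit prefix such as "42". Each failure is propagated by hand
// with an early return, and *out is only written on success.
bool parse_two_digits(const char *text, int *out) {
    assert(text != NULL && out != NULL);
    int tens, ones;
    if (!parse_digit(text[0], &tens)) return false;
    if (!parse_digit(text[1], &ones)) return false;
    *out = tens * 10 + ones;
    return true;
}
```

Because the return value is taken up by the success flag, every result has to travel through a pointer argument, which is the kind of signature change that is hard to retrofit later.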
00:27:48.080 similarly I suggest you think about
00:27:49.440 lifetimes really early in your design
00:27:51.039 and implementation because you're going
00:27:53.760 to need to pass around allocators and it
00:27:55.200 can be really frustrating to decide oh
00:27:57.120 actually I need an allocation here but
00:27:58.799 this function doesn't have an allocator
00:28:00.000 and the function that calls it doesn't
00:28:01.360 have an allocator and now you have to go
00:28:03.120 get your plumbing tools and start wiring
00:28:04.720 allocators through everything that could
00:28:06.320 be quite
00:28:07.320 difficult and again I recommend that you
00:28:09.919 embrace popular libraries for fast and
00:28:11.600 well-tested containers there's a bunch
00:28:13.279 of those for C so try and make use
00:28:16.480 of those where you can to learn more
00:28:18.720 about the benefits of RBS and what you
00:28:21.039 can do with it I would recommend you
00:28:22.640 join my colleague Alexandre tomorrow in
00:28:24.559 this same hall at 14:00 for his talk
00:28:27.120 inline RBS comments for seamless typing
00:28:29.120 with Sorbet in which you can learn more
00:28:31.200 about how Shopify is adopting RBS in our
00:28:33.520 code and our tooling and lastly I would
00:28:36.640 really like to thank Alexandre Stan Lo
00:28:38.960 and Soutaro for their work on RBS and for
00:28:41.200 the great success we've had so far thank
00:28:43.120 you so much that's all folks