From C extension to pure C: Migrating RBS

RubyKaigi 2025

#developer-experience

#static-analysis

#memory-management

Alexander Momchilov • April 17, 2025 • Matsuyama, Ehime, Japan • Talk

In his talk at RubyKaigi 2025, Alexander Momchilov discusses the migration of RBS (Ruby Signature) from a Ruby gem with a C extension to a pure C library. This transition aims to enhance performance, memory efficiency, and portability, making RBS usable with various Ruby tools such as Prism, Sorbet, JRuby, and TruffleRuby.

Key Points:

- Introduction to RBS: RBS is a standard notation for type annotations in Ruby, used for static type checking without being executed by the Ruby VM.

- Why Migrate to Pure C: The original RBS depended on Ruby VM features which limited portability and multi-threading capabilities, particularly due to the Global VM Lock (GVL). Moving to pure C allows multiple threads to work in parallel.

- Maintaining Developer Experience: Inline RBS, where type annotations are written as comments within Ruby files, was introduced to improve developer experience over separate RBS files. Keeping signatures next to the code they describe makes them easier to keep in sync as the Ruby code changes.

- Performance Benefits: The pure C implementation is faster and more memory-efficient. Because it no longer needs Ruby's GVL, multi-threaded tools such as Sorbet (a type checker written in C++) can parse multiple RBS files in parallel.

- Handling C Programming Challenges: Transitioning to pure C necessitates addressing error handling, memory management, and the use of C-standard libraries instead of Ruby objects or exceptions.

Specific Examples Discussed:

- Error Handling: C does not have built-in exception handling like Ruby. The solution involves returning boolean values to indicate success or failure, with error propagation handled manually.

- Memory Management: Switching from Ruby's garbage collection to manual memory management required implementing custom allocators that allow for simpler object lifecycle management and bulk freeing of objects.

- Conclusion and Takeaways: The presentation highlights the benefits of using a pure C model, including improved portability across different Ruby environments and greater control over performance. Developers are encouraged to adopt careful practices around error handling and memory management to maximize the performance benefits of using C libraries in Ruby.
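The error-handling approach described above (return a boolean, hand the result back through an out parameter, and bubble failures up by hand) can be sketched in C as follows. This is a minimal illustration, not the actual RBS source: the `parser` and `tuple_node` types, the helper names, and the toy grammar (comma-separated type names in square brackets, optionally followed by `?`) are all assumptions made for the example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cursor over the input text. */
typedef struct { const char *pos; } parser;

/* Hypothetical result node; a real AST node would carry much more. */
typedef struct {
    size_t length;   /* number of element types in the tuple */
    bool optional;   /* true when the tuple is followed by `?` */
} tuple_node;

static void skip_spaces(parser *p) {
    while (*p->pos == ' ') p->pos++;
}

/* Consumes one type name such as `String`; returns false if none is present. */
static bool parse_type_name(parser *p) {
    skip_spaces(p);
    if (!isalpha((unsigned char)*p->pos)) return false;
    while (isalnum((unsigned char)*p->pos)) p->pos++;
    skip_spaces(p);
    return true;
}

/* Returns true on success and stores the result in *out.
 * Returns false on a syntax error, where a C extension might have raised. */
static bool parse_tuple(parser *p, tuple_node *out) {
    skip_spaces(p);
    if (*p->pos != '[') return false;            /* not a tuple at all */
    p->pos++;

    size_t length = 0;
    for (;;) {
        if (!parse_type_name(p)) return false;   /* bubble the error up */
        length++;
        if (*p->pos != ',') break;
        p->pos++;                                /* consume ',' */
    }
    if (*p->pos != ']') return false;            /* unterminated tuple */
    p->pos++;

    out->length = length;
    out->optional = false;
    return true;
}

static bool parse_optional_tuple(parser *p, tuple_node *out) {
    /* Manual propagation: check the callee's result and return early. */
    if (!parse_tuple(p, out)) return false;
    if (*p->pos == '?') {
        p->pos++;
        out->optional = true;
    }
    return true;
}
```

A caller declares a `tuple_node` on its own stack, passes its address as the out parameter, and only reads it when the call returns `true`. Every layer between the failing function and the code that reports the error repeats the same check.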

Date: April 17, 2025
Published: May 27, 2025
Announced: unknown

Learn how we migrated RBS to remove its dependency on the Ruby VM and expose a new C API. In addition to being faster and more memory-efficient, it's now more portable: tools like Prism, Sorbet, JRuby and TruffleRuby will be able to use RBS directly. Type checkers like Steep and Sorbet will now be able to parse multiple RBS files in parallel, unconstrained by the GVL.

The Ruby VM offers many luxuries that can help ease C extension development, such as garbage collection, exceptions, and the many built-in data structures like `Array` and `Hash`. Unfortunately, to be maximally portable and multi-threaded, some C extensions like RBS and Prism will need to forego these conveniences. We'll show techniques for replicating them in pure C.
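As one possible illustration of replacing garbage collection in pure C, the sketch below uses a fixed-size bump ("arena") allocator, so that everything allocated during a parse is released with a single call, even when parsing stops early. It is a sketch under stated assumptions, not the allocator RBS ships: `arena`, `arena_alloc`, `parse_part`, `parse_foo`, and the one-digit-per-part grammar are hypothetical names and simplifications.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounding sizes up to this union's size keeps allocations aligned. */
typedef union { long long ll; long double ld; void *ptr; } max_align;

/* Hypothetical fixed-size arena: one malloc up front, one free at the end. */
typedef struct {
    unsigned char *base;
    size_t used;
    size_t capacity;
} arena;

static bool arena_init(arena *mem, size_t capacity) {
    mem->base = malloc(capacity);
    mem->used = 0;
    mem->capacity = mem->base ? capacity : 0;
    return mem->base != NULL;
}

/* Bump allocation: hand out the next free slice, or NULL when full. */
static void *arena_alloc(arena *mem, size_t size) {
    const size_t align = sizeof(max_align);
    if (mem->base == NULL || size > SIZE_MAX - (align - 1)) return NULL;
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > mem->capacity - mem->used) return NULL;
    void *ptr = mem->base + mem->used;
    mem->used += rounded;
    return ptr;
}

/* Releases every object allocated from the arena in one call. */
static void arena_free(arena *mem) {
    free(mem->base);
    mem->base = NULL;
    mem->used = 0;
    mem->capacity = 0;
}

typedef struct { int value; } node;        /* stand-in for A, B and C */
typedef struct { node *a, *b, *c; } foo;

/* Toy grammar: each part is a single decimal digit. */
static node *parse_part(arena *mem, const char **input) {
    if (**input < '0' || **input > '9') return NULL;   /* syntax error */
    node *n = arena_alloc(mem, sizeof *n);
    if (n == NULL) return NULL;
    n->value = **input - '0';
    (*input)++;
    return n;
}

static foo *parse_foo(arena *mem, const char *input) {
    node *a = parse_part(mem, &input);
    if (a == NULL) return NULL;
    node *b = parse_part(mem, &input);
    if (b == NULL) return NULL;   /* no free(a) needed: the arena owns it */
    node *c = parse_part(mem, &input);
    if (c == NULL) return NULL;

    foo *result = arena_alloc(mem, sizeof *result);
    if (result == NULL) return NULL;
    result->a = a;
    result->b = b;
    result->c = c;
    return result;
}
```

With this design the early returns need no cleanup code: a failed `parse_foo` leaves its partial results inside the arena, and the caller reclaims them together with everything else by calling `arena_free` once. The trade-off in this simplified version is the fixed capacity; a production allocator would more likely grow by chaining blocks.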

Join us to explore advanced techniques in writing C extensions and see how this universal RBS parser paves the way for improved tooling and collaboration in the Ruby ecosystem.

https://rubykaigi.org/2025/presentations/amomchilov.html

00:00:04.480 all right good afternoon everyone

00:00:06.160 welcome to my first ever

00:00:08.320 conference talk it's called from C

00:00:10.240 extension to pure C it's a story of how

00:00:12.639 my team and I migrated RBS from being a

00:00:15.280 Ruby gem with a C extension to a pure C

00:00:19.000 library let's jump into it first I'll

00:00:21.520 introduce myself my name is Alexander

00:00:23.039 Momchilov I'm a staff developer on the

00:00:25.119 Ruby Developer Experience team at

00:00:26.599 Shopify I flew in from Toronto Canada

00:00:30.080 and I'm actually one of three different

00:00:31.920 presentations that my team is giving

00:00:33.360 this year about type checking and static

00:00:35.800 analysis yesterday my coworker Vinicius

00:00:38.399 Stock presented Embracing Ruby Magic

00:00:40.559 statically analyzing uh DSLs and that

00:00:43.760 was yesterday so you can either jump in

00:00:45.200 your time machine go back and see it

00:00:47.120 then or you can just wait about a month

00:00:48.719 to see the recording there's this talk

00:00:51.440 now and tomorrow my coworker Alexandre

00:00:53.920 is presenting inline RBS comments for

00:00:56.320 seamless type checking with Sorbet and

00:00:58.160 that's in the same hall tomorrow at

00:01:00.120 14:00 and so yes there's actually two

00:01:02.559 different Alexes presenting RBS-related

00:01:04.960 talks just for my team alone and if your

00:01:07.680 name is also Alex we're hiring you can

00:01:09.760 join and make an RBS talk of your own no

00:01:12.880 just kidding any name can apply Shopify

00:01:15.040 is always looking for exceptional talent

00:01:16.640 so if you're interested please scan the

00:01:18.000 QR code I'll leave it up for a moment

00:01:19.439 longer

00:01:20.720 but let me get into what is RBS and what

00:01:22.720 are we going to talk about today so in

00:01:24.799 short RBS stands for Ruby

00:01:27.000 signature and it's a standard notation

00:01:29.119 for describing the types of various

00:01:30.960 constructs in your code so you can

00:01:32.799 annotate what you expect the types of

00:01:34.479 your local variables instance variables

00:01:36.640 method return values parameters and so

00:01:38.799 on to be at

00:01:40.439 runtime originally RBS was written in

00:01:43.200 separate RBS files these are files that

00:01:45.840 aren't executed by the Ruby virtual

00:01:47.439 machine they're not even seen by the

00:01:48.880 interpreter at all instead they sort

00:01:51.520 of exist in parallel in your same repo

00:01:53.920 for static analysis tooling to read not

00:01:56.000 for actual Ruby code to

00:01:57.799 execute inline RBS was a sort of

00:02:00.399 development that lets you write the

00:02:02.640 same RBS signatures that you had in your

00:02:04.399 RBS files but now inline in your RB

00:02:07.119 files in comments I'll show you

00:02:09.759 examples of both of these when your

00:02:11.680 programs have RBS signatures in them you

00:02:13.520 can use a type checker to automatically

00:02:15.040 validate the correctness of your program

00:02:16.720 and to make sure that all the types line

00:02:18.480 up if you're passing a string to some

00:02:20.319 function that expects an integer it will

00:02:21.920 tell you that something went wrong so

00:02:24.160 let me give an example of some RBS files

00:02:26.239 and how inline RBS improves upon that

00:02:28.879 so this is an example RBS file that

00:02:30.560 you'll note is called message.rbs like I

00:02:33.280 mentioned earlier the Ruby interpreter

00:02:34.720 will never see this code even though it

00:02:36.400 looks a lot like Ruby it's not actually

00:02:38.080 Ruby it just sort of follows the same

00:02:39.760 flavor you'll notice these methods don't

00:02:41.599 even have bodies they don't even have an

00:02:42.800 end keyword at the end instead it uses

00:02:45.599 this colon to denote the start of a type

00:02:48.239 annotation so we have some attribute

00:02:50.160 readers here for some attributes of the

00:02:51.519 Message class that have a type User

00:02:53.440 String optional Message and then

00:02:55.840 functions are annotated with this arrow

00:02:57.519 notation to say what arguments they take

00:02:59.200 and what return values they return now

00:03:02.159 you can imagine why this would be pretty

00:03:03.519 cumbersome to sort of keep in sync when

00:03:05.200 I rename my message class or perhaps I

00:03:07.120 call reply reply_to I would have to

00:03:09.760 rename it in both my real Ruby code and

00:03:11.840 then update the RBS as well and that's

00:03:13.920 just not a good developer

00:03:16.280 experience inline RBS improves upon this

00:03:18.879 by taking those same signatures and

00:03:20.560 stuffing them into comments the Ruby

00:03:22.800 virtual machine just thinks that these

00:03:24.080 are regular old comments but static

00:03:25.760 analysis tooling can look for the

00:03:27.200 special colon that comes after the

00:03:28.640 hashtag to know that this isn't any old

00:03:30.480 comment in English this is a comment

00:03:32.159 that's meant to be in RBS format and so

00:03:34.799 now this would be the one file that you

00:03:36.159 would have in your repository you don't

00:03:37.440 need a separate RBS file and you can

00:03:40.080 have your real method implementations

00:03:41.280 you see now there's uh method bodies

00:03:43.680 with end keywords and so on and you can

00:03:46.159 see it's the same syntax as

00:03:48.200 before so why were we interested in RBS

00:03:51.599 we don't actually use the main type

00:03:53.599 checker for RBS Steep at Shopify instead

00:03:56.959 we use Sorbet which has this really cute

00:03:59.439 cone mascot uh which is a static type

00:04:02.400 checker that's written in C++ so it's

00:04:04.239 blazingly fast it's also really highly

00:04:06.560 memory optimized so it scales to the

00:04:09.040 needs of our really huge

00:04:10.840 monolith Sorbet is able to type check

00:04:13.439 about 9 million lines of our main

00:04:14.959 monolith in under a minute it's actually

00:04:16.880 really impressive on the performance

00:04:18.600 front so our team liked the Sorbet type

00:04:21.199 checker but we wanted to replace the

00:04:22.479 Sorbet sig syntax the sig syntax is a

00:04:25.759 Ruby DSL that is how Sorbet code

00:04:28.240 expresses the types instead of these RBS

00:04:30.240 comments and that's all you could use up

00:04:32.320 until today

00:04:34.080 because it's a Ruby DSL it involves

00:04:36.639 actually calling methods that need to

00:04:38.080 exist at runtime and they're provided by

00:04:39.759 the sorbet-runtime gem this was a

00:04:42.639 big limitation if you were a library

00:04:44.320 author and you wanted to use Sorbet

00:04:46.000 signatures to improve the resilience and

00:04:47.600 robustness of your code you would not

00:04:49.840 only have to use sorbet-runtime in your

00:04:51.759 own development of the gem but you would

00:04:53.600 also force all the consumers of the gem

00:04:55.199 to also depend on sorbet-runtime

00:04:56.720 indirectly and that's just too tall of

00:04:58.960 an order it's something we wanted to fix

00:05:02.240 also we found that signature sigs in the

00:05:04.639 Sorbet syntax are really verbose and

00:05:06.639 clunky and let me show you an example of

00:05:10.120 that on the left here we have a point

00:05:12.639 class that is expressed uh that has its

00:05:14.960 types expressed using RBS comments and

00:05:17.039 on the right we have the exact same

00:05:18.639 material expressed using Sorbet sigs

00:05:21.199 you'll notice that you have all these

00:05:22.320 method calls to sig which are passed a

00:05:24.080 block and within that block various DSL

00:05:26.320 methods are called like params and

00:05:27.759 returns given various type objects as

00:05:29.680 parameters

00:05:31.039 the sig method has to actually be called

00:05:32.960 at runtime and that's where you see this

00:05:34.479 extend T::Sig module being um extended and

00:05:38.400 this has a runtime representation that

00:05:39.840 we would like to get rid of ideally so

00:05:42.720 what did we want to do well we wanted

00:05:44.880 the brilliant like Dragon Ball Z fusion

00:05:47.840 of RBS and Sorbet together we want to

00:05:50.720 have the sort of elegance and terseness

00:05:52.560 of the RBS syntax but we wanted to keep

00:05:54.560 Sorbet as the fast and performant

00:05:56.639 backend type checker that's going to

00:05:58.080 actually consume these uh comments

00:06:00.400 and do something with

00:06:02.759 them so that's exactly what we've done

00:06:06.319 here's a quick little demo where we have

00:06:07.680 a method called should return a string

00:06:09.840 it takes an integer it returns a string

00:06:11.840 as the Sorbet sig very verbosely

00:06:14.800 explains and you see we just call i.to_s

00:06:17.600 there's no complaints now if I change

00:06:20.000 this and I replace it just to be a

00:06:21.280 number you'll see immediately that the

00:06:22.479 Sorbet LSP jumps in and says like "Hey

00:06:24.080 we expected a string but you gave me a

00:06:25.759 number." I'll change it back to a string

00:06:27.840 and we're okay again change it to a

00:06:29.520 number oh my god we can't have a number

00:06:31.039 here it needs to be a

00:06:32.759 string now if I remove this Sorbet

00:06:35.600 signature and I replace it with an RBS

00:06:38.600 comment you'll see that when I change

00:06:40.479 the code

00:06:41.560 below we get the exact same type errors

00:06:43.919 as before

00:06:45.639 woohoo change it to a string it works

00:06:48.080 again the really great part is that we

00:06:50.319 don't need to extend T::Sig and we don't

00:06:52.400 even need to use the sorbet-runtime gem

00:06:54.240 we can remove the requires and we can

00:06:55.680 remove it from our Gemfile because

00:06:57.120 we're no longer calling methods in our

00:06:58.560 Ruby code anymore and there's an open

00:07:00.880 pull request to merge our work which

00:07:02.960 we're hoping to have landed really soon

00:07:04.880 we're already sort of experimenting

00:07:06.000 internally with it at

00:07:07.400 Shopify so what was so hard about this

00:07:09.919 what warrants making a conference talk

00:07:11.599 about the work that we've done here to

00:07:13.680 explain that I'm going to have to give a

00:07:15.360 really brief overview of the history of

00:07:17.199 RBS and how it was sort of

00:07:20.120 developed before 2022 Sorbet was a sorry

00:07:24.720 not Sorbet RBS was a pure Ruby gem

00:07:27.680 that had an internal parser that was

00:07:29.120 written in Ruby that was accessed via a

00:07:32.160 regular old Ruby API you would call the

00:07:33.759 parse uh method on RBS

00:07:36.199 parser in

00:07:38.280 2022 Soutaro rewrote the internal

00:07:41.199 implementation of this parser to use a C

00:07:43.599 extension so now you still have the same

00:07:45.360 Ruby API but now implemented with a C

00:07:47.520 extension in the background

00:07:49.759 uh of critical importance is that this

00:07:52.080 C code still makes very heavy use of

00:07:54.560 Ruby VM features from the C extension

00:07:56.560 API and that's something that we're

00:07:58.240 going to need to address and I'll get

00:07:59.360 into

00:08:00.120 why our work was to take this a step

00:08:02.800 further and make it so that this is a

00:08:04.879 pure C parser that does not use the Ruby

00:08:06.960 extension APIs at all so while the Ruby

00:08:10.639 VM still exists like it sorry the Ruby

00:08:12.720 API exists like it was before and Ruby

00:08:14.800 apps can call it we now have this new C

00:08:19.160 API so let's look at what this achieves

00:08:23.199 existing users of RBS such as Ruby LSP

00:08:26.000 which is the tool that powers the rich

00:08:27.520 editor integrations into your IDE

00:08:30.160 and type checkers like Steep can

00:08:32.080 continue to use RBS as if it were a

00:08:34.240 regular Ruby gem and they'll just call

00:08:35.839 regular method uh Ruby methods but what

00:08:38.880 we've unlocked is for tools that don't

00:08:40.560 want to necessarily call Ruby code to be

00:08:42.640 able to make use of RBS as a library and

00:08:44.959 so for example we'll have a new type

00:08:46.959 checker or a different type checker like

00:08:48.399 Sorbet which is written in C++ and even

00:08:51.760 alternative Ruby runtimes like JRuby

00:08:53.680 can make use of this and they can just

00:08:56.240 make calls to the C code and the JRuby

00:08:59.600 example is a particularly good one

00:09:01.680 where JRuby can handle Ruby gems that

00:09:04.640 just have Ruby code it can also make

00:09:06.320 foreign function interface calls into C

00:09:08.160 libraries but Ruby extensions uh sorry

00:09:11.120 Ruby uh gems with C extensions are this

00:09:13.680 middle ground that it can't support and

00:09:15.440 we fixed that for this context

00:09:19.360 so why was the Ruby dependency a problem

00:09:22.320 well the first problem was that using

00:09:24.160 any of these Ruby features of the Ruby

00:09:25.600 VM requires holding the global VM lock

00:09:28.240 when you do that you limit your CPU

00:09:30.080 parallelism to just a single thread

00:09:31.600 unless you start to incorporate Ractors

00:09:34.000 and this was going to be a really big

00:09:35.440 issue for us because one of the ways

00:09:37.120 that Sorbet is able to be so fast is

00:09:38.480 that it can parse all your files in

00:09:40.000 parallel on multiple threads within the

00:09:42.240 same process if we're using RBS as a

00:09:45.040 Ruby gem the GVL would just limit the

00:09:47.440 parallelism of that down to just a

00:09:49.120 single core doing real work at a

00:09:51.560 time another thing to note is that like

00:09:53.760 I mentioned earlier MRI's C extension

00:09:55.519 API is not portable and would preclude

00:09:57.920 alternative Ruby runtimes from being able

00:09:59.519 to use the API and lastly there's also a

00:10:02.480 peripheral benefit around performance

00:10:04.480 where with C code we have a bit more

00:10:05.920 control over the memory layout of things

00:10:07.680 and we can tune things to use less

00:10:09.440 memory and less CPU

00:10:12.320 so if there's all these benefits one

00:10:14.079 might ask you know why was RBS using

00:10:17.040 Ruby in the first place inside of its

00:10:19.120 rewritten C parser why didn't it just

00:10:22.079 use pure C code from the beginning well

00:10:24.320 the reason is that the Ruby VM is really

00:10:25.839 convenient it gives us a lot of really

00:10:27.360 great features that we're going to have

00:10:29.200 to give up and

00:10:30.440 replace the first of these is error

00:10:32.560 handling via exceptions C doesn't have

00:10:34.720 exceptions so we're just going to need

00:10:36.399 to do something else the next is memory

00:10:39.440 management so the Ruby VM provides a

00:10:41.920 dynamic memory allocator and a garbage

00:10:43.760 collector that will take care of freeing

00:10:45.680 memory for you when you're done using it

00:10:47.920 and so we're going to need to do

00:10:48.880 something about that and there's two

00:10:50.720 more points that I'll mention that I

00:10:51.760 won't elaborate on as much the first that

00:10:54.079 you might be familiar with is that the

00:10:55.600 Ruby standard library provides us with a

00:10:57.040 lot of really convenient classes and

00:10:58.399 data structures you can think of Array

00:11:00.000 Hash there's queues and so on so we're

00:11:02.480 going to need to find replacements for

00:11:04.079 those one suggestion that I'll make here

00:11:06.240 is that we should resist the temptation

00:11:08.240 to just hand roll all our own data

00:11:09.760 structures I would really recommend

00:11:11.360 trying to use well-tested high

00:11:13.200 performance container libraries um that

00:11:15.519 exist for C code and lastly I'll

00:11:18.160 mention that Ruby as an object-oriented

00:11:20.399 programming language gives us

00:11:21.839 polymorphism we can have methods that

00:11:23.519 are dynamically dispatched so that when

00:11:25.360 we send a message to an object the Ruby

00:11:27.440 VM will automatically figure out for us

00:11:29.200 what's the class of this object and

00:11:30.560 what's the correct implementation of

00:11:32.560 that message for that

00:11:34.360 object so let's dive into the first two

00:11:36.720 of those in detail the first Ruby

00:11:38.800 convenience is error

00:11:41.240 handling so idiomatic Ruby code handles

00:11:44.000 errors via exceptions and this is really

00:11:46.880 great when we raise an exception it's

00:11:48.560 like a kind of deep return like a return

00:11:50.880 statement we're going to stop the

00:11:52.240 execution of the current function but

00:11:54.240 unlike a return statement it has this

00:11:55.600 deep quality where we can actually jump

00:11:56.959 multiple layers of the call stack

00:11:59.600 we will interrupt the parent function

00:12:01.519 and the parent's parent function and so

00:12:03.200 on all the way up until we reach the

00:12:04.959 most recent rescue block it's like a

00:12:06.720 kind of teleportation that we can do the

00:12:09.200 Ruby C API exposes these with APIs like

00:12:11.680 rb_raise and rb_rescue so let's see an

00:12:14.079 example of how that might have been used

00:12:15.279 and what we'll do instead here we have a

00:12:18.800 fictional function called parse

00:12:20.720 optional tuple a tuple in RBS is just a

00:12:23.360 fixed length array of potentially

00:12:25.120 different types so for example you might

00:12:27.040 have a tuple of a string an integer and

00:12:28.639 a bool that could be packed together an

00:12:30.720 optional tuple is just a type that

00:12:32.320 describes either those things or nil so

00:12:34.480 you could have either value uh be passed

00:12:37.040 to a parameter of this type so we'll see

00:12:39.519 that to parse an optional tuple well

00:12:41.120 first we're going to parse a tuple and

00:12:42.800 then we're going to see is it followed

00:12:44.000 by a question mark if it is then we have

00:12:45.760 an optional tuple if it's not we have

00:12:48.240 just a regular tuple looking at the

00:12:50.720 implementation of parse tuple we'll see

00:12:52.480 that one of the first things we're going

00:12:53.519 to want to do is assert that we have an

00:12:55.920 opening square brace as the first

00:12:57.279 character if we don't then this isn't a

00:12:59.279 tuple and there's just some kind of

00:13:00.560 syntactic error and so this rb_raise over

00:13:03.279 here that we do that raises a syntax

00:13:04.800 error well that's a Ruby feature and we

00:13:06.720 won't be able to use that so how do we

00:13:09.920 throw exceptions without Ruby that's the

00:13:13.360 cool part uh you don't we're just going

00:13:15.360 to need to figure out something

00:13:16.079 completely different here so we know

00:13:18.399 that rb_raise has got to go we can't keep

00:13:20.800 that what we'll do instead is we'll

00:13:22.399 return false like I mentioned you could

00:13:24.560 think of raising exceptions as a kind of

00:13:26.399 deep return so we could just return a

00:13:28.880 bunch ourselves so we're going to return

00:13:31.600 false in this case and false means that

00:13:33.440 an operation failed likewise if we get

00:13:36.000 to the end of this function and we

00:13:37.360 succeed we're going to want to return

00:13:38.880 true and true says that the operation

00:13:41.079 succeeded but to do this to be able to

00:13:43.279 return booleans where previously we

00:13:44.639 would be returning these RBS nodes we're

00:13:47.040 going to have to change the signatures

00:13:49.279 which is going to be a little

00:13:50.000 inconvenient but we can go through

00:13:51.880 it so we'll take the RBS node pointer

00:13:54.880 return types and we're going to actually

00:13:56.560 move them into the parameter list as out

00:13:59.040 parameters and the actual return types

00:14:01.360 are going to be bool what this means is

00:14:03.199 that these functions are going to return

00:14:04.160 a value not by actually returning it but

00:14:06.240 by storing it into a location of your

00:14:07.839 choosing when we do this we'll no longer

00:14:10.560 be able to call parse tuple in the way

00:14:11.920 that we do near the top here because its

00:14:13.279 return value is just going to be a

00:14:14.160 boolean not the node that we're

00:14:15.839 looking for so we're going to move it

00:14:17.279 onto its own line and we're going to

00:14:19.279 pass to it the local variable into which

00:14:21.360 we want parse tuple to store its result

00:14:24.160 but now we have to ask the question did

00:14:25.680 this succeed and to know that we have to

00:14:28.160 check the boolean that was returned uh

00:14:30.000 when this function was invoked so we'll

00:14:32.560 wrap this in an if statement and we'll

00:14:34.720 say if not parse tuple so if the result

00:14:36.880 is false then we know that we have an

00:14:39.279 error condition and what do we want to do

00:14:40.720 in this case well we're going to return

00:14:41.920 false which is a sort of a form of

00:14:44.000 manually bubbling up an error so a key

00:14:47.519 note here is that error propagation is

00:14:49.760 manual and it's kind of inconvenient but

00:14:51.839 that's just what we're going to have to

00:14:52.800 do

00:14:53.480 here now to fill in a few more details

00:14:55.839 you'll notice that well we can only

00:14:57.760 return booleans from these functions so

00:14:59.360 this long return value here is going to

00:15:01.440 be an issue so what are we going to do

00:15:03.600 we're going to assign it to our

00:15:05.440 out parameter we're going to move over that

00:15:08.160 expression and we're going to return

00:15:09.680 true for success at the end

00:15:11.480 here and so that's an example of how the

00:15:13.839 error management in one of these

00:15:15.279 libraries might look like next I'm going

00:15:18.320 to talk about the second convenience

00:15:19.440 that Ruby provides us which is memory

00:15:20.880 management and I'll talk about it

00:15:22.720 through an example so suppose we have a

00:15:24.880 foo which is some kind of syntactic

00:15:26.560 construct

00:15:27.800 to parse a foo involves

00:15:30.240 parsing A then parsing B then parsing C

00:15:32.880 these are just all examples now if we

00:15:35.839 get midway through the process and we

00:15:37.279 encounter a syntax error there's no reason

00:15:39.519 for us to keep trying to parse the whole

00:15:41.120 rest of it we want to bail early so

00:15:43.360 let's add some early guard clauses that

00:15:45.279 will return if any one of these steps

00:15:47.880 fail and let's briefly walk through the

00:15:50.320 happy path of what would happen when we

00:15:52.160 call this code to do so I'll bring up a

00:15:54.639 little representation of a runtime stack

00:15:56.240 on the right here and we're going to

00:15:57.440 step through it like a debugger so when

00:15:59.600 we enter our function we have an

00:16:01.120 execution pointer that's going to look

00:16:02.240 at the start of our first line and we

00:16:04.720 have a call stack or a stack frame

00:16:06.079 allocated on our call stack which has

00:16:07.680 uninitialized space for

00:16:10.240 local variables called A B and

00:16:13.079 C when we call parse A internally it's

00:16:16.079 going to like parse our text it's going

00:16:17.759 to allocate some result object A it's

00:16:19.680 going to return the pointer to it and

00:16:21.519 we're going to assign that

00:16:22.800 pointer into our local variable A

00:16:25.040 similarly we'll call some function parse

00:16:26.959 B it's going to parse some stuff capture

00:16:29.120 that data in a B object and we'll point

00:16:30.800 to it lastly we'll do the same for an

00:16:32.800 object

00:16:33.720 C so all this parsing succeeded and what

00:16:36.639 we want to do is package up this data

00:16:38.600 together so at this point ownership over

00:16:41.600 these objects belongs to the stack

00:16:43.600 frame of our current function call but

00:16:46.000 when we create a new foo object

00:16:48.560 it's going to have pointers to those

00:16:49.920 three objects and when we

00:16:52.240 return the foo object out of our current

00:16:54.920 um function call our stack frame is

00:16:57.680 going to disappear and with it the

00:16:59.519 pointers that we had to those objects

00:17:02.079 this all works great whoever is going to

00:17:04.079 own the foo object after this function

00:17:05.600 call is going to be responsible for not

00:17:07.199 just freeing the foo but also freeing

00:17:09.600 its constituent A B and

00:17:11.400 C but now let's see how this can go

00:17:14.240 wrong if you have an error again we're

00:17:16.240 going to enter our function allocate a

00:17:17.919 stack frame and we're going to step

00:17:19.360 through it suppose parsing the A

00:17:22.000 succeeds we're going to get some A

00:17:23.360 object on the heap and we're going to

00:17:25.039 point to it with our A local variable

00:17:27.039 but now let's say that there

00:17:28.480 is a character that's out of place for

00:17:30.160 our B value so we don't even allocate

00:17:33.200 it because we just can't we don't know

00:17:35.840 what the syntax is that we're trying to

00:17:37.200 parse there it's going to return null

00:17:39.600 we're going to store that in a local

00:17:40.640 variable and on the next line on the

00:17:42.880 if statement we're going to decide to

00:17:44.080 return early at this point our stack

00:17:46.720 frame is destroyed the local variable A

00:17:49.200 stops existing and our object A is still

00:17:52.559 there we haven't freed it so this object

00:17:55.840 exists on the heap but there's no way

00:17:57.440 for us to access it anymore either

00:17:59.679 to use it or to get rid of it it's

00:18:01.440 almost like a helium balloon if you hold

00:18:03.039 onto the string of a helium balloon it's

00:18:04.640 yours you can use it you can have fun

00:18:06.480 and you can throw it out but if you let

00:18:07.919 go of the handle of your helium balloon

00:18:09.200 and you have no other strings attached

00:18:10.559 to it it's going to fly away and you

00:18:12.400 lose control over it this is what we

00:18:15.280 would call a memory leak and so A has

00:18:17.720 leaked now if you just leak one object

00:18:20.160 occasionally it's not great but it's not

00:18:22.400 the worst but it's interesting to note

00:18:24.320 how this works in the case of RBS so you

00:18:27.760 might just start writing some partial

00:18:29.679 RBS code because uh you know your code

00:18:32.640 isn't always syntactically valid as

00:18:34.160 you're typing it you're starting

00:18:35.520 on the left and you're typing character

00:18:36.799 by character and almost every time that

00:18:38.640 your IDE is going to parse this code it's

00:18:40.240 not going to be valid so we might start

00:18:41.840 with an empty comment then we might open

00:18:43.760 a brace but that just leaked

00:18:46.080 some objects and then we're going to say

00:18:47.280 well there's a parameter x and it's an

00:18:48.799 integer and we're leaking objects and

00:18:50.640 there's also y and oh god there's a lot

00:18:52.559 of objects and it's going to return void

00:18:54.960 and there's more objects and eventually

00:18:57.440 your IDE will get uh killed by the

00:18:59.520 operating system because you use too

00:19:00.720 much memory if your leaks are too bad we

00:19:03.600 don't want this so how do we fix it well

00:19:06.559 the manual way to do it is going to need

00:19:08.400 a bit more room so I'll move this code

00:19:09.919 over and I'll add some more lines for us

00:19:11.840 to use

00:19:13.280 and we are going to need to add these

00:19:14.880 free calls so let's see how they help if

00:19:17.679 we just look at the first segment of

00:19:18.799 this where we parse an A we don't have

00:19:21.919 anything to clean up if parsing A fails

00:19:24.640 then there's no other objects we need to

00:19:26.160 free so we could just return null

00:19:27.440 there's nothing to do here if parsing A

00:19:30.320 succeeded and we got to parsing B and

00:19:32.640 the B failed well when the B fails we've

00:19:35.280 already successfully allocated the A and

00:19:37.520 so we're going to need to free just the

00:19:39.039 A but not the B now supposing parse B

00:19:42.320 succeeded now there's the free A

00:19:44.320 supposing parse B succeeded and we moved

00:19:46.080 on to parsing the C well at this point

00:19:48.559 the A and the B were allocated but not

00:19:50.400 the C so we're going to need two free

00:19:51.799 statements and so what you can see here is

00:19:53.840 that the number of different free calls

00:19:55.360 that you have to add scales pretty

00:19:57.360 unfortunately with the number of exit

00:19:58.960 points and local variables that you have

00:20:00.799 in your function and because we use

00:20:02.880 early returns as a way of modeling

00:20:04.640 errors that introduces even more areas

00:20:06.880 where we early return and need to do

00:20:09.520 manual cleanup and so not only is this

00:20:11.440 quite unpleasant to program but it's

00:20:12.880 really error-prone and you risk either

00:20:14.720 forgetting to free a resource which

00:20:16.080 means you have memory leaks and

00:20:17.280 potentially eventually out of memory

00:20:19.080 crashes or potentially you double free

00:20:21.760 the same object you free it twice and

00:20:24.160 you have undefined behavior at best you

00:20:26.880 might have a crash immediately at worst

00:20:28.559 you might have subtly wrong

00:20:30.000 behavior in your library and it's going

00:20:31.440 to be really hard to

00:20:33.080 debug so to fix this let's talk about

00:20:35.360 memory allocators now the C standard

00:20:38.000 library provides a memory allocator that

00:20:39.600 you're probably quite familiar

00:20:41.039 with it's exposed through malloc and free

00:20:44.559 malloc has a deceptively appealing and

00:20:47.039 simple looking interface you give it a

00:20:49.679 number of bytes of memory that you want

00:20:51.360 and it returns back either null if

00:20:53.360 there's no memory or a pointer to a

00:20:55.600 chunk of memory with at least that many

00:20:57.280 bytes available for you to use you use

00:20:59.760 it for as long as you'd like and when

00:21:01.200 you're finished with that memory you

00:21:02.400 pass that same pointer to free and

00:21:04.080 you're done with it

00:21:06.320 now the problem here is that object

00:21:07.840 lifetimes are just too granular like you

00:21:09.679 saw in the previous example we have all

00:21:11.440 these sort of related allocated objects

00:21:13.520 that are sort of spawned around the same

00:21:14.960 time but we have to deal with them on

00:21:16.480 in this really fine-grained way which is

00:21:18.159 error-prone and it also has a performance

00:21:20.320 cost because we're making so many

00:21:21.760 separate calls to free and if there's

00:21:24.000 one takeaway that I can drive home here is

00:21:25.520 that even though malloc and free are

00:21:27.280 universal across the C standard library

00:21:29.760 and they're so approachable they actually

00:21:31.919 make it really difficult to write

00:21:34.400 leak-free and robust code so

00:21:38.559 the core idea that we want instead is to

00:21:40.240 think what if we group objects of

00:21:42.240 similar lifetimes together we want

00:21:44.640 something that lets us allocate objects

00:21:46.320 at similar but different times but then

00:21:49.440 we can free them all at once so a really

00:21:51.679 simple example of this is if you have

00:21:53.520 Super Mario World as you start world

00:21:56.559 one you'll go through and some objects will

00:21:58.880 get allocated for the Goombas and the

00:22:00.559 shells and the level and Mario

00:22:02.559 himself but when you go down the exit

00:22:04.320 pipe everything for that level can be

00:22:05.919 deleted you don't have to worry about is

00:22:07.360 it the first Goomba or the third one

00:22:09.280 everything can go at the same time

00:22:11.760 similarly in our RBS case we allocate a

00:22:14.080 new RBS parser for every single

00:22:16.159 signature that we parse and whether we

00:22:18.400 succeed or fail we could just clear the

00:22:20.320 whole thing at the very end and there's

00:22:21.919 nothing to worry about so that's an

00:22:24.480 example of how these two sort of

00:22:25.840 standard and custom allocators could

00:22:27.360 look

00:22:28.120 like on the left here you'll see these

00:22:30.320 calls to malloc where we ask it to

00:22:32.000 allocate enough memory for the size of

00:22:33.520 an integer a character and some foo_t

00:22:36.679 structure it will return back the

00:22:38.640 pointers that we wanted and we can call

00:22:40.240 some do thing function you know the

00:22:41.919 proxy for the work that we want to do

00:22:43.280 with those objects at the end we'll free

00:22:45.440 each of them individually and so you see

00:22:47.440 those are quite a few function calls

00:22:48.799 internally they could do a fair bit

00:22:50.320 of work and so there's a performance

00:22:51.600 cost to that and it's easy to miss one

00:22:53.440 or get it wrong on the right we have

00:22:56.240 similar code that does almost the same

00:22:58.640 thing but it uses this custom allocator

00:23:00.159 that we call an RBS allocator we

00:23:02.480 initialize an allocator with a

00:23:04.240 predefined amount of memory that we know

00:23:06.000 is going to be enough in this case 4

00:23:07.880 kilobytes and when we call RBS alloc we

00:23:11.280 pass in the instance of the allocator

00:23:12.960 with which we want to allocate the

00:23:14.159 memory and again it will return pointers

00:23:16.880 to us that we use almost in the same way

00:23:18.400 as malloc we do something with them but

00:23:20.960 crucially when we finish with them we

00:23:22.640 don't free those pointers individually

00:23:24.400 instead we could just wipe away the

00:23:25.919 whole allocator in one shot and so you

00:23:28.159 can imagine you have an array of

00:23:29.200 thousands of objects and rather than

00:23:30.880 looping through the array and freeing

00:23:32.000 each one independently just drop the

00:23:33.840 whole thing so let's visualize what

00:23:36.320 happens inside one of these allocators

00:23:37.840 you'll hear the term arena allocation or

00:23:39.600 also perhaps slab allocation it's a

00:23:41.520 similar idea but it starts with getting

00:23:44.080 this big piece of memory and having a

00:23:46.080 pointer that just keeps track of where

00:23:47.440 your most recent free slot is every

00:23:50.559 time an allocation is requested

00:23:52.640 you're just going to put it where the

00:23:54.159 free spot was and move the free spot

00:23:55.760 along and so we might allocate some

00:23:57.600 objects A X B and they're all going to

00:24:00.960 live in here now for our allocator we

00:24:04.400 chose to make it really simple because

00:24:05.760 we know that RBS parsers are very

00:24:07.360 short-lived they're allocated and dropped

00:24:10.159 within like a millisecond so we don't

00:24:12.640 have to worry about doing any

00:24:13.679 complicated work about freeing objects

00:24:15.840 and finding holes to allocate them in so

00:24:17.919 if somebody finished with and stopped

00:24:19.760 using X we could just leave

00:24:23.039 it as it is that will be unused memory

00:24:24.880 it's going to be wasted for the lifetime

00:24:26.159 of the allocator but because this one

00:24:27.760 isn't going to live so long we don't

00:24:28.960 have to worry about it now when we

00:24:30.640 allocate C you see that we don't

00:24:32.320 actually fill C where X was we'll just

00:24:34.000 keep allocating at the end it's simpler

00:24:35.600 that way so we can keep on allocating

00:24:38.240 here we have some object Y maybe we

00:24:40.000 finish with Y here's our foo maybe we'll

00:24:43.120 have some other objects to allocate and

00:24:44.559 so

00:24:45.400 on now a cool thing here is that

00:24:47.840 this allocator lets us have objects that

00:24:50.480 point amongst each other in any which

00:24:52.159 way so like before our foo can own an A a

00:24:54.880 B and a C and these pointers can have

00:24:57.600 cycles they can be transitive like we

00:24:59.760 don't have to care the really cool part

00:25:01.600 is that when we're done with it we could

00:25:03.840 just drop the whole thing we don't have

00:25:06.000 to chase the graph make sure we don't

00:25:07.919 get caught in loops we just wipe away

00:25:09.799 everything there's a risk to it though

00:25:11.840 if I bring back that

00:25:14.520 memory I showed you here pointers that

00:25:17.039 are between objects within the

00:25:19.760 same arena but if you have objects from

00:25:22.159 one arena pointing to objects in another

00:25:23.760 arena or for example if you have local

00:25:25.760 variables or global variables pointing

00:25:27.200 into an arena you'll have these external

00:25:29.279 references from outside that point into

00:25:30.880 it and if you drop your arena at the

00:25:33.919 wrong time without cleaning those

00:25:37.240 up you get dangling pointers any attempt

00:25:40.240 to read or write from these memory

00:25:41.679 locations is going to blow up again in

00:25:44.240 the best case it's going to crash your

00:25:45.440 program immediately in the worst case

00:25:47.039 you're going to have really subtly

00:25:48.080 incorrect behavior it's going to be hard

00:25:49.440 to

00:25:50.360 debug now let's look at how this lets us

00:25:52.640 write the code that we wanted we have a

00:25:54.640 similar example to before where we have

00:25:56.960 all these early returns but within the

00:25:59.360 parse A B and C functions we'll note

00:26:01.600 that everything is allocated owned by a

00:26:03.919 particular allocator for this

00:26:05.440 current parser and because that parser

00:26:08.240 owns all those objects and they'll all

00:26:10.400 be freed at the end whether we succeed

00:26:11.919 or we fail we don't have to worry about

00:26:13.760 any manual cleanups on the early return

00:26:15.520 paths here and so this is much easier to

00:26:18.799 get right there's nothing to do
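The arena idea described above (one big block, a pointer to the next free slot, and a single call that drops everything) can be sketched as follows. This is a simplified illustration under stated assumptions, not the real RBS allocator: `arena_t`, `arena_init`, `arena_alloc`, `arena_free`, `node_t`, `foo_t`, `parse_node` and `parse_foo` are all hypothetical names, and unlike a production allocator this one never grows past its initial capacity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump ("arena") allocator; requires C11 for _Alignof. */
typedef struct {
    unsigned char *base; /* start of the one big block */
    size_t capacity;     /* total bytes in the block */
    size_t used;         /* offset of the next free byte */
} arena_t;

static bool arena_init(arena_t *arena, size_t capacity) {
    arena->base = malloc(capacity);
    arena->capacity = (arena->base != NULL) ? capacity : 0;
    arena->used = 0;
    return arena->base != NULL;
}

/* Hand out the next `size` bytes and move the free offset along.
   Freed or abandoned objects are never reused; holes are simply wasted. */
static void *arena_alloc(arena_t *arena, size_t size) {
    const size_t align = _Alignof(max_align_t);
    size_t start = (arena->used + align - 1) / align * align;
    if (start > arena->capacity || size > arena->capacity - start) return NULL;
    arena->used = start + size;
    return arena->base + start;
}

/* One call releases every object ever allocated from this arena. */
static void arena_free(arena_t *arena) {
    free(arena->base);
    arena->base = NULL;
    arena->capacity = 0;
    arena->used = 0;
}

typedef struct { int value; } node_t;
typedef struct { node_t *a, *b, *c; } foo_t;

/* `should_fail` simulates a syntax error in a sub-parser. */
static node_t *parse_node(arena_t *arena, int value, bool should_fail) {
    if (should_fail) return NULL;
    node_t *n = arena_alloc(arena, sizeof *n);
    if (n != NULL) n->value = value;
    return n;
}

/* Early returns need no cleanup: the arena owns A, B and C either way. */
static foo_t *parse_foo(arena_t *arena, bool fail_a, bool fail_b, bool fail_c) {
    node_t *a = parse_node(arena, 1, fail_a);
    if (a == NULL) return NULL;
    node_t *b = parse_node(arena, 2, fail_b);
    if (b == NULL) return NULL;
    node_t *c = parse_node(arena, 3, fail_c);
    if (c == NULL) return NULL;
    foo_t *foo = arena_alloc(arena, sizeof *foo);
    if (foo == NULL) return NULL;
    foo->a = a;
    foo->b = b;
    foo->c = c;
    return foo;
}
```

A caller would create one arena per parse, call `parse_foo`, use the result, and then call `arena_free` once whether parsing succeeded or failed. Any pointer into the arena that is kept after `arena_free` is dangling.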

00:26:21.440 there's another downside that I'll

00:26:22.760 mention which is that when you have

00:26:25.679 functions that need to allocate they

00:26:27.279 need an instance of the allocator to

00:26:28.880 allocate with and so here's some

00:26:30.720 example functions we have an RBS hash new

00:26:32.960 RBS list new some function to unquote a

00:26:35.440 string and to strip whitespace off a string

00:26:37.840 well for them to be able to allocate

00:26:39.400 internally we're going to have to add a

00:26:41.200 new parameter and plumbing these around

00:26:42.960 can be kind of annoying basically almost

00:26:44.640 every function in your system is going

00:26:45.840 to take one of these and almost every

00:26:47.520 function call is going to involve

00:26:48.559 passing it it's a little tedious but

00:26:51.200 there's a kind of a cool upside which is

00:26:52.720 that at a glance I can tell you that RBS

00:26:54.799 string strip whitespace doesn't

00:26:56.960 allocate I can tell you that it's not

00:26:58.960 going to return me a copied string with

00:27:01.039 the whitespace trimmed off it's

00:27:02.880 going to give me a string that points to

00:27:04.559 the same memory with the start and

00:27:07.360 the end chopped off without doing any

00:27:09.440 memory allocation or copying that's

00:27:11.440 pretty

00:27:13.159 awesome so in summary what's the big

00:27:15.600 deal about extracting a C library from a

00:27:17.600 Ruby gem why would I want to do it well

00:27:19.600 pure C libraries are just more portable

00:27:22.000 than Ruby gems that contain C

00:27:24.919 extensions because we're not calling

00:27:26.720 Ruby anymore we don't have to worry

00:27:28.080 about the global VM lock and we could

00:27:30.000 just use any kind of threading that we

00:27:31.440 want without being limited by

00:27:33.320 it I would recommend that you establish

00:27:35.440 a consistent pattern around error handling

00:27:37.360 early on in the process like I showed

00:27:39.360 it's going to mean that for example your

00:27:40.559 functions return booleans and that's going

00:27:42.400 to impact your API design and it's

00:27:44.080 pretty hard to retrofit that once you've

00:27:45.760 already built up a big system
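The boolean-return convention mentioned above might look like the following sketch. It is illustrative only: `parse_digits` and `parse_pair` are hypothetical functions, not part of RBS, and integer overflow is deliberately ignored to keep the example short. The result travels through an out-parameter, the return value only signals success or failure, and each caller propagates failure by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parser step: reads decimal digits from `input` into `*out`.
   Returns true on success. On failure it returns false and leaves `*out`
   untouched. Overflow is not handled in this sketch. */
static bool parse_digits(const char *input, int *out) {
    if (input == NULL || *input == '\0') return false;
    int value = 0;
    for (const char *p = input; *p != '\0'; p++) {
        if (*p < '0' || *p > '9') return false;
        value = value * 10 + (*p - '0');
    }
    *out = value;
    return true;
}

/* With no exceptions, every caller checks and forwards the failure itself. */
static bool parse_pair(const char *first, const char *second, int *sum) {
    int x = 0;
    int y = 0;
    if (!parse_digits(first, &x)) return false;
    if (!parse_digits(second, &y)) return false;
    *sum = x + y;
    return true;
}
```

Because the return slot is taken by the success flag, every function signature changes shape, which is why the convention is much easier to adopt at the start than to retrofit.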

00:27:48.080 similarly I suggest you think about

00:27:49.440 lifetimes really early in your design

00:27:51.039 and implementation because you're going

00:27:53.760 to need to pass around allocators and it

00:27:55.200 can be really frustrating to decide oh

00:27:57.120 actually I need an allocation here but

00:27:58.799 this function doesn't have an allocator

00:28:00.000 and the function that calls it doesn't

00:28:01.360 have an allocator and now you have to go

00:28:03.120 get your plumbing tools and start wiring

00:28:04.720 allocators through everything that could

00:28:06.320 be quite

00:28:07.320 difficult and again I recommend that you

00:28:09.919 embrace popular libraries for fast and

00:28:11.600 well-tested containers there's a bunch

00:28:13.279 of those for C and try and make use

00:28:16.480 of those where you can to learn more

00:28:18.720 about the benefits of RBS and what you

00:28:21.039 can do with it I would recommend you

00:28:22.640 join my colleague Alexandre tomorrow in

00:28:24.559 this same hall at 14:00 for his talk

00:28:27.120 Inline RBS comments for seamless type checking

00:28:29.120 with Sorbet in which you can learn more

00:28:31.200 about how Shopify is adopting RBS in our

00:28:33.520 code and in our tooling and lastly I would

00:28:36.640 really like to thank Alexandre Stan Lo

00:28:38.960 and Soutaro for their work in RBS and for

00:28:41.200 the great success we've had so far thank

00:28:43.120 you so much that's all folks

Alexander Momchilov

@amomchilov
