
Optimizing JRuby 10

Charles Nutter • April 18, 2025 • Matsuyama, Ehime, Japan • Talk

In the talk "Optimizing JRuby 10" presented by Charles Nutter at RubyKaigi 2025, the speaker discusses the significant enhancements made in JRuby 10, which now supports Ruby 3.4 and Rails 8 compatibility. After years of feature catch-up, this release focuses on optimizations tailored to improve performance in JRuby applications.

Key points discussed include:

- JRuby 10 Release: Announcement of the JRuby 10 release, highlighting the achievement of Ruby 3.4 compatibility and the requirement for Java 21.

- Performance Optimizations: Several performance enhancements were made, particularly in startup time, warm-up time, and overall application performance across multiple cores.

- Startup Time Improvements: Techniques such as Application Class Data Sharing (AppCDS) and ahead-of-time machine code caching (Project Leyden) were introduced to accelerate application startup times.

- Benchmarking Performance: The speaker compares the performance of JRuby against CRuby, showcasing JRuby's ability to run benchmarks faster, particularly for object allocation, `Data` objects, and JSON parsing tasks.

- Concurrency Enhancements: The introduction of Project Loom and its lightweight virtual threads improves JRuby's handling of concurrency, enabling more efficient operation than traditional native threads and than Ractors in CRuby.

- Future Development: Charles emphasizes the need for community involvement and contributions to address ongoing challenges like optimizing keyword arguments and integrating future JDK features into JRuby.

He concludes by inviting developers to try JRuby, offer feedback, and collaborate on optimizing their Ruby applications for better performance in enterprise settings. The goal is for JRuby to support high-performance Ruby applications across diverse environments, from Android to enterprise-level software.

Overall, the talk reveals the substantial advancements in JRuby's compatibility and performance, positioning it as a robust alternative for Ruby developers seeking Java platform capabilities.

Optimizing JRuby 10
Charles Nutter • Matsuyama, Ehime, Japan • Talk

Date: April 18, 2025
Published: May 27, 2025

JRuby 10 is out now with Ruby 3.4 and Rails 8 compatibility! After years of catching up on features, we've finally been able to spend time on long-delayed optimizations. This talk will show some of the best examples, including real-world application performance, and teach you how to find and fix performance problems in your JRuby applications.

https://rubykaigi.org/2025/presentations/headius

RubyKaigi 2025

00:00:08.840 okay i'm excited that you're all here uh
00:00:11.519 i'm going to get right into it we got a
00:00:12.880 lot of content to get to uh talk about j
00:00:15.440 ruby 10 and some of the cool
00:00:16.960 optimizations that we're working on
00:00:19.560 uh thank you first of all uh for my
00:00:23.359 friend mufan i want to promote the ruby
conf taiwan 2025 combined with COSCUP
the open source conference in taiwan
00:00:32.719 uh so if you want to know more about it
00:00:34.480 just find this guy he will he will tell
00:00:37.520 you everything about it and hopefully
00:00:38.960 i'll be able to see you
00:00:40.600 there okay ah konichiwa i'm very excited
00:00:44.480 to be back here and present j ruby 10
00:00:46.640 today uh there's my basic contact
00:00:48.960 information i have been developing uh
00:00:51.440 and maintaining j ruby for 20 years now
00:00:54.399 uh trying to bring the best of the java
00:00:56.960 platform to the ruby world uh and i it
00:01:00.399 says unemployed here last year uh i was
00:01:03.600 working for red hat and they were
00:01:05.040 sponsoring the project but after 12
00:01:06.960 years they decided to move on from that
00:01:09.680 uh so really right now i just work for
00:01:12.880 the ruby community i am continuing to do
00:01:15.439 j ruby development full-time uh funding
00:01:18.560 the project through sponsorships from
00:01:20.799 users and through support contracts for
00:01:24.159 folks that have critical production
00:01:26.400 applications running on j ruby uh and so
00:01:29.280 far we are managing to maintain
00:01:31.600 development uh maintain me as one
00:01:33.759 developer uh but looking for new
00:01:36.079 opportunities to partner with companies
00:01:38.079 out there so please uh let me know if
00:01:41.360 you think you can help the project or if
00:01:43.119 you need uh a little extra help uh
00:01:45.360 deploying or profiling code so the big
00:01:48.799 announcement this week uh we finally
00:01:50.720 released j ruby 10 this is our biggest
00:01:53.360 release in many many years uh we have
00:01:56.000 jumped to ruby 3.4 compatibility uh
00:01:59.280 running lots of new tests and specs
00:02:01.600 thousands of new assertions we've done a
00:02:03.920 lot of work to try and make sure
00:02:05.200 compatibility is solid we also now
00:02:07.840 require the minimum of uh java 21
00:02:11.120 because of all the cool new features in
00:02:12.879 the jvm that we want to start taking
00:02:14.480 advantage of that we when we run on an
00:02:16.800 old version it was just too hard for us
00:02:18.400 to utilize them uh so lots of additional
00:02:21.200 performance work coming up this year so
00:02:24.000 the talk is optimizing j ruby uh but
00:02:26.800 that means a lot of different things and
00:02:29.040 different things to different people of
00:02:30.800 course as rubists uh we know ruby is
00:02:33.280 optimized for developer happiness and j
00:02:36.400 ruby is just another ruby we want it to
00:02:38.640 feel and work just like the ruby you use
00:02:41.440 every day the standard c ruby uh but
00:02:44.480 there's other ways that we can optimize
00:02:47.040 uh j ruby in the community we can
00:02:49.360 optimize by giving you more
00:02:50.959 opportunities to use ruby in more places
00:02:53.760 we have compatibility work that we
00:02:56.080 continue to do and try to speed up our
00:02:58.160 our adoption of new features uh we've
00:03:00.959 been working hard on improving startup
00:03:02.959 and warm-up time which are traditionally
00:03:04.959 hard things on the jvm uh we've also
00:03:07.599 worked on various kinds of direct
00:03:09.200 performance straight line performance
00:03:10.879 multi-core and i'm going to go through
00:03:12.640 each of these areas a little bit and
00:03:14.480 show you what we're doing in j ruby so
00:03:17.040 compatibility wise uh tracking ruby
00:03:21.120 updates ruby versions is a very
00:03:23.480 challenging project it's the number one
00:03:26.159 thing that we have to work on all the
00:03:27.840 time maintaining compatibility fixing
00:03:30.480 compatibility issues with regular ruby
00:03:32.879 uh the big leap that we've made with j
00:03:35.040 ruby 10 to ruby 3.4 means that we're
00:03:38.239 finally caught up again uh if we look at
00:03:40.720 what this is on a a chart here uh c ruby
00:03:44.080 versions are the blue line there
00:03:46.480 increasing pretty much linearly except
00:03:48.640 for example the jump from 2.7 to 3.0 and
00:03:52.000 you can see j ruby varies a little bit
00:03:54.159 we fall a little bit behind we miss a
00:03:56.400 release or two and then we catch up
00:03:58.480 again and then we fall a little bit
00:04:00.159 behind and we catch up again that's
because of the work required to
00:04:05.040 maintain compatibility with each release
00:04:08.080 uh means we won't be able to do
00:04:09.680 optimization we won't be able to support
00:04:11.439 users so sometimes we have to balance
00:04:13.760 those things another way to look at this
00:04:16.799 uh how many days of lag time it's taken
00:04:19.440 j ruby versions to catch up with c ruby
00:04:22.079 versions uh so here are along the bottom
00:04:24.720 the versions of ruby that j ruby has
00:04:26.800 supported 2.2 2.3 2.5 2.6 3.1 and now
00:04:32.759 3.4 and you can see the 3.4 block is the
00:04:36.479 smallest one this is the closest we have
00:04:39.040 ever been to being right on schedule
00:04:41.600 with supporting the current version of
00:04:43.520 ruby and actually we probably could have
00:04:46.000 released j ruby 10 in december but we
00:04:48.880 wanted to complete some stabilization
00:04:51.919 make sure there were as many tests
00:04:53.280 running green as possible so we're very
00:04:55.759 excited to be caught up with c ruby
00:04:58.280 again uh and this work is all done by
00:05:01.520 the j ruby team of course with some
00:05:03.680 external contributors uh if we look at
00:05:06.479 contributors to c ruby who have done
00:05:08.720 more than 20 commits in the last year we
00:05:11.360 have a long list of developers of course
patch monster nobu is right up at the top
00:05:16.800 as always uh but lots of other folks
00:05:19.360 from shopify and from other companies
00:05:21.919 all contributing to c ruby on the j ruby
00:05:24.960 side it's a little different it's
00:05:27.759 largely me and then tom does a lot of
00:05:30.479 work on the parser he does a lot of work
00:05:32.400 with users and supporting issues uh and
00:05:35.039 actually the number three committer on
00:05:36.800 the j ruby project is my son who's in
00:05:39.280 the audience who did uh all of the work
00:05:41.680 on the new j ruby launcher and has
00:05:43.759 helped us out with some of the c and
00:05:45.520 native code that we have for j ruby as
00:05:47.759 well obviously we need more help here
00:05:51.280 more folks that can try things out help
00:05:53.440 us implement new features you can
00:05:55.199 implement them in ruby if you want we're
00:05:56.960 fine with that uh but keeping up with
00:05:59.039 ruby is a big challenge for us because
00:06:01.120 it's not a very big team so what about
00:06:04.240 rails 8 compatibility well this is
00:06:06.560 complicated and i've answered this
00:06:08.080 question multiple times this week uh in
00:06:11.520 general everything that's pure ruby in
00:06:13.759 rails we expect to work from version to
00:06:16.240 version uh so rails 8 requires ruby 3.2
00:06:19.919 now we have j ruby 10 which supports
00:06:22.639 ruby 3.4 everything ruby based in rails
00:06:26.160 ought to work just fine on j ruby uh now
00:06:28.479 i say that most things work because then
00:06:30.639 we have something like active record
00:06:32.639 almost all of the work we have to do to
00:06:35.520 be compatible with rails is updating our
00:06:38.400 version of active record there's a lot
00:06:40.479 of special code specific to all the
00:06:42.880 database adapters there are database
00:06:45.680 specific things that we need to
00:06:47.120 translate over to the java database api
00:06:50.479 this is where most of the lag comes from
00:06:53.039 uh so i don't know how many people are
00:06:54.960 running rails and using other
00:06:56.479 persistence layers but active record is
00:06:59.120 pretty much the only thing that we ever
00:07:00.639 have to work on to keep up with
00:07:02.240 compatibility in rails and again we
00:07:04.720 would love to have some help here folks
00:07:06.560 that are familiar with the database
00:07:08.080 adapters in active record can help us
00:07:10.639 keep up on j ruby
00:07:12.520 side so startup this is one of the first
00:07:15.520 things that people run into with j ruby
00:07:18.000 uh and first impressions are important
00:07:20.000 we want j ruby to feel as close to c
ruby as a development experience as
00:07:25.880 possible unfortunately there's a lot of
00:07:28.479 work that has to be done to get j ruby
00:07:30.560 up and running all of our code starts
out as java code or ruby source files all
00:07:37.680 that code has to be parsed in loaded by
00:07:39.759 the jvm uh the jvm has to run it for a
00:07:43.120 while and then it will optimize it to
00:07:44.720 native code and that whole process takes
00:07:47.360 a while just to get things up and
00:07:49.720 running so we've been exploring some of
00:07:52.240 the new features coming along in the jdk
00:07:54.960 to help speed up our startup time and
00:07:57.759 later on it will also help with warm-up
00:07:59.879 time uh a little graphic example there's
00:08:03.280 a lot of steps to j ruby uh our design
00:08:06.720 of internal ir our intermediate
00:08:09.599 representation is very similar to what
we have in ZJIT the jit that's coming
00:08:14.720 for c ruby and 3.5 and beyond and the
00:08:18.160 jvm itself actually has the same sort of
00:08:21.199 compiler design so our code runs for a
00:08:24.560 while in our interpreter we feed it into
00:08:26.800 the jvm the jvm runs it for a while and
00:08:29.919 then we eventually get very fast native
00:08:32.080 code out the other end and so a lot of
00:08:34.640 the work we do is trying to shorten this
00:08:36.959 process as much as possible so on the
00:08:39.680 jvm side the exciting projects that
00:08:41.760 we've been looking at lately first off
uh application class data sharing called
AppCDS uh this is a feature that came
00:08:49.200 along around java 17 or so
00:08:52.399 it allows the jvm to preload and
00:08:55.399 pre-cache all of that parsed code all of
00:08:58.560 the metadata the methods the class data
00:09:01.360 all of that so that we can jump right in
00:09:03.920 and not have to reparse all that stuff
00:09:06.000 every time when we start up uh this is
00:09:08.320 the primary way that we've been getting
00:09:10.000 some improvements lately uh coming in
00:09:12.399 the future and available as partial as a
00:09:15.200 preview in java 24 is uh machine code
00:09:18.640 caching that's a project called project
Leyden uh and they have an aot cache
feature so as Maxime mentioned in her
00:09:26.320 talk ideally when we generate optimized
00:09:29.120 native code we won't have to do that
00:09:31.279 over and over again if we restart that's
00:09:33.920 exactly what this is going to do for the
00:09:35.600 jvm we will be able to run some example
00:09:38.800 code save off the optimized native code
00:09:41.760 that's generated from all of your ruby
00:09:43.680 and then hopefully jump right back in to
00:09:45.920 a fast executing runtime
00:09:48.720 there's also a more unusual approach to
00:09:50.800 this uh coordinated restore at
checkpoint or Project CRaC uh what this
00:09:56.640 actually allows us to do is save off an
00:09:59.839 entire process its memory space its
00:10:02.000 execution everything and then quickly
00:10:04.480 restore it and jump right back to that
00:10:06.560 point in execution uh only supports
00:10:08.880 linux right now so it's fairly limited
00:10:10.880 and there are some other limitations but
00:10:12.959 this does some amazing things with
00:10:14.560 startup so let's take a look at a
00:10:16.399 comparison here we've got our c ruby on
00:10:19.279 the left optimized for developer
00:10:21.360 happiness often means optimized for
00:10:23.279 startup so it's very quick to get going
00:10:25.760 uh and then we have plain old j ruby 9.4
00:10:29.040 so significantly slower and this is a
00:10:31.440 noticeable pain for developers on
00:10:34.000 earlier versions of j ruby uh we do have
a special flag you can pass
--dev that will reduce the amount of
00:10:41.920 optimizations we do reduce what the jvm
00:10:44.399 does and that gives us a little bit
00:10:46.360 improvement but now in j ruby 10 taking
advantage of AppCDS the baseline is
00:10:52.640 already almost as fast
as --dev and the --dev flag brings it down
00:10:57.920 even more so now we're starting to get
00:10:59.760 back under a second for startup time on
00:11:01.839 basic things uh we've also played with
00:11:04.720 early versions of the aot cache that's
00:11:07.120 improving things too and there are
00:11:09.120 dozens of engineers that are working on
00:11:10.959 that for jdk right now and then project
CRaC as you might imagine if we can
00:11:16.240 jump right back into a running process
00:11:18.640 it's very fast and if it works for your
00:11:21.279 application this is a excellent way to
00:11:23.680 get up and running immediately with j
00:11:25.680 ruby so there are solutions for startup
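The startup options discussed above can be sketched at the command line. These invocations are illustrative, not the exact setup from the talk (JRuby 10's launcher may wire up AppCDS automatically); they use JRuby's standard `--dev` flag and pass generic JVM CDS options through with `-J`:

```shell
# Trade peak optimization for faster startup (JRuby and the JVM both do less):
jruby --dev -e 'puts "hello"'

# Generic JVM-level AppCDS (JDK 19+): create a class-data archive on first
# run, then reuse it on later runs to skip re-parsing class metadata.
jruby -J-XX:+AutoCreateSharedArchive -J-XX:SharedArchiveFile=jruby.jsa -e 'puts "hello"'
```

The second run with the same `SharedArchiveFile` is where the startup savings appear, since the archived class data is mapped in instead of re-parsed.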
00:11:29.120 so let's talk about straight line
00:11:30.640 performance optimizing the performance
00:11:32.399 of how we run ruby code uh j ruby really
00:11:36.079 depends on the jvm to do most of the
00:11:37.920 heavy lifting uh as i mentioned the jvm
itself has an SSA compiler design
00:11:43.920 similar to what's in zjit but with 30
00:11:46.720 plus years of engineers working on this
00:11:49.519 dozens of engineers across the world so
00:11:51.760 it's really really good at doing these
00:11:53.600 optimizations uh j ruby has its own ir
and its own basic block compilation
we feed that into the jvm trying to do a
00:12:03.360 little bit of optimization on our side
00:12:05.200 and try to explain to the jvm how to
00:12:07.440 make ruby code fast so let's look at an
00:12:10.160 example here uh pure ruby red black tree
00:12:12.959 benchmark basically builds up a tree
00:12:15.519 does some traversal does some removals
00:12:18.000 uh and then we benchmark how long it
takes to do all of that c ruby 3.4
00:12:22.800 able to do about 15 iterations per
00:12:24.959 second of this
00:12:26.279 benchmark now of course we love yjit and
00:12:29.360 we're very happy at speeding up ruby
00:12:31.680 this also does a great job of optimizing
00:12:34.480 on the c ruby side uh some of this comes
00:12:37.120 from more direct access to instance
00:12:39.360 variables uh reduced overhead of doing
00:12:41.920 calls uh but j ruby actually still is
faster than YJIT and that's without me
00:12:47.519 looking at this benchmark for many years
00:12:50.079 so a pretty typical piece of pure ruby
00:12:52.959 code j ruby should generally run it
00:12:55.200 faster than ruby without yjit and even
00:12:58.399 ruby with
00:13:00.120 yjit now we also have an advantage here
00:13:02.959 because even if you write a native
00:13:05.200 extension for c ruby they can't optimize
00:13:08.240 through that they can't optimize it with
00:13:09.920 all the other code and so c extensions
00:13:12.560 are a hard barrier for optimizing code
00:13:14.959 in ruby and as we know there are a lot
00:13:17.040 of c extensions and we've heard many of
00:13:19.120 the jit team from shopify talk about how
00:13:21.839 difficult it is to make those optimize
00:13:23.959 well this is an example of j ruby
running the oj extension which is a
pure c (or on jruby pure java)
json parser uh and in every case j
00:13:37.040 ruby is able to run this extension much
00:13:39.440 faster even though it's just a plain
code port from the original c code
00:13:44.639 we're able to inline it with ruby code
00:13:46.639 we're able to feed it to the jvm the jvm
00:13:49.360 does a better job of optimizing at
runtime than just the plain c code can
do so actually running ported c code
00:13:57.360 faster on the jvm because of all of
00:13:59.519 these
00:14:00.760 optimizations now fibers have become
00:14:03.040 more and more important in the ruby
00:14:04.560 community and we've had some traditional
00:14:06.720 challenges here uh fibers were not well
00:14:09.839 supported on most older jvms the only
00:14:12.880 way that we could implement fibers for j
00:14:15.040 ruby was by wrapping full native threads
00:14:18.720 and native threads have a lot of
00:14:20.399 limitations that aren't compatible with
00:14:22.880 fibers we can only spin up a couple
00:14:25.120 thousand per process generally uh they
00:14:27.600 have much bigger memory overhead they
00:14:29.920 have a lot of cost at the kernel level
00:14:31.920 to even get started and prepare all the
00:14:33.920 metadata for them uh so it was almost
00:14:36.560 impossible for us to keep up with the
00:14:38.720 number of fibers that something like
00:14:40.720 async falcon and some of the new
00:14:42.880 frameworks want to use uh luckily though
00:14:46.560 we have project loom on open jdk which
00:14:49.680 has now brought lightweight fibers to
00:14:52.160 the jvm uh in the form of what they call
00:14:55.040 virtual threads so if we look at a
00:14:57.680 little benchmark here of spinning up a
00:14:59.760 thousand fibers make sure they're all
00:15:01.680 actually started and running and then
00:15:03.920 resume them so they'll all finish uh
00:15:06.639 with our native thread version where it
00:15:09.360 was just a regular operating system
00:15:11.600 thread only getting about 49 iterations
00:15:14.160 per second on this benchmark with the
00:15:16.959 virtual thread implementation in the jvm
00:15:20.000 many times faster already and this could
00:15:22.720 be much much more i have to do some work
00:15:25.279 inside j ruby to reduce the cost of
00:15:27.360 creating a fiber and this is going to
00:15:29.360 get much much faster very likely going
00:15:32.000 to be faster than c ruby uh very soon
00:15:34.240 now
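The fiber benchmark described above (spin up a thousand fibers, confirm they have all started, then resume them to completion) can be sketched in plain Ruby; the structure and count are illustrative, not the talk's exact benchmark code:

```ruby
# Spin up many fibers, make sure each one has actually started,
# then resume them all so they finish.
N = 1_000

fibers = N.times.map do
  Fiber.new do
    Fiber.yield   # pause once so we know the fiber really started
    :done
  end
end

fibers.each(&:resume)            # start every fiber up to its yield
results = fibers.map(&:resume)   # resume them all so they finish
puts "finished #{results.count(:done)} fibers"
# prints: finished 1000 fibers
```

On JRuby with Project Loom's virtual threads, each `Fiber` can map to a lightweight virtual thread instead of a full native thread, which is where the large speedup comes from.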
00:15:36.720 uh we also heard from aaron patterson
00:15:38.959 about optimizing class new uh when we
00:15:41.839 want to allocate a new object uh he
00:15:44.160 showed his optimization where it can
00:15:46.160 inline the allocate plus the initialize
00:15:49.120 and improve performance significantly uh
00:15:51.920 j ruby actually implemented this almost
00:15:53.680 10 years ago uh and we've taken
00:15:55.839 advantage of it because the jvm can then
00:15:58.160 inline the jvm can optimize and we can
00:16:00.880 do a much better job of allocating
00:16:02.560 objects quickly so if we look at a few
00:16:04.800 small benchmarks here doing a bunch of
00:16:07.120 object new how fast can we just allocate
00:16:09.440 a plain object we have c ruby uh 3.4
00:16:14.320 with yjit getting about 10.3 million
00:16:16.880 iterations per second or object
00:16:18.720 allocations per second uh the patch that
00:16:22.240 aaron came up with uh giving about a 30%
00:16:25.440 increase which is uh about the same as
00:16:27.519 what he showed in his talk so 13 million
00:16:30.639 object allocations per second uh but
00:16:33.199 then there's j ruby
00:16:35.320 10 so this is very much the jvm doing an
00:16:39.759 incredible job of inlining object
allocations for us making them low-cost
00:16:44.800 making them fast and all we have to do
00:16:47.360 is teach it the allocate initialize the
00:16:50.480 little dance that we have to do for for
00:16:52.639 uh object instantiation uh and actually
00:16:55.279 this may be a little easier to see if we
00:16:56.880 go to a logarithmic scale here so we'll
00:16:58.720 stay with that for a moment now i was
00:17:00.959 actually surprised how fast this was so
00:17:03.440 i took one of the jvm profilers and
00:17:05.679 hooked it up to my j ruby instance just
00:17:08.000 to make sure it is actually allocating
00:17:10.799 those fu objects they're actually being
00:17:13.280 created they're actually on the stack or
00:17:15.760 they're actually on the heap it's just
00:17:17.439 that the jvm does such a good job of
00:17:19.280 optimizing that process we get the
benefit so let's look at the actual foo
00:17:24.559 class so here let's have a ruby based
00:17:26.559 initialize now you would expect that
00:17:28.400 this would slow things down now we have
00:17:30.720 to have the ruby call to initialize it's
00:17:33.360 not just going to be the default native
00:17:35.039 initialize anymore uh but again we have
00:17:38.480 our object new allocations versus foo on
ruby 3.4 the optimization that
aaron came up with opt_new and j
ruby has basically no impact by writing
00:17:52.080 that initialize in ruby we still end up
00:17:55.120 being significantly faster to allocate
00:17:57.360 objects and because of the great garbage
00:17:59.679 collection on the jvm those objects can
00:18:02.000 be cleaned up much faster than any of
00:18:04.080 the ones that are available for c ruby
00:18:05.679 right
00:18:07.559 now um
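The allocation micro-benchmark discussed above can be sketched like this. The `Foo` class and iteration count are illustrative, and absolute rates vary wildly by runtime and JIT, so treat the harness as a shape, not the talk's exact code:

```ruby
require "benchmark"

# A Foo with a Ruby-level initialize, as in the example above.
class Foo
  def initialize(a = 1, b = 2)
    @a = a
    @b = b
  end
end

n = 1_000_000
obj = Benchmark.realtime { n.times { Object.new } }
foo = Benchmark.realtime { n.times { Foo.new } }
puts format("Object.new: %.1fM allocations/sec", n / obj / 1e6)
puts format("Foo.new:    %.1fM allocations/sec", n / foo / 1e6)
```

Running the same script under CRuby (with and without YJIT) and under JRuby is how the numbers in the talk's comparison were presumably produced.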
00:18:10.919 so there's there's also another
00:18:13.120 advantage here the fact that we're
00:18:14.320 inlining the object allocation inlining
00:18:16.559 all of that new logic means we can get
00:18:19.280 more advantage out of newer jvm jit
compilers like the Graal jit uh Graal is
00:18:25.679 very good at eliminating allocations
00:18:27.520 that aren't actually needed so if we run
our foo benchmark on there we run with
00:18:32.640 regular java 21 we get 216 million
allocations per second Graal sees that
00:18:39.919 we're actually not using a whole lot of
00:18:41.440 these objects and eliminates most of
00:18:43.840 them so when you have code that is using
00:18:47.280 a little object temporarily and then
00:18:49.520 throwing it away we we may not even
00:18:52.400 allocate it if you have the right jit
00:18:54.000 compiler running underneath it this is
00:18:56.000 the advantage of the jvm being able to
00:18:58.160 inline and optimize all the way through
00:19:00.080 that
00:19:01.400 code uh so data was also introduced in
00:19:04.320 3.2 it's a new feature for us hopefully
00:19:07.039 some of you have been using it because
00:19:08.240 it's a very nice design uh it's a new
00:19:10.720 immutable strct type uh it's got compact
00:19:14.240 storage so it's not using standard
00:19:16.320 instance variables it's a little bit
00:19:17.760 lighter weight uh and it has a special
00:19:20.080 new method that kind of does that same
00:19:22.480 process allocate and initialize all at
00:19:24.640 once and gives you an immutable object
00:19:26.960 back that has the data in it so here we
have our bar class which is defined with
Data.define
00:19:34.960 and then we'll just allocate it quickly
00:19:37.039 with three values here we have c ruby
00:19:41.200 3.4 with yjit getting about 2.9 million
00:19:45.520 uh aaron's patch doesn't really apply
00:19:47.520 here because it's not the default class
00:19:50.640 new anymore it's a special new for data
00:19:53.440 uh so this is about as good as you can
00:19:55.120 get on current c ruby we run it on j
00:19:57.360 ruby we're getting almost almost three
00:19:59.440 times the performance again and it's
00:20:01.760 just because the jvm is able to optimize
00:20:03.520 those objects much
00:20:05.640 better so another area here optimizing
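The `Data`-based `Bar` class from the benchmark just described can be sketched as follows (requires Ruby 3.2+; the field names are illustrative):

```ruby
# Data.define builds an immutable value class with compact storage.
Bar = Data.define(:x, :y, :z)

bar = Bar.new(1, 2, 3)   # allocate and initialize in one step
puts bar.x               # => 1
puts bar.frozen?         # => true (Data instances are immutable)
```

Because `Bar.new` both allocates and initializes in one specialized step, a runtime that can inline that path, as JRuby's JVM backend does, can optimize the whole construction.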
00:20:08.919 concurrency uh we continue to hear lots
00:20:11.600 about ractors and how they're going to
00:20:13.360 bring real concurrency to ruby uh they i
00:20:17.679 believe it's possible that ractors can
00:20:19.600 do that and can make it possible to use
00:20:21.919 all the cores in your system but writing
00:20:24.480 ractor friendly code is pretty
00:20:26.240 challenging and even the ruby core team
00:20:28.480 is still struggling with the core and
00:20:30.799 the standard library to make sure all of
00:20:33.039 that can work in ractors uh there's also
00:20:35.919 a high overhead crossing over that
00:20:37.919 ractor boundary you have to make sure
00:20:39.919 that these objects are sharable you have
00:20:42.240 to make sure they're frozen potentially
00:20:44.159 that means you need to be doing more
00:20:45.520 allocation to pass clean objects across
00:20:47.919 that boundary and so you lose a lot of
00:20:50.000 the benefit uh in j ruby you can just
00:20:52.880 use threads uh if you are using thread
00:20:56.080 safe code you pass those objects off to
00:20:58.320 another thread you can actually utilize
00:21:00.559 all the cores in the system without
00:21:02.320 making any major changes to your
00:21:04.679 application so a little benchmark uh
00:21:07.679 contributed by ma here we played around
00:21:10.480 with doing threads versus ractors to see
00:21:12.799 what kind of performance we'd get on c
00:21:14.559 ruby versus j ruby so for a certain
00:21:17.360 number of threads we're going to spin
00:21:19.200 them up they're going to sit and wait on
00:21:20.799 a queue to get data to get uh work to do
00:21:24.320 when they get that work they're going to
00:21:25.760 parse that json and then we'll deal with
00:21:27.919 it later so here is the benchmark if
00:21:30.960 we're running with threads we're just
00:21:32.799 going to feed chunks of that data into
00:21:34.960 the threads let them chew on it and
00:21:36.960 parse it uh otherwise we'll just do it
00:21:39.120 in a linear way
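The thread-pool benchmark just described, worker threads blocking on a queue and parsing JSON chunks as work arrives, can be sketched like this; the data and pool size are made up for illustration:

```ruby
require "json"

# Fabricated workload: 100 identical JSON documents to parse.
payload = JSON.generate({ "items" => (1..100).map { |i| { "id" => i } } })
chunks  = Array.new(100, payload)

queue   = Queue.new
results = Queue.new

# Four workers sit and wait on the queue for work to do.
workers = 4.times.map do
  Thread.new do
    while (doc = queue.pop)        # nil sentinel ends the loop
      results << JSON.parse(doc)   # each thread parses independently
    end
  end
end

chunks.each { |c| queue << c }     # feed chunks of data into the threads
4.times { queue << nil }           # one sentinel per worker
workers.each(&:join)

puts "parsed #{results.size} documents"
# prints: parsed 100 documents
```

On JRuby these four workers parse in true parallel on four cores; on CRuby the global lock serializes most of the Ruby-level work, which is the gap the talk's chart illustrates.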
00:21:41.440 so we look at this we're looking at uh
00:21:44.159 improvement by using threads versus
00:21:46.159 ractors here so c ruby running with
00:21:48.720 threads gets a slight improvement uh not
00:21:52.240 a whole lot this probably is there's a
00:21:55.039 little bit of backend stuff there's a
00:21:56.480 few places where the lock can be
00:21:59.200 released for parts of json but most of
00:22:01.840 the time you're not getting the full
00:22:03.600 benefit of those threads and the cores
00:22:06.320 on your system uh now this is actually
00:22:09.600 an old number uh this is current ruby
00:22:13.240 3.4 when you try to run this same thing
00:22:15.600 with ractors it actually slows down the
00:22:18.480 overhead of crossing that ractor
00:22:20.240 boundary takes away all of the
00:22:21.919 performance benefits you have and
00:22:24.000 actually introduces a lot more overhead
00:22:26.240 uh but this is being patched it has been
00:22:28.559 improved in a recent patch that just
00:22:30.480 came through today so now we're up to
00:22:32.480 like a 1.3 times improvement uh and i
00:22:35.679 should note this is all on my system
00:22:37.600 which has four cores so ideally we'd be
00:22:40.240 expecting that we see a 4x improvement
00:22:42.480 by spreading it out across threads and
00:22:44.960 of course when we run it on j ruby real
00:22:47.360 threads that's actually what we get the
00:22:50.000 same code running on c ruby versus j
00:22:52.159 ruby you will see all the cores light up
00:22:54.400 you'll max out your system and you can
00:22:56.320 do all of your parallel processing in a
00:22:58.480 single process
00:23:01.280 this also translates to applications
00:23:03.520 like rails uh here is a graph showing
00:23:07.280 how many requests per second you can get
00:23:10.159 per megabyte of memory you throw at the
00:23:12.320 application in c ruby to get concurrency
00:23:15.360 in rails you need more processes each
00:23:18.640 process is going to have its own garbage
00:23:20.559 collector it's going to have a lot of
00:23:22.159 its own data you're going to be doing a
00:23:24.320 lot of duplicate operations and
00:23:26.240 duplicate round trips to the database
00:23:28.720 you don't get as much of a gain by
00:23:30.720 throwing all those processes with j ruby
00:23:33.600 you can have a single process run your
00:23:35.679 entire site and there's a crossover
00:23:38.880 point where we use so much less memory
00:23:41.520 than c ruby that you're getting much
00:23:43.520 better requests per second for the
00:23:45.360 resources that you're paying
00:23:48.200 All right, a final area I wanted to mention: optimizing the opportunities for Ruby to be used in the world. JRuby runs anywhere that Java runs, which of course is just about everywhere. Rubyists can take the knowledge they have, take the Rails applications and libraries they've created, deploy them in a Java shop, package them up as a single binary, and ship it as a commercial piece of software. All of these things are still open problems in the Ruby world; you can go to the JVM community, go to these big enterprises, and start using Ruby to build applications today. I've talked with two or three different people this week at the conference who have actually packaged up Rails apps, deployed them in a Java organization, and nobody ever knew they were writing Ruby. You can bring Ruby anywhere that Java can go.

00:24:45.039 A little example here uses the Java Swing library, the GUI library that's built into the JDK. We create a frame, add a button to it, and set up an action listener for the click event that just changes the text of the button. Then we add the button to the frame and display the whole thing. Our window pops up, and when we click on the button, it runs our Ruby event handler and does what we expect. So with a very small amount of Ruby code and any available JDK, you can build a graphical application that runs cross-platform very easily.
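The steps just described can be sketched in a few lines of JRuby. This is a minimal reconstruction rather than the speaker's exact slide code: the window title and labels are made up, and it must be run under JRuby with a display available.

```ruby
# JRuby Swing sketch: a window with one button whose label
# changes when clicked. Requires JRuby (not CRuby).
require 'java'

java_import javax.swing.JFrame
java_import javax.swing.JButton

frame  = JFrame.new('JRuby Swing Demo')   # the window
button = JButton.new('Click me')          # the button we add to it

# JRuby converts this Ruby block into a java.awt.event.ActionListener,
# so the click event runs our Ruby code
button.add_action_listener do |_event|
  button.text = 'Clicked!'
end

frame.add(button)
frame.set_size(300, 100)
frame.default_close_operation = JFrame::EXIT_ON_CLOSE
frame.visible = true                      # display the whole thing
```

The block-to-listener conversion is the key convenience here: any single-method Java interface argument can be satisfied with a plain Ruby block.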
00:25:27.240 And of course this extends to other things. Shoes 4 is based on JRuby and gets the advantage of the built-in graphical libraries. Glimmer is a much more polished API for building user interfaces that runs on JRuby using another toolkit called SWT. And it wouldn't be a JRuby talk without showing some Minecraft plugin abilities: here is a plugin written in Ruby for Minecraft that just changes the number of chickens that come out of an egg, but it's all written in Ruby, all plugged into the JVM, because we all live under that same runtime. Android is largely Java-based too, so we have a framework for building Android applications in Ruby called Ruboto. Here we have a couple of examples: using an interactive IRB session within an Android application, and loading up scripts and running them. There actually are commercial users out there who use JRuby on Android for point-of-sale devices, ad terminals, and all sorts of other applications.
00:26:34.240 So, looking at JRuby's future: in JRuby 10.x, the maintenance versions we've got coming up over the next year, there are lots more optimizations coming now that the release is out the door. I've got a long list of things I know we can do better and optimize more completely. For example, keyword arguments are still a little slow: we still allocate a hash. The JVM is good at eliminating some of those allocations, but ideally, if we don't need to create a whole hash to pass keyword arguments, we shouldn't, so that's one of the first things I'm going to be working on. If you find a specific case that's slower in JRuby than in CRuby, let me know, because there's probably something wrong in JRuby that I can fix. We're also going to be exploring all of these different JDK features, like the AOT cache to improve startup, and Project Panama to make optimized native calls from Ruby code so that our FFI is faster. And really, it's just going to be a matter of upgrading your JVM: you upgrade the JVM, you get better garbage collectors, a better JIT, and cool new features, and your Ruby code runs faster without you having to do anything else.
00:27:43.240 On the other side of this, I need the Ruby community to help me keep helping you. I really believe JRuby can solve many of the problems the Ruby community has, performance-wise and in getting Ruby into difficult enterprise organizations. If JRuby goes away, I think Ruby would be immeasurably damaged. This is an opportunity for us to expand Ruby into a much larger world, to do high-performance, high-concurrency processing of data all over the world using the Ruby that we love. So please, I would love for you to try JRuby. Let me know how I can help you scale it, profile it, and optimize your code. And if you do use JRuby, let's talk about some sort of partnership where I can help you run your application better and you can help the JRuby community stay healthy. So thank you. I will have JRuby office hours after this; you can find me, and if you come by with an interesting problem to solve, maybe I'll give you one of the American IPAs I brought with me. I also have stickers and business cards for anyone who wants them. Thank you.