00:00:16.440
everyone so today i'm going to talk about the build system of
00:00:22.760
cruby so as you know cruby has its own long history and its build system has
00:00:30.720
evolved over many years so let me share its current state and challenges
00:00:37.520
and potential future directions so let me briefly
00:00:44.320
introduce myself i'm yuta fortunately i successfully
00:00:51.360
completed my master's degree so now i'm no longer a student so currently i'm
00:00:58.640
working at goodnotes as a software engineer and i'm working on making the swift
00:01:03.920
language available on the web and so yep
00:01:09.320
so i'm also a committer to cruby and swift and working on webassembly
00:01:16.080
support for both languages so as a hobby i'm working on ruby.wasm
00:01:24.439
stuff so today i will start with an introduction to how cruby is built then
00:01:30.880
walk through the current build process in detail after that i will
00:01:35.920
discuss the challenges we face today and finally i will share my vision
00:01:42.720
for the future
00:01:47.840
so let's start why am i talking about the build
00:01:54.040
system so my interest in the cruby build system started when i was working
00:02:00.479
on ruby so while porting cruby to webassembly
00:02:06.719
i encountered many challenges in the build system and the experience
00:02:12.239
made me think that the current build system has a lot of
00:02:18.920
room for improvement to make it more maintainable and efficient
00:02:27.360
so i believe that by addressing these challenges we can make the
00:02:32.720
build process better for everyone in the ruby community so i am motivated by
00:02:39.800
this so when you want to install ruby how do
00:02:47.120
you usually do it you probably use one of these version managers so they
00:02:54.000
make it easy to install a ruby version on your machine but have you ever wondered what
00:02:59.280
these tools actually do under the hood so let's look at the fundamental
00:03:04.879
build process they all use so this is the basic build process
00:03:12.080
that all version managers wrap around so it's quite straightforward
00:03:18.720
download a tarball from ruby-lang.org then run configure and make
00:03:25.519
that's it so the configure script checks your system and generates the necessary
00:03:32.879
build files then make builds the interpreter and extensions and finally
00:03:39.840
make install puts everything in the right place so on windows the process is
00:03:45.760
slightly different using configure.bat and nmake
00:03:52.040
instead so while most users will build from tarballs there's another way
00:04:00.080
to build ruby directly from the git source so this is what you will do when
00:04:08.799
working on cruby development so this process looks very similar but
00:04:16.079
there are several important differences
00:04:22.880
from the tarball build so first you need to have ruby
00:04:28.600
installed on your machine we
00:04:35.440
call this baseruby so second you need to run autogen to generate the
00:04:42.639
configure script because it's not tracked in the git
00:04:48.280
repository so what's the difference between building from the tarball and the git source
00:04:54.720
well while the build commands look pretty similar there is
00:05:01.520
actually a crucial difference in how we handle code generation so the tarball
00:05:08.800
must be completely self-contained and it cannot depend on having ruby
00:05:16.039
installed so this is because of what we call the bootstrap
00:05:22.039
problem so how do you build ruby when you don't have ruby so to solve this
00:05:29.600
we pre-generate all the files that would normally require ruby to generate and
00:05:38.400
include them in the tarball so these files include things like generated c files
00:05:45.120
for the parser precompiled .rbinc files and so on
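
The two build paths described above can be summarized in a small sketch. Python is used here only for illustration; the function names, the default prefix, and the exact script name `./autogen.sh` are assumptions of this sketch, since the talk only says "autogen", "configure", "make" and "make install".

```python
def build_commands(from_tarball, prefix="/usr/local"):
    """Return the commands for a Unix-like build, in the order they are run."""
    commands = []
    if not from_tarball:
        # The configure script is not tracked in git, so it is generated first.
        commands.append(["./autogen.sh"])
    commands.append(["./configure", "--prefix=" + prefix])
    commands.append(["make"])
    commands.append(["make", "install"])
    return commands


def needs_baseruby(from_tarball):
    """A tarball ships pre-generated files, so only a git build needs an installed Ruby."""
    return not from_tarball


if __name__ == "__main__":
    for source in (True, False):
        label = "tarball" if source else "git"
        print(label, build_commands(source), "baseruby:", needs_baseruby(source))
```

The only differences between the two lists are the extra generation step and the baseruby requirement, which is the point the talk makes about the bootstrap problem.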
00:05:53.120
so we have seen how ruby is built today and it might seem
00:05:59.000
straightforward just configure and make and that's it but how complex is it
00:06:06.680
really so let's measure various aspects of the ruby build
00:06:14.479
system to understand the current state so to understand the build process we
00:06:21.280
need to analyze it in detail so i made a tool called trace
00:06:26.400
make that helps us understand what's happening during the build so it works
00:06:32.400
by replacing the default gnu make shell with our custom shell and it records
00:06:39.199
detailed timing information and this gives us a trace of every command
00:06:45.440
executed during the build and we can visualize it using the chrome trace
00:06:52.759
viewer so this is a trace of a single build process
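
As a rough idea of what such a trace can look like, here is a minimal sketch that converts recorded command timings into the JSON layout the Chrome trace viewer reads (an object with a `traceEvents` list of "complete" events, with times in microseconds). The record layout and the function name are assumptions of this sketch; the transcript does not show how the actual tool stores its data.

```python
import json


def to_chrome_trace(records):
    """Convert (command, start_seconds, end_seconds, worker_id) tuples to trace JSON."""
    events = []
    for command, start, end, worker in records:
        events.append({
            "name": command,
            "ph": "X",                            # complete event: start plus duration
            "ts": round(start * 1_000_000),       # start time in microseconds
            "dur": round((end - start) * 1_000_000),
            "pid": 1,                             # one build, so one process row
            "tid": worker,                        # one lane per parallel job
        })
    return json.dumps({"traceEvents": events})


if __name__ == "__main__":
    sample = [
        ("gcc -c array.c", 0.0, 1.5, 1),
        ("gcc -c string.c", 0.0, 2.0, 2),
    ]
    print(to_chrome_trace(sample))
```

Each worker id becomes its own lane in the viewer, which is what makes parallel compilation and idle gaps visible.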
00:06:58.560
so the build process can be divided into three main phases configure make and
00:07:06.120
install the configure phase is a single long running shell script that
00:07:12.800
checks your system features and looking at the make phase we can see a lot of
00:07:19.440
parallel compilation tasks i ran this build on a 16 core machine
00:07:27.039
and gcc is running many times simultaneously and finally the install
00:07:32.880
phase copies the built files to the final location so okay so now we have a way
00:07:42.160
to analyze the build trace so
00:07:48.039
let's look at how build times have changed across ruby
00:07:53.800
versions so for this analysis i used the all-ruby project created
00:08:01.360
by akr this project is really helpful because it provides a consistent
00:08:07.120
way to build different ruby versions and it includes all the necessary patches and
00:08:12.240
build scripts to make older versions buildable on modern machines so i
00:08:17.520
modified it a bit to collect more information but it's working pretty
00:08:23.680
well so looking at the configure time across
00:08:29.599
ruby versions we can see a clear upward trend and this is mainly because we
00:08:36.320
have been adding more and more feature detection checks over time
00:08:41.839
for example we need to check for new system features libraries and
00:08:48.240
compiler capabilities whenever we add support for a new platform or start using a
00:08:54.399
new feature and each of these checks is done sequentially which explains why the
00:09:01.279
configure phase is taking longer and longer
00:09:06.440
linearly so here is another interesting data
00:09:12.000
point this graph shows the number of lines in configure.ac over
00:09:18.839
time so it's approaching 6,000 lines now
00:09:25.920
and what's interesting is that this growth trend looks very similar to the
00:09:31.200
configure time graph we just saw so this correlation makes sense because
00:09:38.399
more feature detection code in configure.ac directly leads to more
00:09:44.959
checks being run at configure time so yeah this
00:09:51.920
is one of the reasons why we need to think about a better way to handle feature
00:09:57.480
detection by the way do you know who is the most active committer of
00:10:04.440
configure.ac any guesses
00:10:11.839
yes you know so clearly nobu has made significantly more commits than anyone
00:10:19.880
else so about five times more than the second most active
00:10:25.720
committer so this shows nobu's incredible dedication to
00:10:32.959
maintaining the build system but yeah it also highlights a potential
00:10:39.360
issue we are heavily dependent on a single person for this critical part of
00:10:44.959
our infrastructure so this is another reason why we should
00:10:50.560
think about making it more maintainable and also let's see the
00:10:58.880
make phase the make phase also shows a natural
00:11:04.399
growth trend as we add more components and features compilation time
00:11:10.800
increases accordingly but back in the early days ruby took less than 10
00:11:18.959
seconds to build can you believe that so now with ruby 3.3 and later
00:11:28.959
we are looking at over 5 minutes but note that this build
00:11:34.720
was done with -j1 because older versions of the ruby build scripts cannot handle parallel
00:11:41.040
compilation well so to be fair i used -j1 so it's taking longer
00:11:50.200
than what you see on your machine
00:11:55.640
so okay so now let's dive into how
00:12:00.880
the build process actually works so this is quite complex so i will
00:12:07.519
break it down step by step so let's start with make so you
00:12:15.920
might think we just use gnu make but actually we support multiple make
00:12:24.040
implementations gnu make is the main one of course but here is
00:12:32.079
the tricky part we need to support a really wide range of versions
00:12:37.680
the latest gnu make is 4.4 but macos still ships with
00:12:51.680
that too and also that's not all we also support bsd make and nmake for
00:12:59.959
windows supporting all these different versions and implementations
00:13:06.079
really complicates our build system but yeah as a fun fact
00:13:13.920
my very first commit to cruby was about fixing a build issue with an old
00:13:20.240
version of gnu make so it's
00:13:26.200
fun so let's trace the build process step by step so the configure step is where things
00:13:34.000
start getting interesting so on unix-like systems we use autotools which
00:13:40.480
generates the configure script from configure.ac on windows we have a
00:13:48.000
completely separate configure.bat the configure step generates several important files
00:13:54.000
including Makefile and config.h note that we generate
00:14:00.399
GNUmakefile Makefile and uncommon.mk at this step
00:14:08.880
and we have an interesting in other words tricky approach to handle different make
00:14:15.720
implementations so we have a common.mk file in our source tree and that
00:14:21.040
contains rules that work with nmake syntax and for make
00:14:29.279
implementations other than nmake we preprocess this file to generate uncommon.mk
00:14:36.959
while stripping out nmake specific
00:14:44.360
features so let's look at how makefiles are organized for gnu
00:14:49.880
make so here we have both GNUmakefile and Makefile but gnu make
00:14:58.000
prefers GNUmakefile as the root makefile so the root GNUmakefile
00:15:05.440
includes several other makefiles the base Makefile which is shared
00:15:11.279
with non-gnu makes and uncommon.mk which we saw
00:15:17.560
earlier and gmake.mk
00:15:22.880
where we usually put gnu make specific definitions and yjit.mk
00:15:29.519
is also included through this include
00:15:34.839
chain and for windows that is nmake
00:15:40.320
we have a different include structure the root
00:15:46.399
makefile here is Makefile and it includes win32/Makefile.sub which then
00:15:53.199
includes setup.mak and also it includes common.mk as is
00:16:01.759
without any preprocessing and we still have the other case so
00:16:11.440
for other make implementations like bsd make we take another
00:16:19.000
approach so the configure script just concatenates common.mk and widget
00:16:26.079
make file and the base Makefile without include
00:16:31.800
so yeah so we have three different include patterns for different make
00:16:36.839
implementations so yeah so then let's go through the
00:16:44.800
actual make step so this is what actually happens when you run
00:16:50.120
make so first we compile the interpreter core all those c files you see on the
00:16:58.279
left then we build miniruby which is a
00:17:04.160
minimal version of ruby that can run very basic ruby code after that we
00:17:10.319
use miniruby to configure and compile extension libraries and finally link
00:17:16.480
everything together into the final ruby binary so compiling c files is
00:17:23.400
straightforward so nothing special here but you might notice that i mentioned
00:17:28.880
miniruby here that's actually another tricky part of the
00:17:37.400
process so yes we actually use several different ruby variations during
00:17:43.200
the build process and each one serves a specific purpose so
00:17:51.440
we start with baseruby which is the ruby already installed on
00:17:58.080
your system and then we build miniruby which is a minimal version of ruby that cannot
00:18:06.799
use extension libraries and yeah so there are still
00:18:12.000
other rubies but the relationship between these rubies gets a bit
00:18:17.840
complex when cross compiling so yeah let's skip them
00:18:23.880
today so yeah so we use ruby for various code generation tasks during the
00:18:29.840
build process and that's what baseruby is for
00:18:35.840
it handles things like converting ruby files to c and processing erb templates and recently
00:18:43.760
pass is also added to the list by
00:18:49.480
dmr and miniruby on the other hand is mainly used for configuring the extension
00:18:55.919
libraries during the build so its main role is to run extconf.rb to generate
00:19:02.320
makefiles for each extension library so while miniruby has minimal functionality
00:19:09.200
it has enough features to run mkmf so
00:19:14.280
yeah but when cross compiling we cannot run the miniruby compiled
00:19:21.440
for the target environment so baseruby is used here instead but yeah anyway it's
00:19:29.360
called mini ruby appase so let's go back to the build
00:19:36.720
process so now let's look at how we build extension
00:19:44.679
libraries so this diagram shows how extension libraries are built using
00:19:52.160
recursive make so when you run make it first compiles the interpreter
00:19:59.520
core then it enters a recursive process to build
00:20:05.240
extensions so first it configures all extensions by running make with
00:20:11.559
configure-ext.mk next it compiles the extensions using exts.mk
00:20:18.679
which in turn runs
00:20:23.840
make for each extension then finally it links all the compiled objects back into the
00:20:32.120
interpreter so this is literally a recursive make so the process starts with
00:20:39.679
the root makefile and eventually it enters the root makefile
00:20:48.520
again so now we understand how the build system works so let's look at the
00:20:56.159
challenges we face today so our first challenge is the large
00:21:02.480
number of build variations so we need to support
00:21:10.080
so many kinds of build variations and we have to handle different platforms
00:21:15.360
and we support both in-tree and out
00:21:21.200
of source builds which adds complexity to path handling and we also need to
00:21:27.520
support both dynamic and static linking of libruby and the availability of
00:21:35.039
baseruby can affect the build process so having many build variations
00:21:41.520
in itself is fine but the problem is that
00:21:48.799
we have to handle them in both configure and make and several other
00:21:54.720
places so the problems are spread widely across the
00:22:05.480
codebase so the second challenge is what we call make time dynamism
00:22:12.440
so our build system uses various make features to handle different build
00:22:18.520
configurations and while these features are powerful they make the build
00:22:24.320
process hard to debug and understand so let me show you some
00:22:29.880
examples so here's one example of make time dynamism so we use this yes no
00:22:36.320
pattern to conditionally enable and disable targets so while it's pretty
00:22:42.640
clever it makes it hard to track what targets are actually
00:22:48.880
available and when they are run so this pattern is used throughout
00:22:56.559
our build process adding a layer of
00:23:01.880
complexity so here's another example this code handles
00:23:08.000
the generation of a c file differently depending on whether we
00:23:13.679
are cross compiling the condition itself is quite complex but the main
00:23:20.640
problem is there is no clear way to check which path the build actually
00:23:26.799
took without running the actual
00:23:32.120
build so the root cause of all of these problems is
00:23:40.760
actually the limited expressivity of configure.ac i think so because we
00:23:50.240
cannot express all our build logic in the configure phase we ended up pushing
00:23:56.559
a lot of complexity into the make phase and this makes the build
00:24:03.200
process harder to understand and
00:24:08.679
debug and also do you remember how we talked about baseruby earlier here
00:24:16.720
is where it becomes a problem so even though tarball builds should not
00:24:22.640
depend on baseruby they sometimes do when baseruby is
00:24:29.080
available so this has caused real issues in the past and the problem is that we
00:24:35.600
cannot easily verify that a tarball build truly doesn't depend on baseruby so we
00:24:42.640
need a way to validate our build plans to catch these issues before
00:24:49.120
they affect users so finally let's talk about the
00:24:55.120
build time so as we saw in the data analysis earlier both the configure and make phases
00:25:03.279
are getting slower and the configure script is essentially one long
00:25:09.520
sequential shell script so these checks could
00:25:15.880
theoretically be parallelized but the current autotools based system makes it
00:25:21.840
difficult and also the make phase has
00:25:28.799
its own performance issue so this diagram shows its impact on
00:25:35.200
build time so the white spaces in this graph represent idle
00:25:43.200
time where some cores weren't being used so see those blue hatched areas
00:25:52.760
that's where we could improve by scheduling better
00:25:58.000
so the problem is that we cannot start building any extension libraries
00:26:03.919
until all of the extension libraries are configured so this stage-wise dependency
00:26:12.559
creates a significant bottleneck in our build so now we have seen the
00:26:21.120
challenges so let's talk about how we can improve it
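
To make the cost of the stage-wise dependency concrete, here is a toy simulation, written in Python purely for illustration. It compares a plan where every extension compile waits for all extension configures with a plan where each compile waits only for its own configure. The task names, the durations, and the greedy scheduler are all assumptions of this sketch, not measurements from the talk, and the graph is assumed to be acyclic.

```python
import heapq


def makespan(tasks, workers):
    """tasks: {name: (duration, [deps])}. Greedy list scheduling; returns total time."""
    remaining = {name: set(deps) for name, (_, deps) in tasks.items()}
    ready = sorted(name for name, deps in remaining.items() if not deps)
    running = []  # heap of (finish_time, name)
    now = 0
    done = 0
    while done < len(tasks):
        # Start as many ready tasks as there are free workers.
        while ready and len(running) < workers:
            name = ready.pop(0)
            heapq.heappush(running, (now + tasks[name][0], name))
        # Advance time to the next task that finishes.
        now, finished = heapq.heappop(running)
        done += 1
        for name, deps in remaining.items():
            if finished in deps:
                deps.remove(finished)
                if not deps:
                    ready.append(name)
    return now


def stagewise(exts):
    """Every compile depends on ALL configures (the barrier described in the talk)."""
    all_configures = ["configure:" + n for n in exts]
    tasks = {}
    for n, (cfg, cc) in exts.items():
        tasks["configure:" + n] = (cfg, [])
        tasks["compile:" + n] = (cc, all_configures)
    return tasks


def single_graph(exts):
    """Each compile depends only on its own configure."""
    tasks = {}
    for n, (cfg, cc) in exts.items():
        tasks["configure:" + n] = (cfg, [])
        tasks["compile:" + n] = (cc, ["configure:" + n])
    return tasks


if __name__ == "__main__":
    exts = {"a": (1, 4), "b": (5, 1)}  # hypothetical (configure, compile) seconds
    print("stage-wise:", makespan(stagewise(exts), workers=2))
    print("single graph:", makespan(single_graph(exts), workers=2))
```

In the stage-wise plan one worker sits idle while the slow configure finishes, which corresponds to the white gaps in the trace.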
00:26:26.400
so let's start with the complexity issue so there are two key ideas here so first
00:26:33.600
we can move the build logic from make and m4 to ruby so this allows
00:26:40.960
us to express our build logic more clearly in a
00:26:47.440
real programming language instead of spreading it across configure and
00:26:52.640
makefiles and so on so yeah and
00:26:58.120
for unintended dependency introduction the build
00:27:06.240
system should be able to validate the build plan before executing it and for performance improvement we
00:27:14.320
need to address both the configure and make phases by integrating feature
00:27:19.840
detection into the build system itself we can parallelize these checks
00:27:27.840
and also by moving away from recursive make to a single build graph we can optimize task scheduling across
00:27:34.880
the entire build so it will result in eliminating the
00:27:42.080
idle time we saw earlier so with those goals in mind i
00:27:49.200
have been working on a new build system it's still very much a work in
00:27:55.760
progress but let me show you what it looks like today and how it
00:28:02.480
addresses our challenges so the vision for this new build system has three key points first
00:28:10.080
we use ruby to construct the entire build graph giving us the
00:28:15.159
expressivity we need second we make the build system available as a
00:28:21.000
library and third despite using ruby for development we ensure that the end
00:28:28.600
user will not need to install ruby to build ruby
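
As a rough illustration of the first point, building the whole build as one task graph, here is a minimal sketch. It is written in Python only for illustration: the system described in the talk is written in Ruby and its API does not appear in this transcript, so every task name and the `parallel_batches` helper below are hypothetical. It uses the standard library `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def parallel_batches(deps):
    """deps: {task: set of tasks it depends on}. Returns batches that can run together."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical build description: feature checks are ordinary graph nodes.
graph = {
    "check:unistd.h": set(),
    "check:fork": set(),
    "config.h": {"check:unistd.h", "check:fork"},
    "main.o": {"config.h"},
    "miniruby": {"main.o"},
}

if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(graph), start=1):
        print(number, batch)
```

Because the checks have no dependencies on each other, they land in the same batch, which is the property that lets a graph-based build run them in parallel instead of one after another in a shell script.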
00:28:36.679
so here is a simple example of what the new build system looks like the build
00:28:43.919
description is written in ruby which makes it easy to understand and modify and
00:28:49.440
the build description is essentially responsible for constructing the build task
00:28:56.520
graph and here is a more complex example that shows a simplified version of the
00:29:02.000
miniruby build so notice how the feature checks
00:29:08.159
are just regular tasks in the build graph instead of a sequential shell
00:29:13.960
script and this means they can be parallelized
00:29:19.240
automatically and we integrate erb into the build system
00:29:25.399
itself so we can generate files like config.h based on the feature
00:29:31.760
checks within the build system and let me highlight two
00:29:36.960
key features that address our main challenges first we convert our
00:29:42.640
ruby based build description into a traditional autotools-like
00:29:48.399
configure script and this means that end users will not need ruby installed at
00:29:56.000
build time so they will get the same familiar configure and make
00:30:03.480
experience so you might think this is something cmake and meson can
00:30:08.960
do but it's not because their makefiles are not portable at
00:30:15.919
all so we cannot distribute them to end users and second because the build
00:30:21.279
system is a library we can analyze the build graph programmatically so this
00:30:27.760
allows us to check if we have
00:30:34.600
unintended baseruby dependencies before they cause problems making it a
00:30:42.000
library also allows us to easily integrate the cruby build into other
00:30:48.320
projects like rubywin or yet another single binary packager
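
The kind of programmatic check mentioned above, looking for unintended baseruby dependencies, could look roughly like the following. This is a Python sketch for illustration only; the graph layout, the task names, and the `tasks_using` helper are hypothetical and are not taken from the new build system.

```python
def tasks_using(graph, target, tool):
    """Return every task reachable from `target` that needs `tool`, sorted by name."""
    seen = set()
    stack = [target]
    hits = []
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        node = graph[name]
        if tool in node["tools"]:
            hits.append(name)
        stack.extend(node["deps"])
    return sorted(hits)


# Hypothetical plan for a git build: parse.c is generated with baseruby.
git_plan = {
    "ruby": {"deps": ["parse.o"], "tools": ["cc"]},
    "parse.o": {"deps": ["parse.c"], "tools": ["cc"]},
    "parse.c": {"deps": ["parse.y"], "tools": ["baseruby"]},
    "parse.y": {"deps": [], "tools": []},
}

# Hypothetical plan for a tarball build: parse.c ships pre-generated.
tarball_plan = dict(git_plan)
tarball_plan["parse.c"] = {"deps": [], "tools": []}

if __name__ == "__main__":
    print("git build needs baseruby for:", tasks_using(git_plan, "ruby", "baseruby"))
    print("tarball build needs baseruby for:", tasks_using(tarball_plan, "ruby", "baseruby"))
```

A release check could then fail whenever the tarball plan returns a non-empty list, without running the build at all.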
00:30:55.600
so yeah of course there are still a lot of things to do
00:31:02.960
we need better documentation windows support and a way to handle dynamic dependencies but yeah i
00:31:11.039
believe this approach addresses the fundamental issues we discussed earlier and as a first step i'm working
00:31:17.600
on making it capable of building cruby so okay so today i looked at how the cruby
00:31:26.640
build system has evolved over the years and the challenges we face and i
00:31:33.679
showed a new build system that aims to address these issues but it's still
00:31:39.039
very much a work in progress so yeah so i tried my best to structure my
00:31:45.440
explanation but it might have ended up just dumping my head so if anything
00:31:51.679
wasn't clear or you have any questions please feel free to ask me anything so that's all thank you