Summarized using AI

Could a Machine ever write tests for our code

Loren Segal • November 01, 2012 • Denver, Colorado • Talk

In this talk presented at RubyConf 2012, Loren Segal discusses the potential for machines to automate the writing of tests for Ruby code through formal verification techniques. The session centers around a research project called RubyCorrect, which integrates advanced verification methods into the Ruby programming language, focusing on two primary tools: Ruby ESC (Extended Static Checking) and Ruby Kon (Symbolic Execution).

Key Points:

  • Formal Verification Overview: The speaker introduces formal verification as a methodology that uses mathematical logic and theorem proving to ascertain program correctness, emphasizing its boring yet essential details. The discussed methodologies include static and runtime checks, highlighting Extended Static Checking (ESC) and Symbolic Execution as the focus of the talk.

  • Extended Static Checking (ESC): Segal explains how Ruby ESC translates Ruby methods into logical expressions. It verifies code correctness based on specified preconditions and postconditions, utilizing Boolean algebra. As an application, he walks through a Fibonacci sequence example, demonstrating how Ruby ESC identifies precondition violations when changes are made to the code.

  • Symbolic Execution: In contrast to ESC, which needs established contracts for testing, symbolic execution operates without them — making it more flexible. This method allows the analysis of Ruby code by substituting symbolic values for concrete ones and automatically generating test cases by exploring various execution paths.

  • Demonstrations: Segal provides demonstrations of both Ruby ESC and symbolic execution, showcasing how each tool analyzes and validates code. He generates tests for the Fibonacci sequence using symbolic execution, illustrating its capacity to handle various input scenarios effectively.

  • Limitations and Future Work: The speaker candidly discusses limitations of the current Ruby ESC implementation, including incomplete coverage of Ruby's standard library and challenges with dynamic code. For symbolic execution, he acknowledges the difficulty in statically analyzing certain Ruby features and the reliance on annotations. He suggests future directions for improving these tools, including better type inference, implementing Ruby abstractions in Myra, and writing a dedicated symbolic execution engine for Ruby.

Conclusions:

The key takeaway from the presentation is the feasibility of automating some aspects of test generation through machine learning in Ruby. However, while such technologies can support developers, they are not yet fully reliable and come with inherent limitations. The deeper question remains whether these automated systems can be entirely adopted in practical development environments, particularly in handling dynamic aspects of Ruby code.

Could a Machine ever write tests for our code
Loren Segal • Denver, Colorado • Talk

Date: November 01, 2012
Published: March 19, 2013
Announced: unknown

TDD is a great way to test code, but have you ever wondered if there are ways to leverage the awesome power of computers and help us write better tests? Research in the field of formal verification has shown promising results with tools that analyze programs for logic errors and can even figure out what input values caused those failures. However, until now, none of that research was ever used with Ruby. This talk discusses RubyCorrect, a research project that attempts to apply verification techniques like "extended static checking" and "symbolic execution" to the world of Ruby programs. We look at how these techniques work and how they could potentially improve the kinds of program faults we can detect. Machines that write our tests? So crazy that it just might work!

RubyConf 2012

00:00:15.360 uh so I'm going to be talking about verification techniques in Ruby and asking the question could a machine ever
00:00:22.000 right test for our code so to start I want to ask you guys a question so looking at this picture
00:00:30.039 can you tell if this is a sunrise or a sunset think about that to yourself
00:00:35.440 we're going to come back to that when we later on so my name is Lauren seagull I've
00:00:43.480 been working on a masters in formal verification research at Concordia University of Montreal Canada it's kind
00:00:49.920 of what we're going to be talking about today um I also wrote a documentation tool called yard I don't know if you
00:00:54.960 guys guys have heard about it than
00:01:01.120 um we will be using it in The Talk The Tool actually uses it so we'll be looking at that I'm also on GitHub and
00:01:07.159 Twitter so if you want to talk with me after the talk and chat with me check out my code it's all
00:01:14.720 there so I developed a tool called Ruby correct and I should preface this entire
00:01:19.920 talk with the fact that this is a real fru concept it's not really production ready um but the tool set has uh two
00:01:28.600 tools inside of it or program verification in Ruby the first tool is called Ruby ESC
00:01:34.840 which does extended static checking and the second one is called Ruby Kon which does symbolic execution and uh we'll
00:01:41.560 talk about those to understand what extented stack checking and symbolic execution are we need to understand what
00:01:48.079 how formal verification works and basis of formal verification the problem here is that formal verification is
00:01:55.000 boring there are a lot of details um but we will have to be
00:02:00.759 skipping most of them but not all of them so with that what is formal
00:02:08.399 verification formal verification is basically a set of methodologies using logic and Theorem proofs to verify
00:02:15.640 program correctness in other words we use math or bolean algebra to verify
00:02:21.800 that your program does what you wanted it to uh some of the methodologies include
00:02:28.120 extended tag checking which we'll be looking at symbolic execution which we'll be looking at there are other ones
00:02:33.959 like model checking runtime checking and others uh to give a brief landscape of
00:02:40.360 what the Spectrum looks like so there's a stat we have static verification we have runtime
00:02:46.080 verification um the two ones that we we will be looking at today are ESC and
00:02:51.319 symbolic execution ESC is very much a static verification kind of concept
00:02:57.560 symbolic execution actually sits between static runtime checking and we'll talk about how those two interplay with each
00:03:03.200 other with ESC and what the similarities and differences are with those
00:03:08.560 two so let's start with ESC ESC is is extended stag checking as I just pointed
00:03:14.840 out is static verification for code what we do is we we translate a method into a single logical expression
00:03:21.920 so think boole alra um we can confirm that that method is correct assuming a given precondition
00:03:29.400 that postcondition match matches The Logical steps in that meod or in other
00:03:34.959 words given a set of logical preconditions when I execute a method the result should be equivalent
00:03:40.680 equivalent to my post condition so preconditions and post
00:03:46.080 conditions um as I mentioned they're pretty much logical expressions in the form of ring algebra uh P implies q and
00:03:53.519 R stuff like that we assume that the precondition is correct and this is how we typically
00:04:00.000 pree and post condition we assume that the precondition is correct and assuming that the precondition is is is valid we
00:04:07.840 assert that the post condition will uh equal be equivalent to what our what we set our method should be so for an
00:04:15.120 example of that this is the Fibonacci sequence we can write this as pre- and
00:04:20.880 post conditions and we can have a precondition for Fibonacci FIB n saying
00:04:26.080 that n is greater or equal to zero uh we can also have two post conditions the first post condition says
00:04:32.479 that our results is going to be n if n is less than two and we have a second post condition that says our result will
00:04:39.560 be FIB of nus1 plus FIB of nus 2 if n is greater or equal to two and that those
00:04:44.680 are our pre and plus condition in other words what we're really doing is we're doing design by
00:04:50.360 contract I talked about contracts a lot when I talked about yard um the reason I
00:04:55.840 like contracts is that contracts are effectively specifications and specifications are effectively
00:05:01.600 documentation so I like documentation a lot um and this is typically how you
00:05:07.479 would mark up in yard a specification for a ruby ESC tool you can see that pre
00:05:13.600 and post conditions are specified in pretty much of similar syntax over there and as well as the type annotations to
00:05:19.919 tell it that we're using new numbers um so the good thing about writing this out in documentation is
00:05:26.919 that I also talk about this lot auditing and i' I've talked about this in a bunch of yard talks that I've given in the
00:05:33.479 past um I've always wanted a tool that can read documentation and verify its
00:05:41.319 correctness and so I actually got to be be able to build that tool or through the ESC so I'm just going to jump into a
00:05:48.759 quick little demo here um I'm going to pop up this Fibonacci code right
00:05:55.960 here uh as you can see it's pretty much the same code that we just saw in the
00:06:02.039 slide we can actually execute this through our Ruby ESC tool so we can
00:06:07.759 go and run it through the
00:06:12.960 tool and so it's going to tell us that you know it's right but we can go in and
00:06:18.199 make a change and break it and see what happens so we'll do FIB of n minus 3 instead of f withus 2 that should break
00:06:25.479 it let's see what happens
00:06:33.000 so oh there we go and so it's running through and it's going
00:06:39.479 to I did not press enter there we go try all right so um this time it had four
00:06:46.759 errors one of them is that we violated a precondition so this is something uh I
00:06:53.280 didn't actually noticed when I first made this change but when you actually set P of n minus 3 you're actually
00:06:58.360 violating post condition when Nal 2 so when n equals 2 you're actually passing P of1 which violates a precondition so
00:07:06.000 we violated our precondition we've also violated post condition but that's a little more obvious we're not implementing Trion sequence as
00:07:13.639 specified so that's Ruby ESC in general that's sort of how it works that's
00:07:19.879 that's what it is basically but how does it work that's the question um so what we do for Ruby es is
00:07:27.759 we take Ruby code and we translate it into a language called Boogie and then
00:07:33.360 Boogie takes that code and translate it translates it to a for a the improver to
00:07:38.879 parse and uh perform the real heavy lifting so let's go through these steps
00:07:44.440 one by one and figure out what E1 is first we have Ruby well we all know what
00:07:49.879 ruby is so let's move on um Boi Hogie is an intermediate
00:07:55.520 verification language it's also a tool by the same name so you can run Boogie code using the Boogie
00:08:01.919 language uh it's a Microsoft research project it's probably one of the only Microsoft projects I like um you can try
00:08:09.720 out actually right now if you are bored in the top you go to rhone.com boogy and you can actually type in code and um
00:08:15.840 play around with you know that syntax and play around with verifying pre conditions and post conditions and stuff
00:08:21.639 it's an open source project which is good um like a real open source project
00:08:26.720 and uh it's the syntax is actually very SE like so it's actually a very high level language you can see here it pretty much
00:08:33.719 looks very SE like um the only difference is that we have pre and post
00:08:38.919 conditions specified on the method itself so that's that's kind of what Boogie looks like and now fear
00:08:46.600 improvers uh there are a bunch of fear improvers out there simplify CDC Isabel some of which you might might have heard
00:08:52.720 of if you've done verification before Boogie uses Z3 and so we use Z3
00:08:59.519 it's also a Microsoft research project that was actually recently open sourced I think last month so how do the theorum Pro
00:09:08.600 work well Z3 specifically actually uses a list FL syntax in fact it's pretty
00:09:13.640 much list to express logical statements so it's going to convert that highle c
00:09:18.680 code into a list s expression and then assert that statement we basically
00:09:24.680 assert a bunch of statements and then some magic happens in the the improver it's verifi that all the Expressions
00:09:30.440 that we' asserted are um do not contradict one another and are consistent and if it can prove that then
00:09:36.720 it can tell us that our equation is satisfiable if it can't then it tells us it's not and there is a case where it
00:09:41.800 can tell us if you can't do anything in that case it says unknown so that's a thear
00:09:49.160 improvers so I skipped over a dir little secret before and that is that we use type
00:09:54.640 annotations type annotations are required in through the ESC right now it's
00:09:59.839 kind of a necessity uh the good news is that most of it is an implementation detail we can
00:10:05.320 get rid of a lot of typing with type inference and uh we don't actually support it yet but that's that's
00:10:11.600 possible uh you can't get rid of all type annotations it is static analysis so we do need to tell the our tooling
00:10:17.839 what is they're we're operating on So speaking of
00:10:23.240 annotations I just want to plug yard on one more time um you can grab yard at
00:10:28.720 y.org we we use yard to annotate all the types and contracts as you saw up there if the
00:10:36.000 good thing about this is if you have method documentation that you've already marked up with types you are good to go
00:10:41.480 except for the contract part arguably the harder part but you're halfway
00:10:46.600 there s ends might plug one more sunrise sunset question for
00:10:52.079 you think about that one whether you think that's a sunrise or Sunset it's a little easier I'll give you a little
00:10:57.279 hint it's all about context so moving on Ruby to Boogie
00:11:02.959 translation so I mentioned We're translating Ruby into Boogie and then
00:11:08.600 Boogie is taking nap and doing its own thing with it let's talk about how we translate Ruby into
00:11:14.040 Boogie the first thing we do is we translate method control club and that's
00:11:19.480 sort of like basically translating Ruby syntax into Boogie syntax and that's mostly fairly simple Ruby Ruby has a
00:11:27.240 fairly well we know Ruby has a high language but Boogie also has a fairly high level syntax we can map it fairly
00:11:35.800 equivalently so the only difference here is that we have a procedure call instead of a depth and we have to pass in self
00:11:42.800 as an argument because there is no self or this keyword in B B has no concept of object
00:11:48.720 orientation uh and the only difference is really the way we return return
00:11:54.240 values the second thing we do is we map the object system so boie as I mentioned
00:12:00.120 just now has a different kind of object system than than than Ruby does so we have to define a reference type called
00:12:06.560 value that references all objects we use the the name value because that's what
00:12:11.839 the MRI C Ruby implementation if you've ever looked at a c code that's what that's the name for all the value
00:12:18.800 references the good thing here is that everything's an object in Ruby so pretty much all variables are passed in as
00:12:26.040 values the problem is that Boogie actually requires native types at some point at some point we're operating on
00:12:33.399 some kind of scaler types typically everything in Computing
00:12:39.040 ends up narrowing down to some kind of integer mathematic so the in value is pretty much the most important
00:12:45.639 one so we actually make a special exception for integers and we actually Alias int and value as the same type so
00:12:53.959 an integer is a value and an value is an integer we can get away with this because luckily it's actually the same
00:13:00.360 trick that c Ruby does to implement into fixed nums and and objects so we
00:13:06.040 actually partition the object spaced into integers and and object referen so values integers same
00:13:15.279 thing the next thing we do is we map method calls to procedures this is the hard step this is
00:13:22.360 where we have to perform static analysis we have to resolve the method at translation time which means we need to
00:13:27.399 know the type of the receiver that we're calling the method on once we can figure out the type of the receiver which if we
00:13:34.199 have annotations is pretty easy um we have to perform Ruby's method lookup
00:13:40.240 dispatch kind of code that it does in The Interpreter which is looking up the inheritance chain as well as
00:13:47.120 mixins so that is one of the harder steps we then have to handle lambas and
00:13:53.959 blocks so the way we do this is we convert Lambda blocks into anonymous Boogie procedures and we call them as if
00:14:00.600 they were Anonymous procedures uh the only problem here is that we have to infer extra contracts I
00:14:07.160 will point this out in a bit but every time you have a different method in Boogie it has to have have its own contracts so we have to infer some
00:14:14.720 contracts for the for the block itself and pull that out into a separate
00:14:20.959 method the next thing we do is we handle loops and this is where it gets crazy
00:14:26.480 because Boogie needs invariance to find all Lo structures and that means we as
00:14:32.240 programmers have to Define them because it is a fairly manual process and in variant if you haven't heard of it
00:14:37.680 before is basically an expression that holds true for every single iteration of loop so for every Loop X is always going
00:14:44.120 to be two for that for that and that's an invar there's actually a lot of research
00:14:50.839 going on right now in terms of invari INF looping in in extended stag checking
00:14:56.000 specifically because it's very difficult to write your own variant uh for a loop it's not it's not
00:15:02.320 foral and uh that is how we do loops in in short um the last step that we have
00:15:10.519 to handle is our is what I call Preamble which is where we Define
00:15:15.880 specification for built-in methods so by default Ruby does not have specifications built in for you know Fix
00:15:21.720 N plus and uh and all the other operators that we would need so we we
00:15:27.600 have to manually Define those applications there's there's good pressent in here we can use Ruby spec
00:15:32.720 and we can use tests and stuff to build those but it's still a manual process so it takes quite a bit of
00:15:38.560 fun so that's um that's mostly an overview of how we do R bogie
00:15:44.040 translation to summarize we map control flow we map the object type system we
00:15:49.920 map method dispatch to procedure calls and uh do static analysis for that we
00:15:55.000 handle the lambas to turn them into anonymous methods and um as well as
00:16:00.720 loops and then we have to create the Preamble and create a base set of uh
00:16:06.240 specifications for the buildings so what are some issues and limitations of Ruby
00:16:12.480 ESC working backwards the standard library is not fully covered as I mentioned it's a
00:16:17.959 manual process it takes a long time and a lot of effort to implement specifications for all the methods
00:16:23.079 inside of rby um we currently focus on integers and arrays mostly because
00:16:28.360 that's the kind of stuff we're doing the group concept so we don't really cover strings that well but we could
00:16:33.959 theoretically cover them methods with multiple type
00:16:39.519 signatures so you can have a method that takes an array sometimes in the spring
00:16:44.880 some other times in Ruby ESC in our current implementation we don't actually support that there are ways to work
00:16:51.000 around this though one of them would be this is actually a typing problem this is really just a typing problem with
00:16:56.079 better type inference we could actually solve this problem quite easily uh the other way we can handle this is by having separate overloads and if you do
00:17:02.560 document your methods with multiple overloads and have a specification for each one that's another way to do
00:17:10.160 it then there's eval so eval is really Dynamic stuff that's happening at one
00:17:16.480 time that we cannot figure out statically there's pretty much no way to
00:17:22.120 go around this except for annotations um so we're pretty much stuck on that front it's obviously not
00:17:29.480 ideal but pretty much uh if you have if you have some Dynamic C running using eval you would have to have some kind of
00:17:35.400 annotation that specifies what this eval does fortunately we try to keep eval
00:17:40.880 light so it's easy to find those hot spots and and you know those out if you
00:17:46.200 really need the es got the Apple icon here but I think the bigger problem here
00:17:51.720 is that you can get the your entire C base needs contracts for this and that's a real like have those because of
00:17:58.559 because contracts are not that easy to write we saw the Fibonacci sequence that was easy but that was also an easy
00:18:04.440 method um for a a 10line method that does a bunch of different logical steps and integrates with different systems it
00:18:10.720 become way more difficult and this is generally an ESC problem it's not a problem with our tool necessarily but
00:18:16.679 it's it's mostly an extan stag checking problem the fact that you really get an All or Nothing uh Behavior you don't if
00:18:22.799 you're missing one contract in your entire uh in your entire program your
00:18:27.840 entire program will not be able to be tested so it's not perfect uh Ruby is
00:18:35.000 not perfect but fortunately we have symbolic
00:18:40.080 execution symbolic execution is not perfect either but it does give us some benefits that Ruby DSC doesn't give us
00:18:47.400 the biggest one is that it does not require contracts at all um it can use
00:18:52.520 them but it does not require it that means that if you have a program
00:18:57.559 and you're missing one contract somewhere you will still be able to test your your your program so you don't have
00:19:02.640 that All or Nothing problem that rby that ESC does so let's run through an example of
00:19:10.039 what symbolic execution might actually look like if we were to actually execute
00:19:15.200 our code so just to summarize symbolic execution is really just what it sounds like where we are executing a block of
00:19:22.360 Ruby code symbolically so instead of concrete stat scalar values we substitute everything with a symbolic
00:19:29.480 value and we um we resolve those values after we've executed our
00:19:35.080 code so if we were to run through this method we would start with Y = 5 5x so
00:19:40.880 we assign 5 * X to Y so that would turn Y into 5x we then Branch so when
00:19:47.440 symbolic execution hits a branch it takes both branches automatically and resolves separate uh symbolic value for
00:19:54.080 each branch as a separate state so we split off our Branch into one state and
00:19:59.240 we now have 5x + 2 we then split off our Branch into a second state and we have 5x / X I missed
00:20:08.480 the assignment there on on the left side you that um so we actually end up with
00:20:14.120 two states now at the end and we returned both of those States so we now have two outut
00:20:19.880 states the end result of symbolic execution is basically a set of states for each code path for each ranch and
00:20:25.559 for each loot we have a set of different states each state has a set of logical formulas
00:20:31.200 like Y = 5x or Y = 5xx we then can run these logical
00:20:38.440 formulas through a theor to discover when they are unsatisfiable and our logical theorem
00:20:44.440 our theorem would actually be able to figure out for instance for 5x / X when X is Zer we know that that is
00:20:51.039 unsatisfiable and our the is able to tell us that so how do this really different
00:20:57.840 from ESC where converting them both into logical formulas and passing them to GE improver just doesn't really seem all
00:21:03.679 that different the real difference here is that ESC requires us to know what our
00:21:09.240 result will be beforeand we need to it basically assume it basically verifies that our assertions about our code are
00:21:16.520 correct the difference is that symbolic execution allows us to discover what our
00:21:21.760 result will be not having known what the specification
00:21:27.120 is so how's related to automated testing and more importantly how is symbolic
00:21:33.600 execution different from unit testing why would I just unit test so symbolic execution and unit
00:21:40.480 testing are a little different the difference is that unit testing really only confirms what we've asked it to check if we say you know check Fibonacci
00:21:48.080 when n when n is two and you should get whatever the result is you're only
00:21:53.400 checking for that value symbolic execution actually doesn't use any symbolic concrete values on the first
00:22:00.159 path so it can really run through and check all the values it can resolve a bunch of values and figure out either
00:22:06.039 using puristic or just logic to figure out there are certain values that this function will not work
00:22:12.799 on in other words it's automatically generating test cases for us so it's
00:22:18.720 able to figure out scenarios where our program will succeed or fail so now it's time for another demo
00:22:25.480 this time I'll Ruby Cas gen and I'll drop back into the terminal here and this time we will run the
00:22:32.840 Fibonacci sequ Fibonacci Sequence through the case generation and we will see what it
00:22:44.799 show it's going to take a bit of time but when it comes back it should give us a set of tests so it generated a bunch
00:22:52.080 of tests for us let me just so we have a bunch of tests
00:22:59.159 starting from Nal 0 Nal 2 you notice that it skipped Nal 1 because it was able to detect that n is n one and two
00:23:06.159 are the same case it then went to three and four it went it would have kept going to
00:23:12.679 5 6 7 8 um but the loop bound the loop Bound in the symbolic execution engine
00:23:17.919 was exhausted so the the symbolic execution engine will basically keep trying until it runs out of Stack space
00:23:23.960 or you can you can Define how long it's going to check how how many Loop iterations it could check for or you
00:23:29.120 know how how long it's going to how long it's going to recurse but in this case we have a recursion step of about three
00:23:35.600 or four so it stops at Four um and it's able to generate these test cases for
00:23:42.000 us um so we can actually run these test
00:23:59.000 C that you can see that our test pass so
00:24:07.000 that's Ruby case gen so how does Ruby case gen work well
00:24:13.600 we actually translate type annotative Ruby into Myra and some of you have
00:24:18.799 heard of Myra before but I'll talk about it in a bit we then use KSN as our symbolic
00:24:24.360 execution engine KSN uh will also po it in just a second
00:24:29.720 our symbolic execution basically as I just described will resolve all the values in our Sy in our symbolic all of
00:24:36.000 our symbolic values into concrete values depending on puristic as well as the case as well as whether or not our the
00:24:43.320 theorem logic actually says that there's a case here that will fail or or or might passs in a different way than the
00:24:50.320 others so we end up with test cases generated as data and then the only thing we have left to do is really
00:24:55.360 convert them into executable un test
00:25:01.919 so Myra is a statically typed Ruby light language it was created by Charlie
00:25:07.440 Nutter who is the J Ruby guy everybody knows him here um it basically compiles
00:25:14.480 Ruby into straight Java code so we have it converts it converts
00:25:22.360 the difference the only difference between Ruby and Myra is really the fact that there was type annotations on
00:25:27.760 methods everything else is done with type imprints so we actually do get type in prints for Ruby Cas
00:25:34.240 SP um so so that's myun PSN is part of the
00:25:41.159 serum framework and Serum is just a a framework for verification using
00:25:46.679 Java and uh it's created by the Santos Lab at Kansas State University and it's open
00:25:51.919 source and that's the symbolic execution engine that we're using so you might be
00:25:58.039 thinking now we're testing Java code wait what um the answer is only kind of
00:26:04.480 we're sort of testing Java code at the end of the day we're still writing Ruby code at the top I mean for Ruby ESC we
00:26:11.600 convert we convert Ruby into Boogie code so we're not really testing code but we
00:26:17.240 use Java code in this in Ruby Cas gen to generate our static
00:26:25.440 code so how does KSN know which values to you that's another question that that
00:26:30.720 might be useful so here's another sunrise sunset
00:26:36.360 question this one you this time you get a better hint that's East so now it should be fairly obvious what what it
00:26:43.880 is so as I mentioned it's all about context right if we provide type annotations so
00:26:52.600 KSN it knows what types operate on that's step one we can then provide optional contracts to tell KSM exactly
00:26:59.600 what boundaries we want to test in so we have extra context for KSM to actually
00:27:04.840 figure out you know should I test negative one on a fonop sequence does that even make
00:27:11.200 sense so we can specify optional constraints for instance if we had a progress bar that really only makes
00:27:16.399 sense between 0 and 100 we can specify these preconditions and tell kosan you
00:27:21.799 know don't test this between below zero or above 100 it's not going to make sense
00:27:31.159 but I still didn't answer the real question here that you all came for could a machine ever write test for our
00:27:37.600 code the answer is kind of simple we just saw it yes so we just saw R case
00:27:43.360 gen generating test cases for a code the only problem is this the wrong question right the better question is will
00:27:50.080 automatic test case generation always work for us should we just adopt this feature and start writing you know using symbolic execution all the time in our
00:27:56.600 tests and our rails code the answer for that is not always right
00:28:02.480 there are some scenarios where we just cannot handle certain features of Ruby where they're way too Dynamic for us and
00:28:10.360 so that brings us into limitations so one of the limitations of
00:28:15.919 Ruby Ken is that we do convert Ruby into using my we can therefore only support what
00:28:23.320 Myra supports and Myra does not have full coverage Ruby it's only a ruby like language it's not a ruby
00:28:30.120 language Java also doesn't work exactly like Ruby um integer values don't work the same way um there are other
00:28:37.039 differences minor differences fortunately for us most of the differences are fairly minor it's actually surprising how close they match
00:28:44.519 um the other issue and the bigger issue is that we still don't properly handle EV valid code and we still have to rely
00:28:51.080 on annotations for this kind of stuff annotations are good but it's not
00:28:56.200 really how we want to have to write Ry code and at the end of the day there are things that we just will need to
00:29:01.279 annotate that are way too Dynamic for a static analysis tool to
00:29:08.120 handle so how can we improve a ruby Cas version two if we were to do
00:29:16.440 that there are a couple things we can do to make Ras more effective the first
00:29:23.480 thing we can do is teach KSM to symbolically execute Ruby code
00:29:28.880 So currently PSN really only understands Java and that's sort of why we took the shortcut through Myra if we had access to VM instructions
00:29:37.240 in Ruby 2 or currently uh The ruby9 Interpreter doesn't really give you
00:29:42.279 access to the B code or the VM at at a very low level if we had access to that kind of
00:29:48.279 stuff it would be much easier to write in to write a symbolic executor because really what you're doing in a symbolic
00:29:53.760 execution engine is effectively inter uh running the B code at of level that's really where we want to be so if we had
00:30:01.360 that kind of access it would definitely make it easier for us to write a KS on interpret of
00:30:07.279 that the other thing we could do is Implement Ruby abstractions in Myra so
00:30:12.440 instead of making make instead of making ksan understand Ruby we can make Myra
00:30:18.120 understand Ruby and currently as I mentioned Myra really just compiles your Ruby down Java code so if you're using a
00:30:24.279 string in Ruby you're going to get a string in java. line string native string type we can build abstractions on
00:30:31.960 top of that and pull them in into Myra and have Myra compile us compile us a program that uses Ruby abstractions and
00:30:38.960 Ruby standard library and at that point we can actually have better matching to Ruby and uh Ruby's
00:30:46.960 features we can also forget about kson entirely and write our own symbolic execution engine there have there has
00:30:53.039 been some work done in static analysis with Ruby in terms of writing Med
00:30:58.360 representation languages so laser is a good one that good example of that but
00:31:03.600 writing a symbolic execution engine is Pretty Tough so that's that's a really
00:31:08.639 tough path to go down finally there's lvm CLE C is a
00:31:15.039 another symbolic execution engine that runs on top of lvm B code it interprets lvm B code and executes that so we have
00:31:24.320 an LM implementation of Ruby it's called rubinius and and actually reinius would be a very good place to try out this
00:31:30.799 kind of thing I these are the ideas that I came up with as as future steps for the for
00:31:38.000 the project I'm open to ideas I I'm sure I didn't think of everything if you just
00:31:43.919 thought of something now if you've been thinking about this for a while um please come up to me we can talk I'd be
00:31:50.080 interested to hear about how we do do things better for that there's some more details here if
00:31:56.600 you really want to know about exact how uh we translate Ruby into Boogie and what what are all the real technical
00:32:02.600 issues we deal deal with in terms of lambas and looping all those crazy things you can actually read I have two
00:32:08.799 papers published on that subject so you can grab those I think they're on Google schol both of
00:32:14.440 them and with that I want to thank you
00:32:26.519 guys any
00:32:32.159 so um how would you see this fitting into say uh web develop as
00:32:39.399 as so ideally this would be something where you I mean it would be really no
00:32:45.600 difference from integrating with a non-web developer workflow the only difference is there are high level
00:32:51.440 higher level abstractions to deal with and that your symbolic execution be able to handle it's more of an implementation
00:32:58.279 issue of how how long will it take us to get to the point where we can actually
00:33:03.880 symbolically execute something like rails and that's sort of where it stands so at the end of the day it's really the
00:33:10.000 same kind of workflow as non Ruby as non as regular Ruby code but it it probably
00:33:15.760 will be a while before you actually see execution working on Rails it is it is
00:33:21.840 pretty heavy step um would it ever be the case that
00:33:27.159 you would want to just run execution just across your full Lab and not have
00:33:32.600 it translate to test that you run separately that ex those test yeah so
00:33:37.880 one of the things we explicitly did was to trans translate those test cases that were actually just really award XML
00:33:44.480 documents Ruby code we could have just converted them into like a easier readable data format
00:33:50.360 like based on our pictures uh but we chose to go through to Ruby because you know just it looks
00:33:56.480 easier to read as as really executable coding there's less um there's less uh
00:34:01.919 test framework stuff that you need to do to code up the picture um so we just instead of having a two-part test
00:34:08.119 framework test framework and a tool that generates the data for the test framework we just generated the test
00:34:13.399 directly but that's really just a side effect of the fact really improve
00:34:18.839 concept yeah have you using
00:34:24.079 the I I've only I I heard MV last year I hadn't had a chance to take a look at it
00:34:31.000 um I would be interested that would be actually a good thing to look at so the answer is no I haven't really had a
00:34:37.520 chance to take a look at that but actually that that is one of that is a possible
00:34:45.000 yeah was sorry was why did you choose to go with first
00:34:54.040 in right so the question was why did uh we go to Myra instead of uh instead of
00:35:00.400 virginius first so uh we were I was actually working with with serum as a
00:35:06.119 for re from other research issues so I mean we were we were actually using serum in general for for the project um
00:35:15.320 so that was sort of the the guiding principle that I knew more about the serum framework than I did about
00:35:20.920 lbn um I had seen demos on C and I'm not I'm not sure exactly how how it stand up
00:35:28.079 to to ksan ksan does have specific optimizations that that makes extion
00:35:33.160 like way more possible and I don't know if it has those things so those those
00:35:38.560 are things we have to investigate um that was really the bigger issue I
00:35:44.280 think we mostly had a Java stack behind us so it was was easier to
00:35:50.079 find question all right well thank you very much guys
Explore all talks recorded at RubyConf 2012
+46