Interface Testing Creating a Diablo 3 bot using JRuby and Sikuli UI


Summarized using AI

Rodrigo Franco • November 01, 2012 • Denver, Colorado • Talk

In his talk at RubyConf 2012, Rodrigo Franco introduces the concept of automated interface testing through the creation of a bot for the online RPG Diablo 3. He begins by discussing the challenges of interface testing, emphasizing that it is often a manual process, and outlines his exploration of different tools to automate these tests. Franco specifically finds the Sikuli framework useful for automation, as it allows interaction with user interface elements based on image recognition.

The key points of the talk include:

  • Introduction to Interface Testing: Franco briefly explains the nature of interface testing and the typical methods, such as unit testing and integration testing, used in the Ruby community.
  • Utilization of JRuby and Sikuli: He describes how he drives the Sikuli library from JRuby so that he can script the game's interface interactions in Ruby instead of writing Java.
  • Diablo 3 Bot Creation Process: Franco walks through specific steps to create a bot that runs in the game:
    • Initializing the bot and defining key functions.
    • Clicking fixed points in the game.
    • Automating character movements and interactions, like entering and navigating through specific locations (e.g., finding the cellar).
    • Logging the bot's actions and performance to track the effectiveness of the runs.
  • Gameplay Mechanics Implemented: He discusses key game concepts like "Diablo runs" and character mechanics that the bot mimics, such as attacking enemies and collecting gold.
  • Cautions Against Bot Use: Franco warns about the potential for account bans if players use the bot dishonestly, emphasizing the need for ethical considerations in automation.

Franco concludes by reflecting on his experience with interface testing and encourages attendees to experiment with automation, while also cautioning against unwanted consequences such as account bans or disruptions in gameplay. The practical approach combined with amusing anecdotes makes the presentation engaging and informative, showcasing the fun in tackling technical challenges.

Date: November 01, 2012
Published: March 19, 2013
Announced: unknown

Interface testing is boring. Making sure an image is always there when the page has loaded, and also ensuring it's positioned in the right place, is really boring. A lot of bugs are not found because we don't have an easy way to find these little devils.

In this talk, Rodrigo Franco will guide you through the creation of a custom-tailored tool, created to solve a specific interface problem that thwarts millions of people around the world: finding the most efficient way to fill your pockets with virtual gold in the online RPG Diablo 3.

RubyConf 2012

00:01:10.159 so this gameplay you guys saw in this video it was not a person
00:01:16.240 playing it was a bot made in jruby and over my talk i'm gonna tell you how
00:01:22.960 i created that and guide all of you through the concepts i used
00:01:29.119 this talk real name is automated interface testing it's kind of fancy but the real subject
00:01:35.040 is diablo 3 so don't worry and my name is rodrigo i'm also known as
00:01:41.040 caffo and the first thing i want to tell you is that i apologize that my english
00:01:46.479 sucks the reason that my english sucks is that i learned it by myself in brazil with no
00:01:53.040 formal training so if you find something hard to understand just shout and i will try to make my
00:01:58.960 best to be understood okay i work for livingsocial i work remotely from brazil with a couple other
00:02:06.079 developers and i live in the sunny city of rio de
00:02:11.120 janeiro like our amazing friend tenderlove i have three cats
00:02:17.840 this one is this one in the bottom she pees on everything so she's in the bottom
00:02:25.120 and so just for you guys know in brazil we don't speak spanish we speak brazilian
00:02:30.560 portuguese it's a bit different just a bit brazil is also known for its amazing
00:02:36.879 coffee we have tons of really good coffee and also capybaras i took this picture
00:02:43.040 in the front of my condo they were just there like walking by and people was just like oh my god they are so nice
00:02:49.519 there are some small ones too but i was too afraid of the capybara killing me so
00:02:55.519 and in brazil too a small group of people in brazil eat
00:03:02.080 pizza with ketchup but i don't know that and i have done a couple talks in my
00:03:09.120 past most of them in portuguese and i i know myself i usually talk really fast
00:03:15.599 because i'm nervous so i added a couple slides to this presentation that they are not for you
00:03:21.120 they are for me they are to help me do things a bit slow and when you see these slides
00:03:28.080 through the presentation they are for me just for me to slow down a bit
00:03:33.440 okay all the announcements made let's start with what's an interface test
00:03:39.280 okay and looking on wikipedia we can see that interface testing is the process of
00:03:45.440 testing a product's interface to ensure it meets the written specifications it's kind of fancy
00:03:51.760 and in our ruby world we usually do two kinds of different tests
00:03:57.760 mostly unit testing and integration testing and we have a myriad of tools that we use to
00:04:05.040 make that like test unit rspec cucumber but in the end of
00:04:11.040 the day interfaces are really complex so we have a couple attempts to
00:04:18.320 help us to test interfaces like capybara it helps you
00:04:23.840 make shiny things with forms or selenium where you can drive your computer
00:04:29.919 to do the test for you but interface test is mostly manual
00:04:35.360 because you need to be able to see the interface to interact with it
00:04:40.400 to be able to perform a good test and i tried my best to find good tools
00:04:46.960 to find a better solution to help me automate these tests and i think the best one after a long
00:04:54.720 time trying to find it was sikuli sikuli is i think it's an mit project
00:05:01.919 and it's very powerful and very different from what we have right now in
00:05:07.120 our ruby world for example that's the classic osx
00:05:12.240 trash can if you want to clean clean up the trash what do you need to do you need to click on the trash can
00:05:19.120 get to this fancy trashcan window click on empty again and confirm
00:05:25.120 how would you do that today with watir a ruby script applescript
00:05:31.600 what are you gonna use to do that in the end of the day you just want to empty the can it's the only thing you
00:05:38.400 want to do and what you see above this amazing catchphrase
00:05:43.600 is a sikuli script where you can cut a screenshot of where you want to click
00:05:51.199 and just pass it as an argument to the click method so the good thing about sikuli is that
00:05:59.680 it can see sikuli can see what's in your screen and it can interact with that for example
00:06:07.360 in my job i used sikuli to automate lots of mobile testing
00:06:13.120 through the emulators so we have dozens of different emulators and you can just script like click on this button type
00:06:20.479 the login and if you see something different than that go to the next step
00:06:26.479 i also used sikuli on facebook games my fault and also i made a security project where
00:06:34.400 i had a webcam on my sofa so when my cat wanted to get
00:06:39.759 to it to pee on it it would react to the presence of the cat and sound an alarm
00:06:47.759 worked well and uh do not not get mad at her she's really cute as you can see
00:06:52.880 there with couple stuffed animals on top of it she don't really really care about what you put on top of her
00:07:00.479 she's just like lay down there as you can see my cats are really tame like we wanted to make a french version
00:07:06.160 of my cat but looks like jamaican and that's like a usual day of work in
00:07:11.840 my desk yeah sorry let's get back to the topic
00:07:16.960 so the point is to use sikuli as a scripting interface not using the
00:07:25.280 the language that was created for it i would need to use java and i have done my share of java in
00:07:31.599 the past i don't want to get back to it so i tried to find a way to incorporate
00:07:37.840 it to my ruby scripts and the path to do it was jruby
00:07:44.720 so the real title of this talk is creating a diablo 3 bot and pwning your friends by
00:07:51.599 showing off your superior intellect and amazing code skills so and i'm going to show you how to
00:07:57.759 accomplish that with sikuli and jruby one important thing is
00:08:04.560 don't do with your account what i'm saying to do you can get banned
00:08:10.160 so you can lose your precious characters and the precious loot but uh yeah i have never done a bot
00:08:17.919 in my life before all i'm saying here i never tested you never saw the video so
00:08:24.319 we can talk about it uh i'm gonna get a bit into uh some concepts uh about the
00:08:32.320 how diablo games work and one of them is like a diablo run
00:08:38.560 what's a diablo run is when you get a specific mission a specific quest in the game and uh you keep doing it over and
00:08:45.600 over getting all the loot they are usually like small uh maps that you just go through killing all the
00:08:52.000 mobs all the monsters getting the loots the items and the gold that they drop in the floor and you restart
00:08:59.200 one of the most famous runs for uh for bots is the cellar run because it's a
00:09:05.360 really simple uh scenario that the bot needs to navigate
00:09:10.560 and you are more interested in this guy sarkoth that's in the end of the run so you kill him there's a lot of gold in
00:09:17.279 the floor you get the gold and you're happy so so to make it easier to understand uh
00:09:24.080 i made a couple drawings here so let's say you have a hero
00:09:30.160 and the hero needs to get to a specific point and in that point there's a
00:09:35.360 cellar door and the cellar door can be opened or can be closed and your bot needs to
00:09:40.880 know how to deal with that when you enter these
00:09:47.040 this path you have a corridor and a big room in the big room there's a huge monster
00:09:53.519 you need to go through it enter the big room kill the monster
00:09:58.800 and get the gold from the floor thank you so
00:10:04.399 in sum you have a hero okay
00:10:09.519 you kill a monster you get the gold from the floor and you raise it
00:10:15.839 sorry it's exit not saída saída is exit in portuguese and you keep doing that
00:10:21.920 over and over really cool so what's in
00:10:27.200 what's in for today i'm gonna guide you through these nine steps like initializing the bot
00:10:33.920 clicking on fixed points how to move the character how to find the cellar restart the run when something's gone
00:10:40.240 wrong creating the logs entering the cellar attacking the demons and getting the
00:10:45.279 gold so okay let's start with initializing uh
00:10:50.640 the bot uh i'm not gonna put the entire script here i'm just gonna show parts of it and
00:10:57.200 in the end i'm gonna give you guys the entire script but i'm going to guide you
00:11:03.440 guys through all of it so require rubygems require java classic stuff
00:11:08.880 for uh a jruby script and we require java so we can add
00:11:14.800 the sikuli libraries to the script
00:11:20.480 you can download the install package on sikuli.org and i moved the application
00:11:26.480 to my home directory to make it easier to deal with that otherwise you need to get the complete
00:11:31.760 path from your application and it's kind of lame and from there you require the jar file
00:11:40.560 and import all these magic classes to your script uh each of these classes
00:11:46.560 deals with a part of the sikuli environment like a region of your
00:11:51.600 screen or your entire screen settings where you can configure stuff and the events is where you act on top of the
00:11:59.040 screen or the region and the script by itself is the main unit that hold everything
00:12:06.480 and then you include java so all these classes can be added to your script
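
The setup described above can be sketched roughly as follows. This is an illustration under assumptions, not the speaker's script: the jar location is read from a hypothetical `SIKULI_JAR` environment variable because the talk does not give the exact path, and the class names are the ones that match the five roles listed in the talk (region, screen, settings, events, script) as they were packaged in Sikuli at the time. The guard lets the file load harmlessly under plain Ruby or when the jar is missing.

```ruby
# Sketch only. SIKULI_JAR is a hypothetical setting; point it at the
# sikuli-script.jar inside your own Sikuli installation.
SIKULI_JAR = ENV.fetch("SIKULI_JAR", "sikuli-script.jar")

# The five classes the talk mentions: region, screen, settings, events, script.
SIKULI_CLASSES = %w[Region Screen Settings SikuliEvent SikuliScript].freeze

if RUBY_PLATFORM == "java" && File.exist?(SIKULI_JAR)
  # Running on JRuby with the jar present: load Java support, then the jar,
  # then import each class so it can be used like a Ruby constant.
  require "java"
  require File.expand_path(SIKULI_JAR)
  SIKULI_CLASSES.each { |name| java_import "org.sikuli.script.#{name}" }
  SIKULI_LOADED = true
else
  # Plain Ruby, or no jar found: nothing is loaded.
  SIKULI_LOADED = false
end

puts "sikuli loaded: #{SIKULI_LOADED}"
```

Keeping the path outside the script avoids hard-coding the "complete path from your application" that the speaker calls lame.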
00:12:12.240 cool besides that that's the basic diablo bot class it's really simple it's
00:12:20.880 i just initialize a screen object a sikuli object
00:12:26.000 and i have this image path helper that will make it easier for us to find
00:12:32.079 the images on the go and that's the basic the basic bots is
00:12:39.279 the basic thing you can get but the first thing the bot needs to do
00:12:44.639 is like click on a fixed point because when you click somewhere the character gonna walk there and try
00:12:52.240 to and be guided by your mouse click so the first way to do it is like
00:12:58.240 you can use the screen object there and just pass a location and this location is uh
00:13:05.600 x and y grid where zero zero is the top of the screen the top
00:13:10.639 left and you can just find out where you want to click by passing these
00:13:15.920 fancy numbers yeah and so our script the first version of our script we just
00:13:22.240 like switch to the diablo 3 app and click on a specific point that's going to make the bot runs
00:13:29.040 and we can add these two instructions to the seller run method that will be our
00:13:34.320 main method to make the bot works so let's see how that would run
00:13:45.519 so it's really simple it just switches to diablo and clicks on a specific point
00:13:51.600 obviously this point is specific to this screen resolution so if you want to run the bot in a different resolution we
00:13:57.839 need to point to a different point but there is no magic there
00:14:04.240 and from that we need to make the bot continue walking towards the cellar and to do
00:14:10.320 that we can use auto movement on diablo you can define a key that when
00:14:16.000 you press it and the character will start walking to the cursor position wherever it is and
00:14:23.279 when you press it again it will stop and using that i define the x key for it
00:14:29.760 and created this method basically to press the x key down
00:14:36.399 uh wait for the specific delay defaults half second and
00:14:43.440 release the key so it's really simple with that i was able to make also a
00:14:49.120 multi-movement method really simple package
00:14:54.240 lame and with that you can say like how many
00:14:59.440 how many times you want to to move the bot so we can do stuff like that and uh say
00:15:06.079 like after you click on this point keep moving by seven times that what we would get
00:15:16.079 it's gonna click and gonna continue walking for seven times
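
A minimal sketch of the click-then-walk step just described. `RecordingScreen` is a stand-in invented for this example so the code runs without the game or Sikuli; it only records calls with the same names a Sikuli screen object offers (`click`, `keyDown`, `keyUp`). The class layout, method names and coordinates are assumptions; only the "x" key and the half-second default come from the talk.

```ruby
# Stand-in for a Sikuli screen: it records calls instead of touching the UI.
class RecordingScreen
  attr_reader :calls

  def initialize
    @calls = []
  end

  def click(point)
    @calls << [:click, point]
  end

  def keyDown(key)
    @calls << [:keyDown, key]
  end

  def keyUp(key)
    @calls << [:keyUp, key]
  end
end

class DiabloBot
  MOVE_KEY = "x" # the key the speaker bound to "move to cursor"

  def initialize(screen)
    @screen = screen
  end

  # Hold the move key for `delay` seconds, then release it.
  def move(delay = 0.5)
    @screen.keyDown(MOVE_KEY)
    sleep(delay)
    @screen.keyUp(MOVE_KEY)
  end

  # Repeat the movement `times` times.
  def multi_move(times, delay = 0.5)
    times.times { move(delay) }
  end

  # Click a fixed point, then keep walking toward the cursor.
  def walk_to(point, steps, delay = 0.5)
    @screen.click(point)
    multi_move(steps, delay)
  end
end

screen = RecordingScreen.new
bot = DiabloBot.new(screen)
bot.walk_to([300, 200], 7, 0) # made-up coordinates; delay 0 keeps the demo fast
puts screen.calls.length      # prints 15: one click plus seven press/release pairs
```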
00:15:22.800 and as you can see there there's already like a closed cellar door there so it's really simple
00:15:30.160 thank you thank you cat so as you guys can see here
00:15:36.560 i'm playing again the same movie we need to get a bit further down the
00:15:41.600 screen to make it close to the cellar
00:15:47.199 you see we are kind of far from it so what you're going to do to leave it in a
00:15:52.320 better spot is click on the bottom of the screen and move it three times so we just add that
00:15:58.639 to the cellar run method and that's what happened
00:16:10.560 yeah you can see the cellar is open at that so it's a different situation but you are in a way better spot
00:16:16.560 to deal with the cellar detection really cool
00:16:22.880 now i'm going to talk about how we can find the cellar and make yourself into it
00:16:29.839 uh i have this method here cellar open and it basically checks the screen
00:16:35.600 for that specific image if sikuli can find something that looks like the open door
00:16:43.199 of the cellar it's going to return true otherwise it's going to return false i also created another helper method
00:16:50.480 called cellar closed and it's basically the same thing it tries to find the wooden the closed
00:16:56.959 wooden door with that i can have this really simple uh logic here if cellar is open i
00:17:04.079 call a method called enter cellar otherwise i reset the run
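
The open/closed check and the branch on it might look like the sketch below. `ScriptedScreen` is a stand-in made up for this example: its `exists` method answers from a fixed list of image names instead of comparing pixels, mimicking the "match or nothing" shape of Sikuli's `exists(target, timeout)`. The image file names and helper names are hypothetical.

```ruby
# Stand-in for a Sikuli screen: `exists` answers from a list of "visible"
# image names rather than looking at the real screen.
class ScriptedScreen
  def initialize(visible_images)
    @visible_images = visible_images
  end

  # Returns a truthy match when the image is "on screen", nil otherwise.
  def exists(image, _timeout = 0)
    @visible_images.include?(image) ? image : nil
  end
end

# Hypothetical screenshot file names cut from a game run.
CELLAR_OPEN_IMAGE = "cellar_open.png"
CELLAR_CLOSED_IMAGE = "cellar_closed.png"

def cellar_open?(screen)
  !screen.exists(CELLAR_OPEN_IMAGE, 1).nil?
end

def cellar_closed?(screen)
  !screen.exists(CELLAR_CLOSED_IMAGE, 1).nil?
end

# Enter when the open door is visible, otherwise reset the run.
def next_action(screen)
  cellar_open?(screen) ? :enter_cellar : :reset_run
end

puts next_action(ScriptedScreen.new([CELLAR_OPEN_IMAGE]))   # prints enter_cellar
puts next_action(ScriptedScreen.new([CELLAR_CLOSED_IMAGE])) # prints reset_run
```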
00:17:09.360 and it's important to say that these two images are basically
00:17:14.400 basically screenshots of the game they are not special by any means i just cut
00:17:19.600 them up from uh a game run save them as jpegs or pngs or
00:17:25.839 whatever in the script directory and sikuli can compare the current screen
00:17:32.720 with these images so it's really simple and it's also very powerful because it can
00:17:39.360 find some deviation on the images and even with that
00:17:44.880 recognize stuff really well so we now need to create where is it
00:17:53.039 okay we now need to create two different methods one for the character to enter the cellar and the other one to restart
00:18:00.000 the run so let's see how we would restart the run
00:18:05.919 and i'm going to show you guys how i would do a manual execution of a restart
00:18:13.039 you press esc click on exit wait 10 seconds
00:18:18.160 it's going to take a bit more seconds
00:18:25.919 not fun yeah so and then you get back to the login screen
00:18:31.200 enter the game again wait for the load and you get back so basically you have to
00:18:38.080 press esc click at leave game wait for a bit click at resume game and wait a
00:18:44.880 bit more and how can we accomplish that using
00:18:50.480 sikuli let's see we can press the esc key click on a
00:18:57.120 coordinate just to make it different otherwise you can just like cut this image like i done here and click on it
00:19:04.240 wait for 30 seconds a bit more than 10 because your computer may have some
00:19:09.360 latency click on resume wait a bit this other value that you need to wait is based on
00:19:16.000 the computer configuration and we can invoke the cellar run again
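
The restart sequence could be sketched like this. Everything runs against a recording stand-in rather than the real game: `ESC` substitutes for Sikuli's escape-key constant, and the coordinate, the image name and the second wait are placeholders. Only the order of the steps and the 30 second wait come from the talk.

```ruby
ESC = "\e" # stand-in for Sikuli's escape-key constant

# Stand-in for a Sikuli screen: records calls instead of driving the UI.
class RecordingScreen
  attr_reader :calls

  def initialize
    @calls = []
  end

  def keyDown(key)
    @calls << [:keyDown, key]
  end

  def keyUp(key)
    @calls << [:keyUp, key]
  end

  # Accepts a coordinate or an image name, like the click described in the talk.
  def click(target, _modifiers = 0)
    @calls << [:click, target]
  end
end

# Press and release a single key.
def press(screen, key, delay = 0.5)
  screen.keyDown(key)
  sleep(delay)
  screen.keyUp(key)
end

# Leave the game, wait, resume, wait again. Returns :ready so the caller
# can start the cellar run again.
def restart_run(screen, leave_wait: 30, resume_wait: 10, key_delay: 0.5)
  press(screen, ESC, key_delay)
  screen.click([640, 400])            # hypothetical "leave game" coordinate
  sleep(leave_wait)                   # the talk waits 30 seconds here
  screen.click("resume_game.png", 0)  # hypothetical screenshot of the button
  sleep(resume_wait)                  # depends on the machine; 10 is a guess
  :ready
end

screen = RecordingScreen.new
restart_run(screen, leave_wait: 0, resume_wait: 0, key_delay: 0)
puts screen.calls.length # prints 4
```

Passing the waits as keyword arguments keeps the machine-specific timings in one place, which matters because the speaker notes they depend on the computer.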
00:19:21.760 based on what we saw until now there's three different methods that we never
00:19:27.039 implemented so let's go to it and see what we can get
00:19:34.480 we can create this press method is basically what we do for the movement but we pass
00:19:40.480 a key option and we can then
00:19:45.520 pass the esc key constant that
00:19:50.640 is provided for us really simple
00:19:55.679 and the screen click method also accepts a location instead of an
00:20:02.080 image so we can just pass a specific location and it would move the mouse to
00:20:07.120 this position click on it and that's it
00:20:12.880 we also have this fancy click that always expects a pattern like
00:20:19.200 an image and it will wait it will run synchronously so it will
00:20:25.679 wait for the image to appear on the screen and when
00:20:31.120 if it finds it it's gonna click click on it and passing zero in the end because i
00:20:36.640 don't i don't want to i don't care if it returns something i just
00:20:41.919 want to make the click happen if it works and so it's really simple uh so with that i'm we we have a
00:20:50.320 a really simple bot that walk walks to the right path and can reset the run
00:20:55.440 uh after that i found like i need to find a way to log the execution of my script like how many
00:21:02.559 runs i have made how many times i have restarted the runs and uh i think the easiest way to do it
00:21:09.840 was using openstruct it's really simple it's powerful and it's in the basic language so
00:21:18.240 i used it on the script basically i have this fancy object here where i log
00:21:25.360 the runs the number of times the cellar is closed the cellar is open or when i can't find
00:21:31.440 the cellar by any reason and i added that to the cellar open
00:21:37.919 if method so if the cellar is open i add one to open and enter the
00:21:43.120 cellar otherwise i add one to closed and restart the run
00:21:48.320 and if i can't find a cellar by any way it's not open it's not closed something's wrong
00:21:53.760 i add one to not found and restart the run and
00:21:59.520 that's an example of the openstruct object of 220 runs i got 90 closed
00:22:07.679 cellars and some open
00:22:13.360 not much i never made that
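
A small sketch of the run log described above, using `OpenStruct` from Ruby's standard library. The field names (`runs`, `open`, `closed`, `not_found`) and the helper names are assumptions chosen to mirror the counters the speaker lists.

```ruby
require "ostruct"

# One counter per outcome, plus the total number of runs.
def new_stats
  OpenStruct.new(runs: 0, open: 0, closed: 0, not_found: 0)
end

# `state` is one of :open, :closed or :not_found.
def record_run(stats, state)
  stats.runs += 1
  stats[state] += 1
  stats
end

stats = new_stats
[:closed, :open, :not_found, :closed].each { |state| record_run(stats, state) }
puts stats.runs   # prints 4
puts stats.closed # prints 2
```

`OpenStruct` suits a throwaway script like this because counters can be added without declaring a class first.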
00:22:19.440 it's it's all from my mind so let's say that in my mind i have a cellar and i need to enter it now
00:22:26.000 right okay so when you enter the cellar you get into
00:22:32.000 this specific screen and i need to find a way to know that my
00:22:38.240 character is inside the cellar how can i do that
00:22:43.679 and so the first thing is i need to create a method to enter the cellar and this
00:22:49.280 method is really simple i click on a coordinate to enter the
00:22:54.960 cellar when i'm down there and i know the cellar is open and uh these coordinates it's basically the opened
00:23:01.840 cellar and i wait for two seconds so the cellar can be loaded
00:23:08.480 after that i try to find on the screen this small lamp as you can see here
00:23:14.640 inside the cellar there is this lamp that's not available outside of it so if i can find it i'm inside the cellar
00:23:22.159 yay otherwise i need to restart the run something wrong happened
00:23:27.280 so yeah that's the shiny lamp after that i need to click on the bottom of the
00:23:33.200 screen so i can move my character to the right position to attack the evil demon
00:23:39.440 sarkoth and how i do that i just calculate the position and wait for a bit all of that
00:23:46.559 depends on your resolution then you do need to do a lot of testing like where i'm going to click and uh
00:23:52.480 the size of the window so we need to make sure everything's perfect for your resolution and your bot it's it's not
00:23:59.360 like you're going to get my script and run and everything will be perfect it will not and you can get banned don't do that
00:24:05.520 don't do it okay so yeah and you get into this specific position and you can click
00:24:13.520 on the mob until he's dead now we need to attack the demon how how
00:24:19.200 can we do that how can we attack the demon on diablo you have this option where you
00:24:24.480 press the shift key and since you are a ranged character you
00:24:29.679 keep attacking from there keep attacking all the time until you stop pressing the
00:24:34.720 mouse button and you release the shift key so basically what you need to do is you
00:24:40.240 need to position your character in the right place activate the ranged attack
00:24:45.520 wait for enough time deactivate the attack and collect the money and how can we do that
00:24:52.400 basically all i showed you guys it's probably easy for you to understand
00:24:57.760 you just need to press shift and then you trigger the mouse button and keep
00:25:04.480 attacking you wait for a bit then release shift and release the mouse button it's really simple you just need
00:25:11.600 to calculate how much time your character will take to kill the monsters
00:25:16.799 and you need to use a specific skill that can kill more than one
00:25:22.080 monster at the same time like and in the video i use a demon hunter a slingshot
00:25:27.360 that can take a lot of different monsters at the same time so if you just wait there you're gonna kill everyone
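
The attack step can be sketched as a hold-and-release sequence. As before, `RecordingScreen` is a stand-in that only logs calls; `SHIFT` and `LEFT_BUTTON` are placeholders for the corresponding Sikuli constants, and the attack duration is something you would have to measure for your own character.

```ruby
SHIFT = :shift      # stand-in for Sikuli's shift-key constant
LEFT_BUTTON = :left # stand-in for Sikuli's left-mouse-button constant

# Stand-in for a Sikuli screen: records calls instead of driving the UI.
class RecordingScreen
  attr_reader :calls

  def initialize
    @calls = []
  end

  def keyDown(key)
    @calls << [:keyDown, key]
  end

  def keyUp(key)
    @calls << [:keyUp, key]
  end

  def mouseDown(button)
    @calls << [:mouseDown, button]
  end

  def mouseUp(button)
    @calls << [:mouseUp, button]
  end
end

# Hold shift and the mouse button for `seconds`, then release both,
# in the order given in the talk (shift first, then the mouse button).
def attack(screen, seconds)
  screen.keyDown(SHIFT)
  screen.mouseDown(LEFT_BUTTON)
  sleep(seconds)
  screen.keyUp(SHIFT)
  screen.mouseUp(LEFT_BUTTON)
end

screen = RecordingScreen.new
attack(screen, 0) # 0 seconds keeps the demo fast; a real run needs a tuned value
puts screen.calls.length # prints 4
```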
00:25:34.159 after that you collect the gold it's the only thing you need to do and how we can do that
00:25:40.159 i created this simple method that presses the alt key and looks at the screen for the word gold
00:25:46.320 why i'm doing that and not clicking on the gold pieces it's because the gold pieces are really small and they
00:25:51.840 are shiny so even if you try to find a specific color it would be hard to get
00:25:57.120 all of them but if you press alt diablo is gonna show you all the uh all the
00:26:03.279 labels for all the items in the screen so we can basically select click out all the gold words are going to appear and
00:26:10.400 with that you can see oh we have gold on the screen what should i what should we do now we can make this crazy
00:26:17.520 simple thing that if there is any gold you click on the gold word and you get all the gold otherwise you keep doing it
00:26:24.720 no otherwise you restart the run but after you click on the gold you see if there's any gold again and keep doing
00:26:31.120 keep doing keep doing it really simple so uh
00:26:36.480 yeah basically that's it let's see how everything goes together
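
The gold-collection loop described a moment ago can be sketched as: show the labels, look for the word "gold", click it, and repeat until none is left. `GoldScreen` is a stand-in that pretends a fixed number of gold labels are on screen; `ALT`, the image name and the `max_clicks` safety cap are assumptions added for this example and are not from the talk.

```ruby
ALT = :alt                     # stand-in for Sikuli's alt-key constant
GOLD_LABEL = "gold_label.png"  # hypothetical screenshot of the word "gold"

# Stand-in screen that pretends `piles` gold labels are visible.
class GoldScreen
  attr_reader :clicks

  def initialize(piles)
    @piles = piles
    @clicks = 0
  end

  def keyDown(_key); end

  def keyUp(_key); end

  # Truthy while at least one gold label is still "visible".
  def exists(image, _timeout = 0)
    @piles > 0 ? image : nil
  end

  # Clicking a label picks up one pile.
  def click(_image, _modifiers = 0)
    @piles -= 1
    @clicks += 1
  end
end

# Keep clicking gold labels until none is found. Returns the number of clicks.
def collect_gold(screen, max_clicks: 50)
  collected = 0
  while collected < max_clicks
    screen.keyDown(ALT) # alt makes the game show item labels
    screen.keyUp(ALT)
    break unless screen.exists(GOLD_LABEL, 1)
    screen.click(GOLD_LABEL, 0)
    collected += 1
  end
  collected
end

puts collect_gold(GoldScreen.new(3)) # prints 3
```

The cap guards against an endless loop if a label can never be picked up, which is the kind of "something went wrong" case where the talk restarts the run.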
00:26:42.320 it's basically clicks on the point keep walking
00:26:48.159 click on the other point find the seller click on it
00:27:34.720 i told you that i i speak very quickly when i'm nervous so i'm basically done and uh you can make
00:27:42.400 the bot way better if you want but you're going to get banned so don't do it and you can make it click on purple
00:27:48.720 items that they are really fancy items you can just look for any bytes with the
00:27:54.480 specific purple of the title of the items you can do whatever you want uh
00:27:59.919 the script is here i'm gonna tweet the link later so just look for the rubyconf hashtag or follow
00:28:05.760 me and you guys gonna have a bit more time because i'm done thank you
00:28:15.520 and if there is any questions we have plenty of time you
00:28:21.520 sorry the i think the question is how close
00:28:26.720 the images need to match sikuli has a really good deviation algorithm so if
00:28:33.919 for example you get a facebook avatar and you resize it by like two
00:28:39.279 hundred percent it is still gonna find it it's really good but it's it's slow
00:28:47.840 so uh if the image is too different it's gonna take a while for it to process
00:28:53.120 any other questions go for it
00:29:00.640 yeah so the question is about my experience of using it to do real work not getting virtual gold
00:29:08.240 so yeah basically as i i said before uh we had a couple we had some apps that run a
00:29:14.720 different number of devices basically like all ios device and a lot of android devices
00:29:21.360 and uh it can it's hard to test that manually by yourself in the emulators
00:29:26.880 all the time we even tried to use like some services where you can sign up and you can test on virtually
00:29:34.159 any android device for real through a vnc connection and
00:29:39.279 sikuli is nice because all the scripts that i made to test on the emulators worked on this
00:29:46.080 device remote device later it's basically the same concept i
00:29:52.080 get the screenshots of the buttons and the text areas sikuli has a kind of
00:29:58.399 slow but works ocr ability so you can convert something
00:30:03.679 that's on an image on the screen to text and it's very powerful but
00:30:10.720 you you can basically follow what i have done here and create a script for to
00:30:16.080 automate whatever you want it's just slow it's not really fast that's why i made like eight times the
00:30:23.520 speed when getting the gold because it needs to check all the time for the gold text it's not that
00:30:29.760 it's not that quick i hope i answered
00:30:34.960 okay
00:30:40.240 yeah yeah to get a legendary or uh app kaiten what all my tests would have done because i i
00:30:47.039 checked for the the hex of the color so you can get just one single one single pixel image with the
00:30:54.399 color and look for it and legendary is really good because there's almost nothing in the screen with the same
00:30:59.600 orange color anyone else oh god
00:31:18.159 but it's going to take the same time to find all of them what i have done in the past uh i
00:31:23.440 threaded it and put each one in a thread but the problem is like trying to find
00:31:29.279 the pixels on the screen in a game screen it's really hard because everything is changing it keeps like iterating there's no i tried my best to
00:31:37.440 make it better but it's it's not that simple
00:31:48.480 yeah because you don't have everything if like here we are using opengl or directx so it's way more complex to find
00:31:56.000 out but if you have for example the emulator the mobile emulator i use it it was like a really tiny part of the
00:32:01.840 screen it's way faster to find something there than like a huge game window so
00:32:07.200 it's it's a slow but it make the get the job done it's not it's not a real problem
00:32:16.720 exactly exactly like uh i think i show it like you can use all the entire screen and it will be freaking slow or
00:32:23.679 it can define like i just want this small tiny area but like when i made my cat's sofa alarm
00:32:31.039 i had a like a webcam hooked on uh looking at the sofa like i'm gonna kill
00:32:36.640 the cats and uh i made the image really small so
00:32:41.919 it would be easier for you to see like the cat is going there alarm alarm
00:32:47.679 anyone else i'm really bad to see so if there is someone just shout and wave
00:32:55.519 i think we are good thank you guys so much