Powerful (but Easy) Data Visualization with the Graph Gem



Aja Hammerly • September 29, 2011 • New Orleans, Louisiana • Talk

In her talk at RubyConf 2011 titled "Powerful (but Easy) Data Visualization with the Graph Gem", Aja Hammerly emphasizes the importance of effective data visualization in managing large datasets, particularly in the context of adaptive education software developed by DreamBox Learning.

Key Points:

  • Background: Hammerly introduces her role at DreamBox Learning, where they develop software for teaching children math. She explains the challenge of manually specifying dependencies between lessons, which often leads to errors in data.

  • Challenge of Data Representation: She illustrates the complexity of visualizing data through an example of an XML file representing lesson dependencies. The XML format can be difficult to interpret compared to visual graphs.

  • Introduction to Graph Gem: Hammerly presents the Graph Gem, a Ruby library that simplifies the visualization of data using the DOT language from Graphviz, allowing the creation of graphs that effectively illustrate relationships and dependencies.

  • Visualization Techniques:

    • Using shapes and colors to differentiate nodes and edges enhances readability. She discusses the use of Brewer color schemes to convey meaning effectively and ensure designs are visually appealing.
    • Clustering techniques are introduced to further simplify the representation of data and identify areas needing more content.
  • Automated Updates: Hammerly describes the implementation of automated updates for visual representations of data to ensure that stakeholders always have access to the latest information without manual intervention.

  • Real-world Applications: She shares various examples showing how these visualizations are used to improve curriculum design, track student progress, and facilitate discussions among educators and development teams.

  • Engagement: The talk also highlights how visuals created with the Graph Gem can engage educators more deeply, as they provide clear insights into curriculum structure and student pathways.
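The color and clustering points above can be sketched directly in the DOT language that the graph gem generates. This is a minimal sketch using plain Ruby string handling, not the graph gem's own API; the helper names (`dot_quote`, `curriculum_dot`), the cluster labels, and the node names are all made up for illustration. It assumes Graphviz's `set14` scheme name, which is Brewer "Set1" with four colors.

```ruby
# Illustrative only: builds DOT text by hand instead of using the graph gem.

# Wrap a value in double quotes so it is safe as a DOT identifier or label.
def dot_quote(text)
  '"' + text.to_s.gsub('"') { '\"' } + '"'
end

# clusters:    { "Cluster label" => [[from, to], ...] }
# loose_edges: [[from, to], ...] drawn outside any cluster
# colors:      { node_name => 1-based index into the color scheme }
def curriculum_dot(clusters, loose_edges, colors, scheme: "set14")
  lines = ["digraph curriculum {"]
  # Filled nodes pick their fill from the chosen Brewer color scheme.
  lines << "  node [style=filled, colorscheme=#{scheme}];"
  clusters.each_with_index do |(label, edges), i|
    # Graphviz only draws a box around subgraphs whose name starts with "cluster".
    lines << "  subgraph cluster_#{i} {"
    lines << "    label=#{dot_quote(label)};"
    edges.each { |from, to| lines << "    #{dot_quote(from)} -> #{dot_quote(to)};" }
    lines << "  }"
  end
  loose_edges.each { |from, to| lines << "  #{dot_quote(from)} -> #{dot_quote(to)};" }
  colors.each { |node, index| lines << "  #{dot_quote(node)} [fillcolor=#{index}];" }
  lines << "}"
  lines.join("\n")
end

puts curriculum_dot(
  { "Counting" => [["a", "b"]], "Multiplication" => [["c", "d"]] },
  [["b", "c"], ["d", "e"]],
  { "a" => 1, "c" => 2, "e" => 3 }
)
```

Saving the printed text as a `.dot` file and rendering it with Graphviz should give two boxed clusters, one node outside them, and three colored nodes.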

Conclusion:

In conclusion, Hammerly's presentation demonstrates how the Graph Gem allows for effective and efficient data visualization, aiding in the identification of relationships and dependencies in large datasets. By utilizing tools that automate and enhance visual data representation, teams can save time, reduce errors, and improve educational outcomes.

The session invites further exploration and encourages attendees to adopt these tools in their workflows for better decision-making and productivity in data handling.


Date: September 29, 2011
Published: December 12, 2011

Many projects involve large datasets. While humans are master pattern matchers, trying to find patterns and issues by looking at reams of text is hard. Sometimes you just need the 30,000 foot view but creating diagrams by hand is time consuming and prone to error. The graph gem makes it quick and easy to create visualizations of your data. In this talk, we'll learn how to use nokogiri and graph to get data out of an XML file and into a format where you can see relationships and patterns. We'll also see how color coding and clustering can emphasize the patterns in your data. Many of the examples will come from my work with adaptive education apps.
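The XML-to-graph pipeline described in this abstract can be sketched roughly as follows. This is an assumption-heavy illustration, not the speaker's code: it uses Ruby's bundled REXML in place of nokogiri, emits DOT text by hand rather than calling the graph gem, and the element and attribute names (`lesson`, `sequence`, `lesson_id`, `prereq`) are a hypothetical layout modeled on the talk's description.

```ruby
require "rexml/document"

# Hypothetical sample data: lessons with an id and a name, plus sequence
# elements that pair a lesson with the prerequisite that must come first.
SAMPLE_XML = <<~XML
  <curriculum>
    <lesson id="1" name="Counting to 10"/>
    <lesson id="2" name="Counting to 20"/>
    <lesson id="3" name="Addition"/>
    <lesson id="4" name="Multiplication"/>
    <sequence lesson_id="2" prereq="1"/>
    <sequence lesson_id="3" prereq="2"/>
    <sequence lesson_id="4" prereq="3"/>
  </curriculum>
XML

# Turn the XML into DOT text: one labeled node per lesson,
# one edge from each prerequisite to the lesson that depends on it.
def lessons_to_dot(xml)
  doc = REXML::Document.new(xml)
  lessons = REXML::XPath.match(doc, "//lesson")
  sequences = REXML::XPath.match(doc, "//sequence")

  lines = ["digraph lessons {"]
  lessons.each do |lesson|
    id = lesson.attributes["id"]
    name = lesson.attributes["name"].gsub('"') { '\"' }
    lines << %(  "#{id}" [label="#{name}"];)
  end
  sequences.each do |seq|
    from = seq.attributes["prereq"]
    to = seq.attributes["lesson_id"]
    lines << %(  "#{from}" -> "#{to}";)
  end
  lines << "}"
  lines.join("\n")
end

puts lessons_to_dot(SAMPLE_XML)
```

Short ids keep the edge list compact while the human-readable lesson names appear as labels in the rendered graph.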

RubyConf 2011

00:00:17.199 this is easy data visualization with
00:00:19.039 graph um I'm Aja Hammerly and I'm gonna
00:00:22.199 give a little background about myself
00:00:23.359 first so a little bit about me I'm from
00:00:25.279 Seattle and part of Seattle.rb I work
00:00:27.119 for a company called DreamBox Learning
00:00:28.840 and we build adaptive education software
00:00:30.880 to teach children
00:00:32.360 math we have about 500 lessons in our
00:00:35.480 curriculum right now and we're always
00:00:36.960 adding more and part of the way our
00:00:39.640 development cycle works is we have
00:00:41.520 actual classroom teachers who used to teach
00:00:44.000 in elementary schools kindergarten first
00:00:45.520 grade second grade and they build the
00:00:47.480 lessons and then they manually specify
00:00:49.199 the dependencies between the lessons and
00:00:51.280 so an example of a dependency would be
00:00:53.399 you have to learn to count before we can
00:00:54.960 teach you
00:00:56.440 multiplication and since we're manually
00:00:58.680 spec manually specifying those
00:01:00.519 dependencies there's a lot of bugs in
00:01:02.760 that data and we need to find those bugs
00:01:04.839 and find patterns in that data so that
00:01:06.320 we can optimize the recommendation
00:01:07.960 engine that the engineers build on top
00:01:09.960 of the lessons and the lesson dependency
00:01:13.360 data and so to show you how hard this
00:01:15.720 problem has been for us I want to give
00:01:17.320 an example so which of these is easier
00:01:18.840 to
00:01:19.920 comprehend this is an XML file it's part
00:01:23.360 of the XML file that we use to specify
00:01:25.280 the dependencies the real thing is about
00:01:26.799 2800 lines long we use XML because
00:01:29.960 it's nicely cross-platform and
00:01:32.040 it was the easiest thing to build when
00:01:33.560 we started building
00:01:35.119 it or this this is another example and
00:01:38.680 this is that same data it's actually
00:01:40.040 more data than I showed on that XML
00:01:41.640 slide because the font would have been
00:01:43.920 too tiny if I included all of it and
00:01:46.079 this is the exact same thing represented
00:01:47.840 as a visualization as a graph and so
00:01:50.240 looking at this you can already see some
00:01:51.600 patterns on the right hand side there's
00:01:53.520 a really long sequence of things that
00:01:55.040 build one on top of another on the left
00:01:57.159 hand side there's some stuff that's
00:01:58.320 happening in parallel I've colored the
00:02:00.640 nodes that are leaves of the graph they
00:02:02.360 don't have any outgoing edges blue and
00:02:04.320 you can see that we've got three
00:02:06.159 different four different Leaf nodes and
00:02:07.799 maybe that's a bug maybe that's not but
00:02:10.679 really quickly you can get a general
00:02:12.160 idea of what's going on that just wasn't
00:02:14.519 obvious at all when you looked at the
00:02:15.760 XML
00:02:17.239 file and we figured this out this was
00:02:19.640 really obvious the academic staff
00:02:21.239 actually came up with this and they were
00:02:22.440 drawing graphs in Visio and using uh
00:02:24.920 Post-it notes and markers on chart paper
00:02:27.000 because they're teachers to do this
00:02:28.920 stuff very early on and making pictures
00:02:31.080 by hand is really easy anyone can do it
00:02:32.800 you can draw something on the back of a
00:02:34.040 napkin you can draw something on the
00:02:35.360 wall it's easy problem is is it doesn't
00:02:38.920 scale and here's why it doesn't scale
00:02:41.360 it's time consuming making graphs and
00:02:43.159 making them usable readable making these
00:02:45.080 visualizations by hand is
00:02:46.800 hard it takes a lot of
00:02:49.319 time another problem we have is that our
00:02:51.400 underlying data is changing frequently
00:02:53.319 there's been a lot of talks this
00:02:54.400 conference about releasing often
00:02:56.480 updating your site often constantly
00:02:58.239 deploying and so if you're changing
00:02:59.800 your underlying data frequently
00:03:01.680 you have to change your graphs
00:03:02.800 frequently and then you come back to
00:03:04.040 time consuming you don't have the time
00:03:05.599 to do that and the other problem we run
00:03:07.879 into is that different people want
00:03:09.239 different views of the data we've got a
00:03:11.120 customer service and a sales team who
00:03:12.680 want kind of a 300 foot level they
00:03:15.120 want to see it in a little bit of detail
00:03:17.040 but not a lot they want to be able to
00:03:18.319 answer general questions those of us
00:03:20.480 working on the recommendation engine
00:03:22.000 need to be able to validate the
00:03:23.440 recommendations are correct so we need
00:03:25.239 to see every single node we need to see
00:03:27.080 every single edge we can't zoom out that
00:03:29.840 L and so we're having to create multiple
00:03:31.519 views of the data and that adds even
00:03:33.319 more
00:03:34.200 time so I brought this problem to the
00:03:36.360 Seattle.rb meeting and someone says well
00:03:37.920 why don't you use Graphviz I'm like what
00:03:39.959 is Graphviz so Graphviz is the dot
00:03:43.400 language which is part of Graphviz is a
00:03:45.080 simple language to describe graphs if
00:03:47.159 you haven't figured it out already I'm
00:03:48.480 talking about graphs as in graph Theory
00:03:50.640 not graphs as in pie charts and bar
00:03:52.319 graphs that you drew in elementary
00:03:53.519 school graphs are a collection of nodes
00:03:55.720 with edges and the nice thing about dot
00:03:58.400 is that you can edit various attributes of
00:04:00.159 your graph so that you can emphasize
00:04:01.560 some data or you can add layers of
00:04:03.079 meaning on top of just the nodes and
00:04:05.560 edges so here's a simple example uh
00:04:08.480 first line says digraph example digraph
00:04:10.640 means directed graph that means your
00:04:12.200 edges have arrows on them name is
00:04:14.519 example and there's an edge between A
00:04:16.239 and B there's an edge between B and C A
00:04:19.199 is going to be a box and B is red and
00:04:21.600 that's exactly what you get out picture
00:04:23.320 on the right shows exactly what renders
00:04:25.440 when you run that through the dot
00:04:26.759 rendering
00:04:28.000 engine so by default dot uses
00:04:30.880 .dot files and I know of two
00:04:33.960 good tools to view those Graphviz is the
00:04:36.120 easiest it is available for Windows if
00:04:38.320 there's anyone in the audience using
00:04:39.880 Windows uh it's quick download works
00:04:42.199 great very very easy to understand and
00:04:44.479 use if you have really big graphs and I
00:04:47.000 learned about this last year at RubyConf
00:04:49.160 hundreds thousands of nodes Tulip is a
00:04:51.919 better tool it tends to choke less on
00:04:54.479 big files it also has a lot of different
00:04:56.720 rendering engines to emphasize different
00:04:59.080 aspects of your graph so it can pull out
00:05:01.199 stuff that has no incoming edges a lot
00:05:03.960 easier than Graphviz
00:05:06.000 does but this is RubyConf so we're going to
00:05:08.520 use
00:05:09.400 Ruby sudo gem install graph there's a
00:05:12.120 graph gem for Ruby that makes generating
00:05:13.919 dot files very easy let's look at what
00:05:16.680 it does so a simple graph simplest graph
00:05:19.080 I could come up with has a single node
00:05:21.479 digraph do that's exactly like that
00:05:23.160 digraph example line that we had in the
00:05:25.240 dot language node B draws a single node
00:05:29.240 with a label of B couldn't be easier than
00:05:32.039 that a nice thing about dot that and
00:05:35.240 I've really enjoyed using this is that
00:05:37.080 nodes have both an ID or name and a
00:05:40.120 label and those are different so you can
00:05:41.880 use a short simple identifier in your
00:05:43.800 Ruby code and have something with more
00:05:45.720 meaning something longer show up on the
00:05:48.280 output so the people can understand so
00:05:51.160 if you want to add a label you have our
00:05:52.639 same node B label hello call the label
00:05:55.720 method give it a label of hello you have
00:05:57.880 a node with hello really really
00:06:00.680 simple a graph with just one node isn't
00:06:03.360 especially interesting so let's put in
00:06:04.800 an edge so we have an edge between a and
00:06:08.000 b and the order is important if it was b
00:06:10.400 and a the arrow would point the other
00:06:12.440 direction and that's the graph we get out
00:06:14.960 so one really important thing to note we
00:06:17.080 didn't declare any nodes here the nodes
00:06:19.039 are implied because of the fact that
00:06:20.560 there's an edge between them so you
00:06:23.160 don't even have to know what nodes you
00:06:24.560 have in your graph if you know what the
00:06:25.639 relationships are you can just build the
00:06:27.039 whole thing with
00:06:28.000 edges so this is all
00:06:30.080 great but I'm not showing you how to
00:06:32.240 save these or export them so let's
00:06:33.639 actually cover that quickly saving call
00:06:36.240 the save method give it the file name
00:06:37.919 this saves a .dot file it is that
00:06:41.400 simple dot files aren't especially
00:06:43.639 useful unless all of your co-workers
00:06:45.120 also have Graphviz or Tulip installed
00:06:46.880 so you can export it um save you pass it
00:06:50.400 a second parameter of the format you
00:06:51.800 want it to save in and it saves in that
00:06:53.759 format I have PNG and jpeg here um every
00:06:57.520 format you could possibly ever want is
00:06:59.759 probably available already if you go to
00:07:01.879 the Graphviz website you can look there
00:07:03.599 are pro I know there are more than 20
00:07:05.560 there's probably close to 50 different
00:07:07.240 formats PDF Tiff several that I hadn't
00:07:10.720 heard of they're all there you can save
00:07:12.520 as them it works great so with just
00:07:15.960 these four slides five slides worth of
00:07:17.919 material you can build things like this
00:07:20.879 and this is the first half of our
00:07:22.879 academic curriculum drawn out you can
00:07:25.000 see that there's parallelism going on or
00:07:28.199 you can build things like this and this
00:07:29.919 is a deep dive into one of the ovals
00:07:32.000 from the previous slide you can do all
00:07:33.960 of that with edges and nodes and saving
00:07:36.360 it's really not
00:07:38.000 hard but that's boring and it doesn't
00:07:40.919 have a ton of meaning you can see some
00:07:42.199 of the relationships but let's see if we
00:07:43.840 can make it more interesting and pull
00:07:45.240 more meaning out of it so the first
00:07:47.479 thing is
00:07:49.080 shapes let's say you decide you want all
00:07:51.159 of your nodes to be
00:07:53.039 triangles use add triangle to the node
00:07:57.159 attributes uh keyword and you get
00:08:01.560 triangles there's also ovals and boxes
00:08:04.520 and circles with two lines in them and
00:08:06.560 all sorts of other great shapes all the
00:08:08.479 shapes that are built in dot come for
00:08:11.039 free a lot of people like to use boxes I
00:08:13.800 use them because I think they take up
00:08:15.120 less space vertically because you don't
00:08:16.960 have the curve of the oval above and
00:08:18.840 below a name especially when you have
00:08:20.599 long names and boxes are special with
00:08:23.360 the graph gem you just type the word
00:08:25.440 boxes automatically you get boxes it's
00:08:28.319 awesome
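For reference, the "every node gets one shape" idea maps onto a single default node attribute in DOT. A minimal sketch, assuming simple alphanumeric node names and building the DOT text by hand rather than through the graph gem; the helper name `shaped_dot` is hypothetical:

```ruby
# Illustrative only: emit a digraph whose nodes all share one default shape.
# Node names are assumed to be plain identifiers that need no quoting.
def shaped_dot(edges, shape: "box")
  lines = ["digraph shapes {", "  node [shape=#{shape}];"]
  edges.each { |from, to| lines << "  #{from} -> #{to};" }
  lines << "}"
  lines.join("\n")
end

# All triangles, as in the example just described.
puts shaped_dot([%w[a b], %w[b c]], shape: "triangle")
```

Calling it without `shape:` falls back to boxes, which take less vertical space than ovals for long labels.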
00:08:31.280 say you want to do some old school
00:08:32.680 flowchart I don't know how many people
00:08:34.120 have you had to draw flowcharts at some
00:08:35.719 point in your Computing education you
00:08:38.479 can do it you write have an edge here
00:08:41.440 and actually this is a great example
00:08:42.680 because you show that you can declare
00:08:43.919 multiple edges on a single line I have
00:08:46.120 Edge a b and c draws an edge from A to B
00:08:48.839 and an edge from B to C and then I paint
00:08:52.000 node A with triangle paint node B with
00:08:54.320 circle and I paint node C with diamond I
00:08:57.000 get those shapes out so maybe you want
00:08:59.519 all of your Leaf nodes to be painted
00:09:01.079 with stop signs they actually have an
00:09:03.040 octagon you could do that works
00:09:06.120 great and shapes are great I don't
00:09:08.920 actually use them that much because I
00:09:10.399 prefer to use color I think that human
00:09:12.519 eye is better track is better at
00:09:13.959 tracking color and seeing color quickly
00:09:15.800 than seeing different shapes especially
00:09:17.640 when the graphs I'm doing which you
00:09:19.600 can't even print on the standard printer
00:09:21.160 at Kinko's because they end up being so
00:09:23.200 big so first thing you can do is we'll
00:09:26.120 make all of our nodes Red by adding the
00:09:29.760 red the red symbol to the node
00:09:32.200 attributes array all the edges
00:09:34.560 blue and then this is what you get a
00:09:37.480 graph with red nodes and blue
00:09:39.600 edges that's not especially interesting
00:09:42.000 I like to be a little more colorful so
00:09:43.800 let's make a rainbow so we're going to
00:09:45.920 say that our nodes have the filled
00:09:48.040 attribute and then we're going to draw
00:09:49.760 an edge a bunch of edges and the nodes
00:09:51.800 that go with them and then we're going
00:09:52.720 to paint node G with green paint node o
00:09:55.000 with orange paint node r with red and
00:09:57.560 paint node p with purple that's what you
00:09:59.880 get out it really is that
00:10:02.480 easy so how can you use this so this is
00:10:05.480 that same graph I showed you just a
00:10:06.880 couple slides back only now I've painted
00:10:09.760 all the leaf nodes with blue and I've
00:10:11.920 painted all the orphan nodes that have
00:10:13.399 no incoming or outgoing edges with
00:10:15.760 red and it's a lot easier to see where
00:10:18.040 those things are
00:10:19.760 now so I don't know about you I'm one of
00:10:22.519 the design impaired people I am lucky
00:10:24.360 that I work at a company where we have
00:10:25.839 designers who can take my very ugly
00:10:27.800 rails views and say no no do this make
00:10:30.600 the HTML look like this and here's some
00:10:32.360 CSS poof pretty it's great um but one of
00:10:36.399 the awesome things that I really like
00:10:37.760 about Graphviz and the graph gem
00:10:39.800 is that you can use color
00:10:41.720 schemes and to when I brought my first
00:10:44.560 examples of this into into work I was
00:10:46.240 working on them as a side project
00:10:47.560 they're like how' you pick those colors
00:10:48.920 and I'm like I didn't they came from one
00:10:51.320 of the color schemes it's great makes
00:10:53.040 you look really professional and you
00:10:54.639 don't have to know about pretty so
00:10:57.160 Graphviz uses the Brewer color schemes
00:10:59.320 if you've never heard of the Brewer color
00:11:00.760 schemes somebody made a bunch of color
00:11:02.959 schemes that look good together and
00:11:04.440 convey meaning you don't have to worry
00:11:06.240 about that you just have to pick one of
00:11:07.680 them um you can preview them on the
00:11:10.160 Graphviz website they're slightly
00:11:11.680 different than the official Brewer color
00:11:13.040 schemes they've added extra colors in
00:11:14.519 some cases and you can also preview
00:11:17.760 color schemes at colorbrewer2.com
00:11:19.440 and I love this site so I'm going
00:11:21.800 to show a picture of it it gives you
00:11:23.600 this example map and then it lets you
00:11:25.760 pick colors in the top left and those
00:11:28.240 are some of the different colors you
00:11:29.639 click one it changes changes the map so
00:11:31.839 you can see how it's going to look but
00:11:33.519 the really powerful thing is that
00:11:34.959 they've got check boxes for color blind
00:11:37.120 safe printer friendly um photocopiable
00:11:41.160 so if you know that you've got some
00:11:42.279 color blind people you work with you can
00:11:43.760 make sure that you pick a scheme that's
00:11:45.440 color blind safe they also talk about
00:11:48.040 things like categorical data cats dogs
00:11:51.040 ferrets versus sequential or diverging
00:11:54.160 data small medium large some of those
00:11:56.880 are related and you can order them
00:11:58.600 others you can't can't really order cats
00:12:00.200 dogs and ferrets maybe you put them in
00:12:01.399 alphabetical order but the order itself
00:12:03.120 does not matter and they have different
00:12:05.160 color schemes for different things so
00:12:06.839 that you can use a color scheme that
00:12:08.160 conveys what you want it to
00:12:10.120 convey so here's an example using the
00:12:12.760 graph
00:12:13.839 gem again we're going to use filled
00:12:15.800 nodes and you have a color scheme and
00:12:17.920 the first thing is the first symbol as
00:12:19.720 an argument is the name of the color
00:12:21.160 scheme and the second argument is the
00:12:23.000 number of colors if you look at the
00:12:24.760 Graphviz website this is called set one
00:12:27.639 this is called set14 it's actually set
00:12:30.279 one, four colors so if you're looking at
00:12:32.680 the Graphviz website and you're trying
00:12:33.800 to figure out how to translate them to
00:12:35.199 using the graph gem just remember that
00:12:37.040 it's set one, four colors not set
00:12:39.360 fourteen so you set the color scheme and then
00:12:41.920 you paint node a with C1 which stands
00:12:44.040 for color one paint node B with C2 which
00:12:46.120 stands for color two and so on and
00:12:49.399 what you get out maybe looks like this
00:12:52.079 uh the first one is set one and the
00:12:53.920 second one is one of the pastel sets or
00:12:56.880 if you want to use a diverging or a
00:12:58.399 sequential scheme you get something that
00:13:00.199 looks like this and you can see how it
00:13:01.560 goes from Orange to purple as we move
00:13:03.800 through the color range another good
00:13:06.120 example of a sequential color is election
00:13:08.040 night return Maps there's red counties
00:13:10.199 and blue counties and purple counties
00:13:12.040 and it's showing a movement from one to
00:13:14.440 the other that's just good example of
00:13:17.040 diverging data so color schemes are
00:13:19.519 great what does it look like when we use
00:13:20.959 them again a graph I showed you earlier
00:13:23.279 only this time I've added color the
00:13:25.320 green nodes are kindergarten material
00:13:27.680 the orange nodes are our first grade
00:13:29.959 material and that kind of periwinkle
00:13:31.639 node is second grade material and so
00:13:33.959 what becomes immediately obvious when
00:13:35.320 you add the colors is that we've got
00:13:36.680 this kindergarten node down at the
00:13:38.199 bottom it's completely surrounded by
00:13:39.959 first grade material it's probably a bug
00:13:42.839 but it wasn't obvious without the colors
00:13:44.600 there so that you could see that and now
00:13:46.639 it's really obvious and we can go fix
00:13:48.320 that that's the real power of using
00:13:50.199 these tools and you've seen how easy the
00:13:52.120 code is to
00:13:53.519 write um another tool we use a lot is
00:13:57.279 clustering and clustering allows you to
00:14:00.320 take graphs that look like
00:14:01.959 this and break them into something that
00:14:04.320 looks like this and it's still kind of a
00:14:06.680 mess but you can see that we've divided
00:14:08.199 the area into the set of nodes into two
00:14:10.680 separate areas and maybe we're deciding
00:14:13.199 okay we need to add more content which
00:14:14.839 areas are a little light okay well it's
00:14:16.759 really obvious that whatever's on the
00:14:18.120 right doesn't have nearly as much nearly
00:14:20.360 as many activities as whatever's on the
00:14:21.920 left so using clustering you can see
00:14:24.600 patterns like that that aren't obvious
00:14:26.759 clustering also can point out bugs in
00:14:30.199 maybe you've set up your relations wrong
00:14:31.680 and some stuff that is marked as related
00:14:33.519 actually isn't clustering is also really
00:14:36.000 easy to do um you just we just add
00:14:38.600 cluster blocks so we've got our normal
00:14:40.480 digraph start and then we've got
00:14:41.839 cluster with the cluster's name and then
00:14:44.959 we have a do and then we give it a label
00:14:46.759 and we put some edges in it and again
00:14:48.639 because we're declaring those edges the
00:14:50.079 nodes are inferred and they're inferred
00:14:51.920 to be in that cluster so this example
00:14:54.360 gives you this graph one cluster with
00:14:56.360 nodes A and B another cluster with nodes
00:14:58.279 C and D and a third Edge a third node
00:15:01.160 and two edges that are outside of the
00:15:02.600 Clusters really easy to break stuff up
00:15:04.880 you could also do this with
00:15:06.600 responsibilities for departments or
00:15:09.320 things apps that are on different
00:15:10.839 servers it's really easy to see and you
00:15:12.440 can cluster them based on what you know
00:15:14.320 about the
00:15:15.240 data so I've showed a bunch of cool
00:15:17.320 stuff but I haven't showed exactly how
00:15:18.959 we take that giant scary XML file and
00:15:21.480 turn it into pretty pictures so let's
00:15:23.519 actually cover that so I chose a smaller
00:15:27.320 XML file for my example um slightly
00:15:29.959 different format we've got four lessons
00:15:32.720 and lessons have an ID and a name and we
00:15:35.519 also have sequence XML elements and
00:15:38.079 those too have a lesson ID and a prereq
00:15:40.480 for that lesson ID sure you guys are all
00:15:42.759 familiar with prereqs from your time in
00:15:44.279 high school and college you have to take
00:15:46.759 French one before you can take French
00:15:48.360 two you have to pass algebra before you
00:15:50.199 can take calculus those are the kind of
00:15:52.040 prereqs I'm thinking of so lesson ID 2 has
00:15:54.839 a prereq of lesson one so first
00:15:59.279 up extract the data I'm going to use
00:16:00.920 Nokogiri because it takes all the heavy
00:16:02.880 lifting of doing the XML out first time
00:16:04.959 I tried to do it by hand I don't
00:16:06.519 recommend that let someone else solve
00:16:08.360 your problems so we're going to open a
00:16:10.759 Nokogiri document from our file I'm going to
00:16:13.480 use XPath to create a collection of all
00:16:16.240 the elements that have lesson as their
00:16:18.440 name and a second collection of all the
00:16:20.480 elements that have sequence as their
00:16:21.800 name I'm going to pass the lessons and
00:16:23.720 the sequences the sequences of the
00:16:25.399 prereqs to a function called draw graph
00:16:29.639 here's the code for draw graph digraph
00:16:32.600 do iterate through all the lessons put
00:16:35.199 in a node with the ID of the lesson and
00:16:38.759 the name and add a label with the
00:16:40.279 lesson's name same for the sequences
00:16:42.560 draw an edge from the prereq to the
00:16:44.880 lesson ID it's like nine lines of code
00:16:47.759 and that's all you need for the drawing
00:16:49.759 the pretty pictures part and what you
00:16:52.040 get out at the end of this is this graph
00:16:55.279 and you probably could have figured this
00:16:56.360 out from the XML if you had spent a lot
00:16:57.920 of time looking on it but in the time
00:16:59.639 that it took you to figure it out we
00:17:01.480 could have written the Ruby it really is
00:17:03.160 about 10 lines of Ruby it's not hard and
00:17:06.079 the best part is this works no matter
00:17:07.720 what size that file is the file I showed
00:17:09.439 was really small it's exact same code
00:17:10.959 will work on my 2800 line file as
00:17:14.360 well and here's an example of what that
00:17:16.559 looks
00:17:17.520 like this is our file I've added colors
00:17:20.160 here um it's actually not the whole
00:17:21.640 thing the whole thing doesn't fit on a
00:17:22.880 slide it also doesn't as I said fit
00:17:24.439 through the plotter at Kinko's their
00:17:26.120 plotter is not wide enough uh we still
00:17:28.360 printed it anyway it's really small but
00:17:30.360 we use it it's posted on the wall at
00:17:32.400 work and the nice thing is is that we
00:17:33.840 can gather around it and talk about it
00:17:35.600 I've seen people run out of meetings use
00:17:37.520 their finger to trace a path through a
00:17:39.720 specific section of the curriculum and
00:17:41.480 then run back into the meeting because
00:17:42.880 they can actually follow what's going on
00:17:45.240 and we don't have another tool that does
00:17:46.919 that so this has been really powerful
00:17:49.320 and we've started into a really frequent
00:17:52.039 update cycle again recently so I've made
00:17:53.760 some changes to how we use it first
00:17:56.000 thing I added was automated updates I
00:17:57.919 was the master of the graphs for a while
00:17:59.919 and I had to run the commands on my
00:18:01.159 machine and I'm like I don't actually
00:18:02.640 want to be involved in this I want to be
00:18:04.720 able to go on vacation and have this all
00:18:06.280 just work so we have a Hudson and a
00:18:08.440 Jenkins setup at work and so I'm just if
00:18:11.320 you're familiar with Hudson I'm just
00:18:12.600 watching the directory where these where
00:18:14.280 that XML file lives it gets updated the
00:18:17.240 graphs are automatically created Hudson
00:18:19.120 pushes them onto a network share and
00:18:21.080 then it sends mail to anyone who's
00:18:22.520 interested in seeing the updates and
00:18:24.600 we've got a bunch of artists who work
00:18:26.200 with us and they can check that and see
00:18:27.640 how things are looking
00:18:29.280 the academic folks can look at it and
00:18:30.679 like yes that's what I intended and the
00:18:32.600 sales team can know what's coming at any
00:18:34.200 given time we point them at the share
00:18:35.640 it's always up to date it's always
00:18:37.039 correct it's also really good is we get
00:18:39.120 this feedback if that does not look
00:18:40.679 right almost instantaneously the whole
00:18:42.520 thing takes about 45 seconds to run and
00:18:44.960 push on our build
00:18:46.520 server um I've also been having a lot of
00:18:48.880 fun with graph just in general so one of
00:18:51.559 the awesome things that comes built into
00:18:53.039 graph is a way to visualize your
00:18:54.960 dependencies and so I'm going to show an
00:18:57.000 example using Homebrew go to command
00:18:58.919 prompt after you've installed the graph
00:19:00.320 gem and type graph homebrew if you have
00:19:02.520 Homebrew what you'll get out will look
00:19:04.840 something like this yours will probably
00:19:06.360 look more complicated my Homebrew
00:19:08.480 setup's pretty small and this shows
00:19:11.200 the packages that have been installed
00:19:12.559 via Homebrew and their dependencies and
00:19:14.919 if there are red nodes I know that that
00:19:16.440 stuff's out of date and I need to update
00:19:18.039 it comes for free and it comes with a
00:19:20.880 RubyGems analyzer a Homebrew analyzer
00:19:23.400 a FreeBSD ports analyzer and a MacPorts
00:19:26.280 analyzer so there's probably something
00:19:28.240 you can benefit from right here
00:19:30.200 it's really great it's really handy and
00:19:31.799 you don't have to do anything you
00:19:33.039 installed the gem and it's there um
00:19:35.760 another thing you can use graph for I had
00:19:37.200 a really good time one night at Seattle.rb
00:19:38.720 I'm like I want to see what our
00:19:40.039 schema looks like and I'm not sharing
00:19:42.440 our schema I'm sharing a schema of one
00:19:44.120 of my rails projects because it actually
00:19:46.080 fits on the slide but this took about 45
00:19:48.919 minutes to uh set up it's not perfect
00:19:50.760 UML because I personally find the UML
00:19:52.960 database schema stuff kind of
00:19:54.760 confusing but you could make it perfect
00:19:56.679 UML pretty easily just by getting the
00:19:59.120 arrows correct and most of that 45
00:20:01.320 minutes by the way we spent trying to
00:20:02.720 get those little diamond arrows in there
00:20:05.120 as opposed to just standard arrows so
00:20:07.240 you can see the models and how they
00:20:08.480 interrelate it really wasn't that hard
00:20:11.480 if anyone wants to see
00:20:12.919 the code I can show you it's only a
00:20:14.280 couple lines of
00:20:15.679 code um another really awesome thing
00:20:18.679 that I've chosen to do with graph is to
00:20:20.559 show history we've got a bunch of users
00:20:22.679 and they progress through that big graph
00:20:24.480 I just showed at varying rates of speed
00:20:26.880 in varying orders and one of the things
00:20:28.400 that we like to push as a company is
00:20:30.640 that our product is individualized and
00:20:32.520 no one has the same experience but it's
00:20:34.960 really hard to tell people that and have
00:20:36.480 them take your word for it so how do we show a
00:20:39.080 student's history um and this is a
00:20:42.120 static image showing a student's history
00:20:44.640 the blue stuff is stuff they've done
00:20:45.960 least recently the red stuff is stuff
00:20:47.840 they've done most recently and if you've
00:20:51.000 ever used the view logs in Emacs the
00:20:53.760 colors are probably familiar to you it
00:20:55.600 shows the least recently changed lines
00:20:57.760 in blue and the most recently changed
00:20:59.240 lines in red and I showed this and
00:21:01.840 everyone was really excited because you
00:21:03.440 can kind of see the history but I
00:21:06.919 thought I could go a little bit further
00:21:08.559 and let's see if we could do
00:21:10.000 animation so this is an animation is it
00:21:14.320 running okay good it's running so this
00:21:16.600 is an animation of a single student
00:21:18.000 moving through the curriculum and as
00:21:20.559 they start as they do an activity
00:21:22.600 the node turns bright red and then it
00:21:24.640 fades out over time as it goes into the
00:21:26.440 past and so you can see progress
00:21:28.600 moving and it's really motivating for
00:21:31.200 people because it actually shows what's
00:21:33.200 going
00:21:39.600 on and now it's finished and
00:21:41.760 everything's going to gradually fade out
00:21:43.799 and again it didn't take that long to
00:21:46.080 write this most of the time it took to
00:21:47.679 write this was figuring out how to
00:21:49.679 stitch a bunch of PNGs together with
00:21:51.919 ffmpeg and getting my PNGs small enough that
00:21:54.159 they rendered fast so putting the pieces
00:21:57.159 together once you had the base graph which
00:21:58.720 I already had working putting the
00:22:00.440 history on top of it wasn't that hard
00:22:02.840 and that's really the power of this
00:22:04.039 tool I had this idea and I'm like okay
00:22:06.159 let's see how hard it is to write and
00:22:07.559 it's really
00:22:08.640 not so I've left a lot of time for
00:22:11.799 questions and also so people can get out
00:22:14.000 of here and get to the party sooner if
00:22:15.679 they want but first some thank yous I
00:22:18.080 want to thank Ryan Davis for writing
00:22:19.400 graph and then rewriting it every time
00:22:21.240 I've shown him something or done a talk
00:22:22.760 on it to include some of the cool stuff
00:22:24.600 I've found and I want to thank Aaron
00:22:26.559 Patterson for Nokogiri because the first
00:22:28.559 time I implemented parsing the
00:22:30.559 XML by hand it was very very very
00:22:33.360 painful and I'm never ever doing that
00:22:35.159 again Nokogiri works great and it does it
00:22:37.559 fast and I can make the hard part
00:22:39.480 someone else's problem so uh thank you
00:22:43.679 and actually while I'm taking questions
00:22:44.960 if there are any I'm going to run a
00:22:46.039 longer animation of another student who
00:22:48.720 spent more time in the
00:22:50.039 curriculum see if I can get this one
00:22:51.840 started there we go so any questions
00:22:56.720 comments yeah do these animations wind up
00:22:59.720 in the teachers' hands at any point yes
00:23:01.760 uh the question was do the animations
00:23:02.960 wind up in the teachers' hands I showed
00:23:05.039 this at a meeting of just the team that
00:23:06.720 works on the recommendation engine and
00:23:08.720 they're like you need to send this to
00:23:09.799 everyone so it's ended up both in the
00:23:11.240 teachers' hands it's also ended up in the
00:23:13.279 sales team's hands and I think the sales
00:23:14.720 team is more excited about it than
00:23:16.039 anyone else is at this
00:23:17.640 point
00:23:19.200 okay
00:23:21.279 slides um I can make the slides and
00:23:23.200 code posted I can make the slides uh and
00:23:25.240 the code that's in the slides posted
00:23:26.640 somewhere um okay yeah I'll put them up
00:23:29.520 on my GitHub account um I'm kusali KI on
00:23:33.640 GitHub and I'll put them up this
00:23:36.279 evening
00:23:42.880 yes uh no I haven't heard of any other
00:23:45.559 tools if you have a suggestion
00:24:00.480 okay yeah come up and talk to me and
00:24:02.360 I'll see what we can
00:24:04.200 do yes
00:24:10.200 you it's funny you mention that um if
00:24:12.720 you've used Google perftools they use
00:24:14.640 dot to
00:24:16.679 visualize which functions are run most
00:24:19.200 often and what your call stack
00:24:20.799 is and what the paths are through the
00:24:21.880 code um their output format one of the
00:24:24.200 picture output formats is dot and they
00:24:26.440 don't use graph but they use
00:24:28.120 the same idea and so it
00:24:30.000 wouldn't be hard at all to do that like
00:24:32.799 yeah maybe an hour or two once you knew
00:24:34.440 the graph
00:24:36.799 API okay well thank you very much