Sometimes peer pressure is a good thing. Today I got an unexpected tweet from Daniel Woodward, a.k.a. @woodybrood, asking about a FedEx project where I visualize JIRA issues. I’ve now done two FedExes. In my first one, I collaborated with Anna Dominguez to create a network graph based on comments in JIRA and Confluence. It was a map of who was commenting on whose issues and pages. I wrote a blog post about that one here.
My second FedEx was an attempt at visualizing code churn as a horizon graph using data from Atlassian’s source code analysis tool, Fisheye. It turned out to be one of my more unsuccessful attempts at visualizing. I couldn’t get code churn data that really meant anything and, once I started putting the graphic together, I realized that code churn should not be visualized as a horizon graph. (That was painful.)
So, Daniel, the short answer to your question of whether I have a great way to visualize defects in JIRA is: not really. I have put JIRA issues into a treemap before, but I wouldn’t recommend that either.
Where does that leave things?
It leaves me feeling a little frustrated but still curious. I refuse to believe that the data from bugs is not to be visualized. My gut tells me that I just haven’t found the right questions to ask or the right style to use for visualization. I do have one more trick up my sleeve before I’m completely out of ideas. Here are the pieces:
I have, in the past, visualized counts of fake defects with the parallel coordinate plot software, ParallelSets. I already hear screams about visualizing bug counts, so at least let me explain before you flame me. When I did this, I was very happy with the results. However, the version of the software I was using was not robust enough for me to use it on a regular basis. It’s been updated to be more robust with data, but the newer version has the side effect of not showing the data points individually, plus it only works on Windows and I’m more into Mac. When it was working for me, I loved it. They nailed the interaction between the user and the data. It pains me to say it, but I’ve moved away from using ParallelSets for now.
What I noticed a while ago is that the visualization language Protovis has an example of parallel coordinate plots. I encourage everyone who is interested in visualization to play with Protovis. It’s from a group at Stanford and uses JavaScript. I’ve followed the first tutorial for Protovis published by Dr. Robert Kosara, and it’s pretty cool.
So I’ve got my visualization idea. How do we get the data out of JIRA? In the past I’ve gotten information about JIRA defects into a spreadsheet and maybe also CSV. One of the reasons I liked Atlassian so much before they hired me was that data exports pretty cleanly from JIRA. There is no way to overstate how much easier that makes visualization. Unfortunately, once I got the data out, it did not work in a treemap. Even though getting the data out through the UI is not that bad, I’d like to try something I’ve done some fiddling around with in the past year: REST.
JIRA has just released 4.2, which contains a new version of their REST API, and I love the documentation they have for it. It makes working with the API extremely accessible, much like the docs on Twitter’s API. They’ve got curl examples and a script you can use to make a graph of links. They also have a page for simple REST examples which is not completely filled in.
Here’s what I’m gonna do:
Use the new JIRA REST API to create a data set in Ruby to be used for creating a parallel coordinate plot of JIRA issues. If I’m lucky, I’ll get 20% time to do this. I know some guys who, I’m guessing, will help me get the example in Protovis to work with the JIRA data. I bet I can have something together by next month. The goal would be to provide a Ruby example for the page on “The Simplest Possible JIRA REST examples” along with some JavaScript that shows the data in a parallel coordinate plot using Protovis.
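As a rough idea of what the Ruby half of that plan might look like, here is a minimal sketch of the data-shaping step only. It assumes the issues have already been fetched and reduced to a flat JSON array; the sample payload, its field names, and the issues_to_records helper are all hypothetical, and the real JIRA REST response is shaped differently. The output is an array of records with one value per axis, which is roughly the kind of input a parallel coordinate plot expects.

```ruby
require "json"

# Hypothetical, simplified issue data. These keys are assumptions for
# illustration only and do not mirror the actual JIRA REST response.
SAMPLE_JSON = <<~JSON
  [
    {"key": "DEMO-1", "priority": "Major", "status": "Open", "component": "UI", "comments": 3},
    {"key": "DEMO-2", "priority": "Minor", "status": "Resolved", "component": "API", "comments": 0},
    {"key": "DEMO-3", "priority": "Major", "status": "Open", "comments": 7}
  ]
JSON

# One entry per axis of the plot (an assumed choice of dimensions).
DIMENSIONS = %w[priority status component comments].freeze

# Keep only the chosen dimensions for each issue, filling gaps with
# "unknown" so that every record has a value on every axis.
def issues_to_records(issues, dimensions = DIMENSIONS)
  issues.map do |issue|
    dimensions.each_with_object({}) do |name, record|
      record[name] = issue.fetch(name, "unknown")
    end
  end
end

records = issues_to_records(JSON.parse(SAMPLE_JSON))

# Emit JSON that a JavaScript front end could load as its data set.
puts JSON.pretty_generate(records)
```

Filling missing fields with a placeholder instead of dropping the issue is a design choice: a line in a parallel coordinate plot needs a position on every axis, and an explicit "unknown" keeps incomplete issues visible.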
Readers are officially allowed to hassle me about finishing this before Christmas 2010 on twitter and in the comments here.
I’ve done quite a lot of analysis of software defects (the software I test is visualization software), and I don’t have an answer for you.
But I have a question for you:
What do you want to find out about the data?
I have not seen any test-related visualization that in itself is great. You always need to know about the details in order to tell what it really means. You also need to know a lot about your data (i.e. your project/product) in order to set up the visualizations that help you analyze it. And depending on the data and your questions, the visualization needed could equally well be line charts, heat maps, bar charts, scatter plots, etc. Maybe you need to try several in order to see what you already had a hunch about.
Probably you need to dig deeper sometimes, and then step back to the whole picture.
Besides looking for trends, patterns, outliers and unanticipated relationships, I have learnt to ask: what is not there?
So my point is: visualization is a tool for analyzing your specific data.
If you have success in just doing visualization, I will be overwhelmed.
Thanks, Rikard, for commenting. I’m now making guesses about which software you test ;)
The question you pose is, indeed, the most important question one can ask of a visualization. My approach, however, is one of seeking something new from the data I have about defects. Is there something else they can tell us? Do we need to change the data we collect? I don’t know.
At this point, I’ve decided that bug counts, much as testers may complain about them, are not going away. As long as there are bugs and we perceive them as discrete units, there will be some form of counting them.
Perhaps the question I am looking to answer with parallel coordinate plots is: can we give bugs more context? What if you could see more about the bug count in addition to the count itself?
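One way to picture a count with more context is to tally bugs per combination of attributes instead of as a single total. The Ruby sketch below does this with made-up defects; the dimension names and the count_by helper are assumptions for illustration, not anything taken from JIRA.

```ruby
# Made-up defects; the dimension names are assumptions for illustration.
FAKE_DEFECTS = [
  { priority: "Major", component: "UI",  status: "Open" },
  { priority: "Major", component: "UI",  status: "Resolved" },
  { priority: "Minor", component: "API", status: "Open" },
  { priority: "Major", component: "API", status: "Open" }
].freeze

# Count defects per combination of the given dimensions. The result maps
# an array of values (one per dimension) to the number of defects with it.
def count_by(defects, *dimensions)
  defects.each_with_object(Hash.new(0)) do |defect, counts|
    counts[dimensions.map { |name| defect.fetch(name) }] += 1
  end
end

# Print each priority/component combination with its count.
count_by(FAKE_DEFECTS, :priority, :component).each do |combo, n|
  puts "#{combo.join(' / ')}: #{n}"
end
```

The same total is still there if you add the numbers up, but each count now says which priority and component it belongs to.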
Aside from that, I am eager to improve my ability to get data about issues and tests. One obstacle to my visualization hobby has been acquiring lots of data quickly. I’m tired of requiring a spreadsheet application any time I want to visualize something. I’d rather be able to get data through API calls and focus more on the front-end side of information design. I’m sure most of this project will be spent on getting the data out of JIRA and formatting it correctly for any type of visualization.
I haven’t seen test visualization that is great either, but I love both testing and visualization and so have decided that it is an area I’d like to try and improve. I expect to fail a lot. I expect a lot of criticism (and I’m not disappointed so far). But if I am ever to get something worthwhile out of these two topics, if there is ever to be success, I’m convinced that the failures are essential. I do my best to take a lesson from them and wear the failures like a great pair of jeans that has been patched a few times.