Where’s Ur Data?


[Hat Tip: Chris McMahon for pointing out this awesome video.]

The response I’ve gotten from the tiny bit of work I’ve done with software testing and visualization has changed the direction of my life and shown me that I am not the only visual thinker in software/software testing.  We don’t want to look at tables any more…we want pictures!  We want line and color!  We don’t want to be limited by text in our exploration and creation of software.

So where’s our visual jet-pack?  I mean, I might do crazy viz stuff at home, but at work, I’m still analyzing tables, bug counts and raw data every day.  What gives?

Unfortunately, the only answer I have to this question is another question.  Where is your data?  How do you access information about your tests?  In my experience, most of the data I’ve seen has been in a spreadsheet or on a screen, as in, “Hey you!  You want a piece of me, girlie, you gotta copy and paste!”

In a recent interview on the blog Indirect Collaboration, Shawn Allen of Stamen Design was asked about the challenges of working with data.  His response rang so true that it’s the main point of this blog post.  For Mr. Allen, “just getting the data in the first place is the most difficult part of the process, regardless of the source.”

Amen!  Hosanna!  Bravo! and Thx, dude!!!

Every visualization I’ve worked on has included a significant challenge in the very first step: gathering the data.  If you are working with one table or with tables that play nice, u r lucky.  More often than not, the interesting stories are teased from disparate sources in disparate formats.  In general, there is nothing pretty at all about the raw data behind some of the most intriguing visualizations.

Mr. Allen goes on to say that in the case of Stamen’s Crimespotting project, this challenge was overcome by being provided with a KML feed.  The keyword here is feed.

Do you have a feed for your defects?  I know I didn’t have one at my last job, and wouldn’t have known what to do if one had hit me on the head.  This is, however, the world we live in.  When I gave my talk on “Visualizing Software Quality” at Microsoft, one of the comments I got was that my work did not include any type of real-time feed.  It was a wake-up call for me and a challenge I’m working on.

Part of this challenge lies in learning to work with standard formats for data.  Allen mentions KML.  There are also XML and JSON.  I have heard testers bemoaning the “pointy things” of XML, and I hope that we’re past the kvetching.  I’ll admit that I’ve mostly used tab- or comma-delimited files for my work, but I’m REALLY over them.  Just because I know enough about regular expressions to wear the XKCD t-shirt doesn’t mean I want to spend my life parsing data with them.
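To make the point about standard formats concrete, here is a minimal sketch in Python, using only the standard library.  The defect records and their field names (id, severity, component) are made up for illustration; they are not from any real tracker.  The idea is simply that a real parser handles JSON, and even quoted commas in delimited data, without a single regular expression.

```python
import csv
import io
import json

# Hypothetical defect records; ids and field names are invented for this sketch.
JSON_TEXT = (
    '[{"id": "BUG-1", "severity": "high", "component": "login"},'
    ' {"id": "BUG-2", "severity": "low", "component": "search, advanced"}]'
)

# The same records as comma-delimited text. Note the quoted field that
# contains a comma, which is exactly what trips up a naive split or regex.
CSV_TEXT = (
    "id,severity,component\n"
    "BUG-1,high,login\n"
    'BUG-2,low,"search, advanced"\n'
)


def load_json_defects(text):
    """Parse a JSON array of defect objects into a list of dicts."""
    return json.loads(text)


def load_csv_defects(text):
    """Parse comma-delimited defects (with a header row) into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


json_defects = load_json_defects(JSON_TEXT)
csv_defects = load_csv_defects(CSV_TEXT)

for defect in json_defects:
    print(defect["id"], defect["severity"], defect["component"])
```

Both loaders end up with the same list of dictionaries, so whatever draws the picture doesn’t need to care which format the data arrived in.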

If we are ever to be successful at visualizing software quality, we must have feeds from our tests, defects and even the code we are testing.  I don’t want to spend my time figuring out how to get my testing meta-data to play nice.  I would much rather spend my time figuring out which data belongs together and understanding the story it tells.  After all, that is the real value in visualization, no matter what type of picture we create at the end.
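As a rough sketch of what consuming a defect feed might look like, the Python below reads a small RSS-style document and counts defects by severity.  Everything about the feed is an assumption: the sample XML is invented, and I am pretending that each item’s category element holds a severity.  A real tracker’s feed will have its own layout, and in practice the text would be fetched from a URL rather than pasted into the script.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample feed; a real defect feed will be structured differently.
FEED_XML = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Hypothetical defect feed</title>
    <item><title>Login button unresponsive</title><category>high</category></item>
    <item><title>Typo on settings page</title><category>low</category></item>
    <item><title>Crash on empty search</title><category>high</category></item>
  </channel>
</rss>"""


def count_by_category(feed_xml):
    """Count feed items by the text of their <category> element."""
    root = ET.fromstring(feed_xml)
    counts = Counter()
    for item in root.iter("item"):
        # Items without a category are grouped under a placeholder label.
        counts[item.findtext("category", default="uncategorized")] += 1
    return counts


severity_counts = count_by_category(FEED_XML)

for severity, count in sorted(severity_counts.items()):
    print(severity, count)
```

The counts are the boring part.  The point is that once the data arrives as a feed, the time goes into deciding which numbers belong together instead of into copying and pasting.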


2 thoughts on “Where’s Ur Data?”

  1. I hope you’ll be pleasantly surprised by the “feeds” of data that your new employer’s products can produce. Almost every interesting or useful thing in JIRA can be pulled out via an RSS feed.

    In the absence of a useful feed, maybe you could try bringing gifts to the resident toolsmith or script-lover: I can never resist a data sculpture challenge! :)

  2. James is correct and, in fact, one of the reasons why I think Atlassian’s products are so great is that they offer multiple access points for users.

    In my treemaps project, I found myself seeking out projects using JIRA because the data was so much easier to scrape.

    Even at the stage where I was copying and pasting a screen, their data was much cleaner than other applications. This is a very big deal because it goes back to my 2nd post on quality (what is quality? what is art? part deux. …part trois is in the works).

    At some point, any product’s most creative users (I called them gatecrashers in my post) will find a way to use that product in a way not originally intended. A really great product will have a decent level of tolerance for this.

    Feeds are one way of providing this tolerance and a capability that, I believe, sets Atlassian apart from many other companies. Go Team!
