Testing and Coding…Concurrently
Software Testing
CAST 2010: Software Testing the Wiki Way
Jul 27th

- Image by Lachlan Hardy via Flickr
On Saturday, I will be flying from Sydney to Grand Rapids, Michigan for this year’s CAST. I feel quite privileged for 2 reasons: the organizers of CAST have included a presentation of mine in their scheduled sessions and Atlassian (in particular my boss at Atlassian) have seen enough potential in the presentation to fund my trip.
Software testing with a wiki is a topic that I started working on quite a while ago, and it’s still an ongoing process. My presentation began as a potential blog post last Winter. After a bit of nudging by Matt Heusser and Chris McMahon, I submitted my would-be blog post as a CAST session and it was accepted. It is quite humbling to be approached and encouraged by testers/people I admire the way I admire Chris and Matt. I was also humbled by the retweets I got from testers showing their desire to see me present at CAST.
I could take the easy way out with my topic and write up another “talking head” presentation, but I’ve decided to go all WTANZ and do this thang Aussie-style (fearlessly and head-first). I’ve made a prezi which I will quickly present at the beginning of the session. I’m posting it here, because I hope everyone looks at it before they show up. I hope y’all bring your laptops to the session charged and ready. After I breeze through the prezi, the rest of the session will be all testing. It will be done through a wiki. It will be fun.
WTANZ 06: Visually thinking through software with models
Jul 11th
Last year, I visited Microsoft to give a presentation. Alan Page, one of the authors of How We Test Software at Microsoft, was my host. When he introduced me to the audience, he gave me an autographed copy of his book and a pad of quadrille paper (paper with squares instead of lines). He told me it was for drawing models. Apparently this is quite the popular way for testers to understand software at Microsoft (and I hear, at Google as well). I’ve read a lot of HWTSM but I must admit, I had not looked very closely at the model-based testing chapter.
The paper Alan gave me has made it to Australia, and I’ve been using it for keeping up with life and stuff. Every time I look at it, however, I keep thinking, “What the hell is this model-based testing about?” So I decided we would check it out for this week’s weekend testing. I told Alan of my plans on twitter and he replied:
Hmm…keeping this in mind, off I went to read his chapter on model-based testing. The models are really just finite state machines. If you’ve taken a discrete math class, you’ve seen these. If you haven’t…it’s pretty simple. Have a look at Wikipedia’s example and you’ll see what I’m talking about. In reading through HWTSM I noticed that emphasis was placed on using models to understand specifications. Dr. Cem Kaner’s sessions with Weekend Testing from a while ago verified this to me. I read through the transcript from Weekend Testing 21 in which Dr. Kaner is describing his suggested use of models.
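To make the finite state machine idea concrete, here is a minimal sketch of a model written as a transition table. The gadget, its two states and its two events are invented for illustration; they are not taken from HWTSM or from any real gadget's specification.

```python
# Hypothetical model of a tiny text-generating gadget.
# Every state and event name below is an assumption made for this sketch.
TRANSITIONS = {
    ("empty", "generate"): "showing_text",
    ("showing_text", "generate"): "showing_text",
    ("showing_text", "clear"): "empty",
}


def run(start, events):
    """Walk the model from `start`; raise if an event is not modelled."""
    state = start
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise KeyError(f"no transition for event {event!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state


def gaps(states, events):
    """List the (state, event) pairs the model says nothing about."""
    return [
        (s, e)
        for s in sorted(states)
        for e in sorted(events)
        if (s, e) not in TRANSITIONS
    ]


print(run("empty", ["generate", "generate", "clear"]))
print(gaps({"empty", "showing_text"}, {"generate", "clear"}))
```

Each pair reported by `gaps` is a question the model cannot answer yet (here, what happens when "clear" is pressed while nothing is showing), which is one way a small model can point at something worth asking about the specification.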
Dr. Kaner suggests 2 ways in which models can be helpful. The primary reason he suggests for using a model is when there is so much information that a tester is having difficulty wrapping their head around all of the possible states they can create and need to test within a system. In this case, model-based testing is used to deal with information overload. It looked to me as though he was less concerned about necessarily having a finite state machine and more concerned with the tester having some way of visually mapping the system in a way that made sense.
The secondary reason for using a model was as a way to approach sensitive devs about holes in their logic. Saying to a dev, “here’s this diagram I made of the system, but I seem to have a gap, can you help me fill this in?” is much less confrontational than approaching them with their spec and telling them they forgot stuff.
Dr. Kaner’s primary reason for using a model intrigued me because it is contrary to Alan Page’s suggestion in HWTSM that models can get too big. Dr. Kaner is using models as a remedy for information overload and he uses a decision tree he made showing reasons to buy or sell stock as an example. It’s not a small picture, but maybe that’s because I’m not used to testing with models or even looking at them on a daily basis. Here’s what Alan has written about the size of models:
Models can grow quite quickly. As mine grow, I always recall a bit of advice I heard years ago: “‘Too Small’ is just about the right size for a good model.” By starting with small models, you can fully comprehend a sub-area of a system before figuring out how different systems interact.
-Alan Page, How We Test Software at Microsoft. p162 (inspired by Harry Robinson)
This week’s mission is to make some models and compare notes about how successful this strategy is for varying levels of model size/complexity. Since all iGoogle gadgets have some type of specification, I picked a few Google gadgets:
Small: The Corporate Gibberish Generator
Medium: XKCD
Large: Thinkmap Visual Thesaurus
Since models can also be useful for APIs, if anyone is feeling super-geeky, you can try modelling some API calls from Twitter. I just blogged about using Twitter with curl, so that might help those choosing to do API modeling. Alan writes about how modeling can be useful for testing APIs and it made me very curious. (Off topic, I have to wonder: what happens when you throw some models at a genetic algorithm? Learning? Useful tests? Who knows. I’m saving that one for later.)
I also have what I’ll call a modeling “experiment.” This may or may not work. It may or may not teach you something, but I think it will make your afternoon/evening/morning interesting to say the least. This link is to a painting in the Museum of Modern Art. The painting is The City Rises by Umberto Boccioni. As I read about Dr. Kaner’s approach of using modeling to combat information fatigue, I was immediately reminded of this painting. There is so much going on in this painting and, if you explore, you will find relationships that make it a masterpiece. Can a model pull out and define the power of this painting? Let’s find out.
For graphics software, I suggest Gliffy. It is browser based so no worries about operating system, and for our purposes, sign-up is not required.
90 Days of Manual Testing
Jul 4th

- Image by Marlena Compton via Flickr
My “probationary” period at Atlassian has recently finished. This period has lasted 90 days, although I feel like I’ve been here much longer. Lots has happened since I’ve shown up in Sydney. As was pointed out to me on twitter by @esaarem, I’ve participated/facilitated in 4 sessions of Weekend Testing Australia/New Zealand. I’ve turned in a paper for CAST. I flew back to the states for a brilliant Writing-About-Testing conference and I went on a vision quest, of sorts, in the canyons of Grand Gulch.
What hasn’t shown up on my blog is all of the testing I’ve done for Confluence. This testing and work environment is such an utter departure from what I was doing previously. Before, I was looking at a command line all day, every day and writing awk and shell scripts as fast as I could to analyze vast amounts of financial data. This was all done as part of a waterfall process which meant that releases were few and far between. To my previous boss’s credit, our group worked extremely well together and he did as much as he could to get the team closer to more frequent releases.
I am now testing Confluence, an enterprise wiki, which is developed in an Agile environment and completely web-based. I haven’t run a single automated test since I’ve started so it’s been all manual testing, all the time. This doesn’t mean that we don’t have automated tests, but they haven’t been any responsibility of mine in the past 90 days. My testing-focus has been solely on exploratory testing. So what are my thoughts about this?
On Living the “Wiki Way”
Since everything I’ve written at work has been written on a wiki, I haven’t even installed Microsoft Office on my Mac at work. I’ve been living in the wiki, writing in the wiki and testing in the wiki. If the shared drive is to be replaced by “the cloud” then I believe the professional desktop will be increasingly replaced by wikis. Between Atlassian’s issue tracker, JIRA, and Confluence, there’s not much other software I use in a day. Aside from using Confluence to write test objectives and collaborate on feature specifications, I’ve been able to make a low-tech testing dashboard that has, so far, been very effective at showing how the testing is going. I’ll be talking about all of this at my CAST session.
On the Agile testing experience:
For 5 years, I sat in a cubicle, alone. I had a planning meeting once a week. Sometimes I had conversations with my boss or the other devs. It was kind of lonely, but I guess I got used to the privacy. Atlassian’s office is completely open. There are no offices. The first few weeks of sitting at a desk IN FRONT OF EVERYONE were hair-raising until I noticed that everyone was focusing on their own work. I’ve gotten over it and been so grateful that my co-worker, who also tests Confluence, has been sitting next to me.
During my waterfall days, I had my suspicions, but now I know for sure: dogfooding works, having continuous builds works, running builds against unit tests works.
On Manual, Browser Based Testing:
This is something that I thought would be much easier than it was. I initially found manual testing to be overwhelming. I kept finding what I thought were bugs. Some of them were known, some of them were less important and some of them were because I hadn’t set up my browser correctly or cleared the “temporary internet files”. Even when I did find a valid issue, isolating that issue and testing it across browsers took a significant amount of time. All of this led to the one large, giant, steaming revelation I’ve had in the past 90 days about manually testing browser based applications: browsers suck and they all suck in their own special ways. IE7 wouldn’t let me copy and paste errors, Firefox wouldn’t show me the errors without having a special console open and Apple keeps trying to sneakily install Safari 5 which we’re not supporting yet.
Aside from fighting with browsers, maintaining focus was also challenging. “Oh look there’s a bug. Hi bug…let me write you…Oh! there’s another one! But one of them is not important…but I need to log it anyway…wait! Is it failing on Safari and Firefox too?” I don’t have ADD, but after a year of this I might. Consequently, something that suffered was my documentation of the testing I had done. I was happy not to have to fill out Quality Center boxes, but it would be nice to have some loose structure that I use per-feature. While I was experiencing this, I noticed a few tweets from Bret Pettichord that were quite intriguing:
Testing a large, incomplete feature. My “test plan” is a text file with three sections: Things to Test, Findings, Suggestions (1:53 PM Jun 22nd via TweetDeck)
Things to test: where i put all the claims i have found and all my test ideas. I remove them when they have been tested. (1:54 PM Jun 22nd via TweetDeck)
Findings: Stuff i have tried and observed. How certain features work (or not). Error messages I have seen. Not sure yet which are bugs. (1:55 PM Jun 22nd via TweetDeck)
Suggestions: What I think is most important for developers to do. Finishing features, fixing bugs, improve doc, whatever. (1:57 PM Jun 22nd via TweetDeck)
This is something I’m adding to my strategy for my next iteration of testing. It made me laugh to see this posted as tweets. Perhaps Bret knew that some testing-turkey, somewhere was gonna post this at some point. I’m quite happy to be that testing-turkey as long as I don’t get shot and stuffed (I hear that’s what happens to turkeys in Texas). After I do a few milestones with this, I will blog about it.
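The three-section text file from Bret's tweets could be sketched in code along the following lines. This is only an illustration under assumptions: the class name `SessionNotes`, its methods and the sample entries are all invented here, and a plain text file edited by hand does the same job.

```python
from dataclasses import dataclass, field


@dataclass
class SessionNotes:
    """Hypothetical container mirroring the three sections in the tweets."""
    things_to_test: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    suggestions: list = field(default_factory=list)

    def tested(self, idea, observation):
        """Remove an idea once it has been tested and record what was seen."""
        self.things_to_test.remove(idea)
        self.findings.append(observation)

    def render(self):
        """Return the notes as plain text, one titled section after another."""
        sections = [
            ("Things to Test", self.things_to_test),
            ("Findings", self.findings),
            ("Suggestions", self.suggestions),
        ]
        lines = []
        for title, items in sections:
            lines.append(title)
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        return "\n".join(lines).rstrip() + "\n"


# Made-up entries, only to show the flow from idea to finding.
notes = SessionNotes(things_to_test=["paste into editor", "rename page"])
notes.tested("paste into editor", "paste keeps formatting in Firefox")
notes.suggestions.append("document the supported browsers")
print(notes.render())
```

The only behaviour the sketch adds over a bare text file is the move from "Things to Test" to "Findings", which matches the habit of removing ideas once they have been tested.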
Because of my difficulties with maintaining focus, I’ve now realized that while it’s easy to point the finger at developers for getting too lost in the details of their code, it’s just as easy for me, as a tester, to get lost in the details of a particular test or issue. I am a generalist, but I didn’t even notice that there was a schedule with milestones until we were nearly finished with all of them. That’s how lost I was in the details of everyday testing. Jon Bach’s recent blog post resonates with me for this very reason. He writes about having 20 screens open and going back and forth between them while someone wants updates, etc. Focus will be an ongoing challenge for me.
One of the few tools that I’ve felt has helped me maintain my focus is my usage of virtual machines for some of the browsers. They may not be exactly the same as using actual hardware, but being able to copy/paste and quickly observe behavior across the different browsers was hugely important in helping me maintain sanity.
The past 90 days have been intense, busy and fascinating in the best possible way. Does Atlassian’s culture live up to the hype? Definitely. I’ve been busier than at any other job I’ve ever had, and my work has been much more exposed, but I’ve also had plenty of ways to de-stress when I needed it. I’ve played foosball in the basement, I laughed when one of our CEOs wore some really ugly pants that I suspect were pajama bottoms to work, I got to make a network visualization for FedEx day and my boss took me out for a beer to celebrate the ending of my first 90 days. I like this place. They keep me on my toes, but in a way that keeps me feeling creative and empowered.
By the way, Atlassian is still hiring testers. If you apply, tell ’em I sent ya.
What is quality? What is art? Part deux
Feb 12th
I’m so appreciative of the discussion that developed from my previous post. I could see that people commenting were really digging deep, so I decided to address some of what was said in this follow-up post.
Here are some of the comments about the definition of quality:
Michael Bolton shared his perspective on Jerry Weinberg’s definition: “To be clear, Jerry’s insight is that quality is not an attribute of something, but a relationship between the person and the thing. This is expressed in his famous definition, ‘quality is value to some person(s).’ ”
Rikard Edgren’s definition: “Quality is more like “good art” than “art”, but anyway: I can tell what “quality to me” is when I see it. I can tell what “quality to others” is when I see it, if I know a lot about the intended usage and users.” Rikard also wrote a post where he clarifies his position a bit.
Andrew Prentice wrote about what he feels is missing from Weinberg’s definition: “I like Weinberg’s definition of quality, but I’m not convinced that it is sufficient for a general definition of quality. Off the top of my head I can think of two concepts that I suspect are important to quality that it doesn’t seem to address: perfection and fulfillment of purpose.”
The definition of quality that I learned is from Stephen Kan’s book, Metrics and Models in Software Quality Engineering. Interestingly, Kan shows a hearty and active disdain for what he says is the “popular” definition of quality. “A popular view of quality,” he writes, “is that it is an intangible trait—it can be discussed, felt, and judged, but cannot be weighed or measured. To many people, quality is similar to what a federal judge once commented about obscenity: ‘I know it when I see it.’” This is sounding familiar, no? Here is where the pretension begins to flow: “This view is in vivid contrast to the professional view held in the discipline of quality engineering that quality can, and should, be operationally defined, measured, monitored, managed, and improved.” Easy, tiger. We’ll look at this again later.
Jean-Leon Gerome’s painting of Pygmalion and Galatea brings this discussion to mind. This is a link to the myth of Pygmalion and Galatea.
I’ve seen this painting in person, at the Met. Interesting to note is that the artist was painting himself as Pygmalion in this painting. (and I like listening to “Fantasy” by the Xx while I look at this.)
The relationship in this painting is not limited to the one between Pygmalion and Galatea; the viewer is drawn into the relationship as well, and the artist himself is also participating. In this painting, Pygmalion has been completely drawn in by his own creation. The artist was so drawn in by the story that he painted himself into it. I was and am still so drawn in by the painting that it is simply painful for me to tear my eyes away from it. It slays me. When I see it, I feel the painting. I guess you could say that emotion is an attribute of this painting, but in this case, I think it’s more. In this case, the emotion is the painting. Why else does the painting exist? Would this painting work at all if the chemistry were missing? I don’t think it would. What Gerome has accomplished here is the wielding of every technique at his disposal to produce a painting with emotion as raw, basic and tantalizing as the finest sashimi.
But there is more to this relationship than just the fact that Gerome has painted himself as Pygmalion. Let’s examine the relationships that exist in this painting and what they tell us. Starting with just the painting, itself, we have the man and the woman locked in their embrace. They are surrounded with many objects. (I encourage all readers to click through to the Met’s web site. Looking at their web-site, if you double click on the painting, you can move around and zoom in and out to get a closer, more focused look.) What do you notice about all of the objects in the room? I’ve no doubt that some of you are wondering if these objects take away from the focus in the painting. If that were the case, if the painting consisted of only the man and the woman, how would we know that the man was an artist? So why do we need these particular objects? The painting could be restricted to just the hammer and chisel so what’s with all the stuff? This is where our relationship with the painting deepens should we choose to follow the breadcrumbs…
An overview of Gerome’s life clarifies his choices. As a young artist, he spent a year in Rome which he felt was one of the happiest years of his life. At the time that Pygmalion and Galatea was painted, Gerome was grieving over the deaths of several relatives and friends. By surrounding himself with artifacts from his youth, the artist is traveling back in time to a younger, more “Roman”-tic time in his life. However depressed he may have been when he painted this, Gerome was also experiencing an artistic breakthrough in his sculpting career. Notice the breakthrough in the painting? Now that you know a bit more history, how do you feel about the painting? Does it change your perspective? This has made the painting very introspective for me. The emotion that flows from this depiction of romantic love is one of vitality and power. Perhaps Gerome is evoking these feelings as a way of tapping into his own creative powers. I remember thinking to myself when I first saw this painting at the Met, before I knew anything at all about it, “She is rescuing him.”
To describe quality as a relationship gives it a larger meaning and captures something neglected and dismissed by the literature of the “software crisis” era, e.g., books such as Stephen Kan’s. Is quality as a relationship mutually exclusive to quality being an attribute of software? I don’t agree with describing quality as just an attribute. To say that quality is an attribute de-emphasizes the holistic approach to quality I try to take and for which I’m assuming Michael, Jerry Weinberg (going by his definition here only), agile, context, et al. are striving. (Full disclosure: I haven’t read any of Jerry Weinberg’s books. That does NOT mean they are not on my list. I just got out of school and the only thing I’m reading lately is visa paperwork so give me a break here.)
The software we test has its creators and has an audience of users as well. Just as Gerome had his own relationship with this painting, developers know what they want to see which leads to the building of their own relationship with the software they make. How does this affect the relationship between the software and its audience?
How does value fit into this? I value the painting because of how it makes me feel when I look at it. After the examination I did, I now understand why I value the painting. As someone who is constantly seeking artistic inspiration, I am happy to go where Gerome and his muse take me. What does this say for value in software? Does the relationship between an audience of users and software create value for the audience members whether they are paying guests or not? The more I dig into this definition, the more I like it because it allows for gatecrashers, those who we did not think would be using our software, but who may find it so invaluable, they become our software’s greatest fans.
I’m going to marinate on this while I think about the 2nd part of Andrew’s comment, namely, that Mr. Weinberg’s definition of quality does not address perfection and fulfillment of purpose. After all, Kan’s two definitions of quality, “fitness for use” and “conformance to requirements,” are fairly widely accepted in software.
What are you thinking? Is there something missing from Jerry Weinberg’s definition? How does measurement fit into what I’ve been writing about if it fits at all?
I leave you to think about this and the painting above. If you haven’t already, take a few minutes to click through and take a good, honest, languorous look. Put down the twitter, the kid, the spreadsheet, the reality tv show. Take some deep breaths and give yourself a few moments alone with Pygmalion and Galatea.
to be continued…
Look Up, Don’t Look Down: Testing in 2010
Jan 2nd

- Image by -Alina- via Flickr
This post reflects what I’d like to see for software testing in 2010. It is a purely selfish list. Most of what I’ve written about below will find its way into my blog over the next year. The list is not in a particular order, that’s why I excluded numbers for each item. I’m just so damn excited about all of it. (and yes, I stole the title from TonchiDot) Btw, I’ve changed my template, my “about” page and my blogroll.
How does my list compare with what you would like to see?
Testers get fed up with their massive tables of data and turn to visualization
Ok, so no surprise here, but I wouldn’t have picked it for a thesis if I didn’t think it was important. Testing meta-data is all around us, and we’ve yet to fully make sense of it. What is it trying to tell us? If we don’t want to boil everything down to a metric number, that doesn’t mean that the meta-data or the secrets it keeps are going away. In reality, we will only have more meta-data. The challenge lies not only in getting our data into a visualization but also in knowing what and how to explore without wasting time. When should we use a scatterplot vs. treemap vs. plain-and-simple bar graph? This goes way beyond anything the Excel wizard will tell us, but that doesn’t mean we won’t need a little magic.
Functional Programming Shows Up on Our Doorstep
I’ve been seeing devs tweet about FP all year, and I’m quite jealous. If a dev gave you unit tests written in Haskell or Erlang, what would you do? Testers aren’t the only ones with meta-data overdrive. Our massively connected world is producing too much info to be processed serially. Get ready for an FP invasion. Personally, I’m looking at Scala.
Weekend Testing Spreads
Indie rock fans will smell BS if they see an indie rock countdown for 2009 without Grizzly Bear (had to work it in somehow). Weekend Testing is obviously the Grizzly Bear of Software Testing for 2009 and their momentum sets a blistering pace. Markus Gaertner has just announced that it’s expanding to Europe and I’m certain it will spread across the Pacific as well. This is a bottom up method for learning how to test, and I hope that instructors of testing take note. I am no expert at testing and want to do whatever I can to set the bar as high as possible. Hey Weekend Testers, count me in!
Testers who don’t blog start to care about their writing skills
With an emphasis on tools that get software process out of our frakking way, we’ll be left with our writing. Ouch. What’s a comma splice? Hey, I’m going for my Strunk & White. All the great collaboration tools in the world aren’t going to help us if our writing skills suck.
Links Between the Arts and Software Testing Will Be Strengthened
Chris McMahon started us off with his chapter in Beautiful Testing. Shrini Kulkarni blogged about learning the power of observation by looking at art. I’ve been reading about exploratory analysis using data and visualization. By the end of the year, I want software testers besides those of us who self-identify as arty or musical to be talking about why arts education is vital for being a good software tester.
More testers start to care about understanding the fundamentals of measurement and the basics of statistics
Think fast: What is the difference between ratio and proportion? When does the mean not tell an accurate story about a set of numbers? It’s very clear that there are some serious pitfalls in the usage of metrics. What I haven’t seen is lots of testers that have a thorough understanding of basics such as levels of measurement or what a distribution will tell you. I wonder how many testers back away from using these because they don’t understand exactly how they can be harmful or because they just don’t understand exactly how they work in the first place. One assignment I’ve given my blog for the year is to tackle some basics as applied to testing. Rejecting metrics because you see how they can harm is one thing; rejecting metrics because you don’t understand them is unfortunate. If you count yourself as a tester who is not totally comfortable with math, you’re not alone and, believe me, I understand how you feel.
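The two "think fast" questions can be answered with a few lines of arithmetic. The sketch below uses made-up numbers, not data from any real project: seven bugs and the days each stayed open, plus invented counts of failed and passed tests. It assumes the common usage in which a ratio compares two separate quantities and a proportion is a part divided by its whole.

```python
from statistics import mean, median

# Made-up data: days each of seven bugs stayed open. One bug lingered.
days_open = [1, 2, 2, 3, 3, 4, 90]

average = mean(days_open)    # pulled upward by the single 90-day bug
typical = median(days_open)  # the middle value, untouched by the outlier

# Made-up counts for one test run.
failed, passed = 5, 20
ratio = failed / passed                  # failed compared to passed
proportion = failed / (failed + passed)  # failed as a share of all tests

print(f"mean={average}, median={typical}")
print(f"ratio={ratio}, proportion={proportion}")
```

With these numbers the mean is 15 days while the median is 3, so a report that quotes only the mean would suggest bugs usually stay open for weeks when six of the seven closed within four days. The ratio of failed to passed is 0.25, while the proportion of tests that failed is 0.2; the two are easy to mix up in a status report.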
Collective Intelligence Comes into Play
If I had my way, this list would be vote-able and each reader would have the ability to vote items to the top or bottom. Wouldn’t that be interesting? Unfortunately, I don’t have that…today ;o) But we’re so close! If we’ve got the technology together to analyze the hell out of our blogs through web analytics, what about our tests? I’m picturing myself writing out tests in a wiki with a zemanta-like tool suggesting tests from similar stories that have previously caught bugs. I might not always use these suggested tests, but it would be a great help for brainstorming.
I’ll have an Open Source Project Up and Running for Visualizations to be Used with Testing
This is not a resolution, it’s something I didn’t finish from last year. I am just so late on this. Oh well, giving myself a conduct cut. Seems I had a little conference talk to deal with which quickly morphed into a little talk at Adobe, followed by a little talk at Microsoft. Needless to say, I’ve got some unfinished business that has to do with treemaps. The PNSQC experience was a semester in and of itself. Time to get back into the visualizations.
It’s not lost on me that my last few posts have been sort of personal and high-level. I’ve had big changes and events happening in my life, which has made maintaining focus, well, difficult. You’ll hear all about it soon enough. Trust me, it’ll be good.
Visionary Testing: When Blogs Collide
Oct 12th
What the hell does some ancient chick in a dress have to do with software testing? I’m not paid to look at artsy-fartsy pictures! I’m paid to break stuff and pass it on to the devs to figure out!
Is that so?
How many times have devs come back to you for clarification on a bug report you’ve written? How much does the testing you do depend on your ability to notice not only the functionality of an application but the relationships among different functionalities? You see the chick in the dress? She was painted by Artemisia Gentileschi, another chick in a dress who was fairly bitter about life in general, and with good reason. Paintings hide plenty of secrets, just like software applications hide plenty of bugs. As software testers, sometimes we have prior knowledge of the story and sometimes we don’t. Regardless, our task is to ensure that the story makes sense for users, and when it doesn’t we have to report to the developers what is not making sense.
Three of the blogs in my testing blog folder on Google Reader (the blog roll posted here needs an update) contained posts this week that fit together incredibly well. I think they fit together because they highlight the need in software testing for observation and communication skills.
The first is Shrini K’s blog, Thinking Tester. Shrini blogged about “Necessary Tester Skills” and included this link to an article on the Smithsonian Magazine’s web-site. It’s about police officers in New York City taking a class about observation taught by an Art History scholar, and is a very rewarding read. What these officers are getting out of their trip to the Met is a lesson in how an effective description can radically change outside perception. That’s all I’m gonna say because I think you should read it.
The second post was written by Catherine Powell on her blog, Abakas. Catherine is writing about “Magic Words” in testing. I’ve seen this stuff defined in my metrics textbook and other various places, but what Catherine adds is her $2.00 on how these words are generally perceived.
Put these together with Elisabeth Hendrickson’s astounding post on Test Case Management systems, and I see the writing on the wall, or ahem, wiki. Why shouldn’t we eventually communicate our test efforts by writing down, in a somewhat domain-specific language, what we see an application doing? If we are writing in a domain-specific language and we have semantic web “stuff” at work, behind-the-scenes, why wouldn’t our stream-of-consciousness writing turn into tests and defects? Having a language, however, won’t matter at all if we lack the ability to employ careful observation in our testing.
I have a challenge for you. There’s no prize involved, but you might find yourself feeling rewarded. I challenge you to find a work of art be it a painting, sculpture, installation or anything you deem “art-worthy” and study it. This can be in a museum, a coffee house or your mom’s living room. Once you feel you have an understanding of what you are looking at, try to communicate your understanding with words. Extra points to you if you can also communicate what you’ve written in a language other than your own. Did you write about events taking place or were you describing some objects on a table? Were you thinking about light and shadow or did the materials used in the art catch your eye? If you are describing a portrait, does the painting possibly capture more of the person’s spirit than a photograph?
Is this really so different from trying to communicate what you’ve noticed in a test?
How to Solve It: The Tao Te Ching of Testing
Sep 30th
A few weeks ago, I wrote about tearing down all of my initial ideas about automated testing and even testing in general. Even though I’ve decided the automation I was building was taking my testing down a road I don’t want to travel, development and project plans continue. We have CM resources looking at my automated tests for consumption as smoke tests. I have to move onward.
In rebuilding my system and my ideas about testing, I’ve pulled in a resource given to me well over a year ago by my good friend, Gordon Shippey. When I was flailing around as an Absolute Beginner in testing, Gordon noticed this table from Wikipedia pinned above the monitor in my cubicle. I’m sure I found this through Slashdot or some equivalent. Gordon, who has done a lot of research in Artificial Intelligence and psychology, showed up in my cube the next day with the book How to Solve It by Georg Polya. I’m ashamed to say that it’s been languishing on my shelf for the better part of a year and a half.
The ambiguity of the title reminds me of a book I and my classmates were forced to read in college, How We Know. (That professor, by the way, was a Taoist.) I recognize that there are good reasons for these generic sounding titles, but they intimidate me because they are so general and imply more depth than I typically look for in my reading. I am just NOT the type to sit around pondering everything. If Gordon hadn’t placed Polya in my hands and shown me how accessible it is, I don’t think I would have come back to it. In actuality, it appears Polya intended that his book be read in small sections of no particular order. For someone like me who cannot find the time to read anything straight through these days, this is exactly what I need and is partly why I consider this book the Tao Te Ching of Testing.
The other reason I am calling this book the Tao Te Ching of Testing is because of the attitude with which it was written. This book does not come from ego. In my limited reading and pondering of the Tao Te Ching, I noticed that a very strong message was, “check your ego at the door because this world is not all about you.” In Maslow’s hierarchy of needs, helping others is at the top of the self-actualization pyramid. Georg Polya has long departed this earth, but his honest interest in helping people solve their problems gives me the impression that he didn’t suffer from the expanding head syndrome that afflicts many great thinkers and some great testers too.
After getting past the title, I started asking what this book has to do with testing. Aren’t the developers the ones solving the problems? Well, yes, sort of, and they ought to be reading this too. Regarding testing, I think that this book is an aid in going down the path of exploratory analysis. Y’all saw me write about that, and I’m still figuring it out.
Consider this: Polya is writing about ways to investigate problems in order to solve them. Let’s shorten that: Polya is writing about ways to investigate. Semantics…gotta love ‘em. Sometimes.
Since I now know I can automate the hell out of just about anything I want, it’s time to learn more about investigating and questioning. Umm, I guess that would be testing. This is not to say that I’m abandoning automation altogether. In fact, I’m also pondering this blog post written by Bj Rollison a couple of years ago that discusses balance between automation and manual testing. One thing is for sure, learning how to test has been far more challenging than all of the programming classes I took put together.
Visualizing Defect Percentages with Parallel Sets
Sep 24th
Prof. Robert Kosara’s visualization tool, Parallel Sets (Parsets) fascinates me. If you download it and play with the sample datasets, you will likely be fascinated as well. It shows aggregations of categorical data in an interactive way.
I am so enamored with this tool, in particular, because it hits the sweet spot between beauty and utility. I’m a real fan of abstract and performance art. I love crazy paintings, sculptures and whatnot that force you to question their very existence. This is art that walks the line between brilliant and senseless.
When I look at the visualizations by Parsets, I’m inclined to print them off and stick them on my cube wall just because they’re “purty.” However, they are also quite utilitarian as every visualization should be. I’m going to show you how by using an example set of defects. Linda Wilkinson’s post last week was the inspiration for this. You can get some of the metrics she talks about in her post with this tool.
For my example, I created a dataset for a fictitious system under test (SUT). The SUT has defects broken down by operating system (Mac or Windows), who reported them (client or QA) and which part of the system they affect (UI, JRE, Database, Http, Xerces, SOAP).
Keeping in mind that I faked this data, here is the format:
DefectID,Reported By,OS,Application Component
Defect1,QA,MacOSX,SOAP
Defect2,Client,Windows,UI
Defect3,Client,MacOSX,Database
The import process is pretty simple. I click a button, choose my CSV file, and it’s imported. More info on the operation of Parsets is here. A warning: I did have to revert to version 2.0. Maybe Prof. Kosara could be convinced to allow downloads of 2.0.
I had to check and recheck the boxes on the left to get the data into the order I wanted. Here is what I got:
So who wants to show me their piechart that they think is perfectly capable of showing this??? Oh wait, PIE CHARTS WON’T DO THIS. Pie Charts can only show you one variable. This one has 4.
This is very similar to the parallel coordinate plot described by Stephen Few in Now You See It and shows Wilkinson’s example of analyzing who has reported defects. She was showing how to calculate a percentage for defects. See how the QA at the top is highlighted? There’s your percentage. Aside from who has reported the defects, Parsets makes it incredibly easy to see which OS has more defects and how the defects are spread out among the components. If I had more time, I would add a severity level to each defect. Wouldn’t that tell a story.
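If you want the raw numbers behind the picture, here is a rough sketch of the same kind of percentage calculation in Python. It is not part of Parsets; it only assumes a CSV in the format shown above, and the sample rows and function names are made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in the same column format as the fake dataset above.
SAMPLE_CSV = """DefectID,Reported By,OS,Application Component
Defect1,QA,MacOSX,SOAP
Defect2,Client,Windows,UI
Defect3,Client,MacOSX,Database
Defect4,QA,Windows,UI
"""


def category_percentages(csv_text, column):
    """Return {category: percent of all rows} for one categorical column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}  # no defects, nothing to report
    return {category: 100.0 * n / total for category, n in counts.items()}


def cross_counts(csv_text, first, second):
    """Count rows for each (first, second) pair, e.g. defects per OS and component."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter((row[first], row[second]) for row in rows)


if __name__ == "__main__":
    print(category_percentages(SAMPLE_CSV, "Reported By"))
    print(cross_counts(SAMPLE_CSV, "OS", "Application Component"))
```

The first function gives the "who reported it" percentage for a single column; the second counts pairs of categories, which is roughly what each ribbon between two rows of the Parsets display represents.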
Parallel Sets is highly interactive. I can reorder the categories by checking and unchecking boxes. I can remove a category by unchecking a box if I wish.
By moving the mouse around, I can highlight and trace data points. Here I see that Defect 205 is a database defect for Mac OS X. Although I didn’t do it here, I bet that I could merge the Defect ID with a Defect Description and see both in the mouse over.
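Here is a small sketch of how that merge could be done before import. I have not tried the result in Parsets, so whether the longer label shows up in the mouse over is an assumption; the description text, the reporter and the function name are all hypothetical.

```python
import csv
import io


def add_descriptions(defects_csv, descriptions):
    """Rewrite the DefectID column as 'ID: description' where a description exists."""
    reader = csv.DictReader(io.StringIO(defects_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        description = descriptions.get(row["DefectID"])
        if description:
            row["DefectID"] = f'{row["DefectID"]}: {description}'
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical rows; only the OS and component of Defect205 come from the post.
    defects = (
        "DefectID,Reported By,OS,Application Component\n"
        "Defect205,QA,MacOSX,Database\n"
        "Defect206,Client,Windows,UI\n"
    )
    print(add_descriptions(defects, {"Defect205": "query times out"}))
```

Defects without a description keep their plain ID, so the merged file still imports with the same four columns.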
Parallel Sets is still pretty young, but is just so promising. I’m hoping that eventually, it will be viewable in a browser and easier to share. Visualizations like this one keep me engaged while providing me with useful information for exploratory analysis. That’s the promise of data viz, and Parallel Sets delivers.
Automated Test Confessions
Sep 15th
My life as a tester is evolving and I’m feeling less like a newbie. I’ve also had yet another “James Bach” moment. This time, a friend of mine forwarded me an article her husband had read and passed along to her. He’s a developer who, I guess, is going through the whole, “unit testing: what does it all mean?” phase of life. The email contained a few links. Among them was James Bach’s paper from 1999, “Test Automation Snake Oil.” As I read through what I now know is a classic, I realized that I’d been recognizing some of what Bach writes about in my own tests. His paper highlighted much of what I’ve come to think about my own tests.
At this point, I’ve been a software tester for about two and a half years. From my perspective, this is not a very long time. The past year, however, has been insanely intense for me intellectually and academically. There have been many times during the past year when I have felt myself back in the Interdisciplinary Studies program I took as a Freshman and Sophomore at Appalachian State University. We were given 100+ pages of reading per night, which ranged all over the humanities and sometimes sciences. This reading was in addition to lectures and other “programming” we were expected to attend. Between the Software Engineering classes, the job as Software Tester and the runaway fascination with Data Visualization, I’ve put myself through a similar gamut of reading and working. This time my activities have centered around software, computers and testing. The result of this for my job as a software tester is that I am not the tester I was last year.
At all.
Previously, I was really smitten with HP Quality Center because it gave me structure for which I was desperately searching. This was a great improvement over the massive, disorganized and growing spreadsheets surrounding me that contained all of my test information. All of my tests could finally be organized, and thanks to the HP online tutorial I knew my tests were organized well. I felt liberated! Now I could stop concentrating on how the tests should be organized and concentrate more on the actual testing itself.
This led to the realization that there was NO WAY I would EVER be able to test EVERYTHING. I was frustrated. Why were my test cycles so short? Why did I always feel like a bottleneck? Was I not good enough at testing? Was I not fast enough? “I must find a way to test faster,” I told myself.
After attending the 2008 Google Test Automation Conference, I turned to unit testing and automation. I mean, I can write code. It doesn’t scare me at all. This doesn’t mean that I’m great at it, but I enjoy it enough to spend significant amounts of time doing it. I decided to use my coding skills to write repeatable tests that could be run over and over and over again. After all, I’m pulling my group, by the hair, towards automated builds, and smoke tests have to be automated. Business just LOVES these. I was told that it was making my group look really good to have automated tests. I came out with my system test automation framework written with bash shell scripts and awk and felt so “smaht.” Never mind that I didn’t fully vet my system the way I do the system I test. Never mind that certain pieces of our system are not stable and can change drastically from one release to the next. I just knew there was a big green button at the end of the automation tunnel. I pictured myself pushing CTRL-T.
Then I started using my creation. When I realized how fragile my system was, all I could do was sigh and shake my head at several tests my system was telling me had passed even though I knew they had F-A-I-L-E-D. Not only had they F-A-I-L-E-D, they were false positives. Maybe you’re thinking, “well this must be what happened to her last year.” Uh…no. This was about three months ago.
Now that I realize the fragility of automation, I feel a weight on my back. Even worse, because this automation is perceived as such a “win,” I have fears that my fragile tests will propagate and turn into the suite of tests Bach describes in Reckless Assumption #8: tests that maintainers are scared to throw out because they might be important. I’ve also realized that while I was spending so much time on automation, there was something I forgot. I forgot that I’m supposed to be TESTING. This scared me the most. After all, if I’m not concentrating on assessing my SUT because I’m spending so much time on automating my older tests, how am I really benefitting this project?
Thus, this paper of James Bach’s landed in my mailbox during a very interesting time in my life as a tester. I feel like I’ve been through this whole evolution over the past year of realizing the power of automation, wanting to automate everything and then realizing that I can’t automate absolutely everything, nor should I. These realizations triggered an identity crisis. Am I a developer who is writing tests or am I a tester who likes to develop? I decided that I am definitely the latter, and that I need to back off the hardcore automation for a bit in favor of re-examining my SUT as a manual tester.
My group has recently completed a rather large release, and we’re testing more incrementally. I have fewer features to test with small releases, so I’ve put down the automation for at least the next couple of cycles in favor of straight-up manual testing. I printed out every set of testing heuristics I could find, and have been reading through them to find the most appropriate heuristics for my tests.
What has this meant for my testing? There has been both good and bad. The worst is that Quality Center utterly breaks with this process. I am convinced that Quality Center was not designed for a human being engaged in the cognitive process of exploratory analysis for testing. (My last post was about exploratory analysis.) I think that Quality Center was designed exclusively for the Waterfall process of software engineering. To be clear: that is not a compliment. Another downside is that I have had times when I have been looking at the screen thinking, “what’s next?”
The biggest advantage is that, of the bugs I have found, far fewer have been trivial. Once I removed all thoughts of test automation from my working memory, I have found that much more of my working memory is focused on the process of exploring and testing. I’ve been living through the observation that, “a person assigned to both duties will tend to focus on one to the exclusion of the other.”
The most memorable paragraph in Bach’s paper is at the end. He describes an incredibly resilient system of mostly irrelevant tests. That’s what I was building. I will probably be automating less, but I’m confident that the automation I write will be more relevant.
Underpants Gnomes Among Us: Exploratory Analysis for Visualization and Testing
Sep 9th
Here’s a picture of tester dog, Laika, with Dr. James Whittaker’s new book, Exploratory Software Testing: Tips, Tricks, Tours, and Techniques to Guide Test Design. It showed up on my doorstep last week, and is my first free testing book ever (thanks Dr. Whittaker!)
In reading through Stephen Few’s new book, Now You See It, I came across a completely separate perspective of looking at graphics in an “exploratory” manner. I can literally hold a book preaching the value of “exploratory testing” in one hand and a book preaching the value of “exploratory analysis” in the other. They are the same concept. If you have ever wondered what interdisciplinary means, this is a great example of an interdisciplinary concept.
Stephen Few does a great job of explaining exploratory analysis with pictures:
Half of the people reading this now understand the underpants gnome tie-in. For those who don’t get it, here’s a link to the original South Park clip (NSFW).
Jokes aside, I’m going to start with the picture, discuss what it says to me about testing and see if it meshes with JW’s definition of exploratory testing. I will then look at how this applies to visualization. At the end, the two will either come together or not. At this point, I’m not sure if they will. I’ll just have to keep exploring until I have an answer or a comment telling me why my answer is crap (which is fine with me if you have a good point).
Starting with the picture and testing. I’m assuming the “?” means “write tests.” The eyeball means analyze. The light bulb is the decision of pass or fail. The illustration of directed analysis looks like the process HP Quality Center assumes. QC assumes you’ve primarily written tests and test steps before testing based on written requirements. Then you test. After you’ve tested, you have an outcome.
The second line for “exploratory” analysis looks like a much more cognitive and iterative process. This says that the tester has the opportunity to interact with the system-under-test (SUT) before formulating any tests (eyeball). After playing with the SUT, the tester pokes it with a few tests (“?”). At this point the tester may decide some stuff works and keep poking or decide that some stuff has failed and write defects (light bulb). Chapter 2 of Exploratory Testing describes how JW defines exploratory testing: “Testers may interact with the application in whatever way they want and use the information the application provides to react, change course and generally explore the application’s functionality without restraint (16).” So far this is looking very similar.
Now that I’ve looked at how the exploratory analysis paradigm applies to testing, here’s how it applies to visualization. As an example visualization, I’m looking at a New York Times graphic, How Different Groups Spend their Day. When I open this graphic, I can see that it’s interactive, so I immediately slide my mouse across the screen. I notice the tool tips. Reading these gets me started reading the labels and eventually the description at the top. Then I start clicking. The boxes on the top right act as a filter. There is also a filter that engages when a particular layer is clicked.
Few’s point in describing directed analysis vs. exploratory analysis is that in the wild, when we look at visualizations, we use exploratory analysis. It’s not like I knew what I was going to see before I opened the visualization. Few describes the process known as “Shneiderman’s mantra” (for Ben Shneiderman of treemap fame) in more detail, saying that we make an overall assessment (eyeball), take a few specific actions (“?”), then reassess (eyeball). Although Few doesn’t say that there is a decision made at some point in this process, I’m assuming there is because of the light bulb in the picture (84).
Recently, Stephen Few asked for industry examples of people using visualization to do their work. Some of the replies were from the airline industry, a mail order warehouse and a medical center. Software engineers should be included in this mix and, judging from the treemap of Vista code complexity on page 130 of JW’s book, apparently already are. Given that both use the same form of exploratory analysis, I can see why.
Exploratory analysis of software testing and visualization diverge, however, when you look at the scale of data for which each is effective. Visualization requires a large dataset. This could be multiple runs of a set of tests or, as in JW’s example, analysis of large amounts of source code. Exploratory testing as JW describes can occur at a high level such as in the case of a visualization or at the level of an individual test.
One thing my exercise has shown me for sure is that I have to read more of Exploratory Testing.






