Marlena's Blog – Page 15 – Write — Test

The Observer Pattern


This pattern is also known as publish/subscribe. It’s mainly used for keeping objects updated. The object sending out updates is called the “Subject” and the objects receiving updates are called “Observers.”

As a one-to-many example, think of Little Red Riding Hood (LRRH…she recently converted to Scientology. It was on TMZ). She’s setting up Grandma’s RSS feed for the very first time.
Grandma only has 1 feed, from her favorite televangelist, Mozelle Patterson. Grandma only has one feed because after LRRH set it up for her, she’s probably not going to go looking for more.

Mozelle Patterson is a televangelist here in the ATL who has a show on a local channel every Saturday that can, at times, be more amusing than SNL. Let’s say Mozelle Patterson has an RSS feed. Her feed is the subject. All the grandmas who have had someone set up their RSS feed get the Mozelle Patterson feed. All of the grandmas, including LRRH’s, are the observers.

**subject**
-Mozelle

**observer**
-Little Red Riding Hood’s Grandma

**observer**
-Sleeping Beauty’s Grandma

**observer**
-Anansi’s Grandma

Notice that, previously, I said that this is the one-to-many example. Why? Because the next time LRRH visits, she notices that Grandma hasn’t added any more feeds, but she knows that Mozelle has a special feed specifically for the Gospel music featured on the show, and she adds that to Grandma’s reader. Now, Grandma has 2 different types of feeds from the same place. With more than one feed going out to more than one grandma, that’s many-to-many.

Here’s a class diagram of the 1-to-many example. Note: this is my example, and it’s not guaranteed to be perfect, so don’t take it as “gospel.”
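Here’s a rough sketch of the same one-to-many idea in Python. The class and method names (`Feed`, `Grandma`, `subscribe`, `publish`, `update`) are ones I made up for illustration, so treat it as one possible shape for the pattern and not the definitive one.

```python
class Feed:
    """The subject: keeps a list of observers and notifies each one."""

    def __init__(self, name):
        self.name = name
        self._observers = []

    def subscribe(self, observer):
        self._observers.append(observer)

    def unsubscribe(self, observer):
        self._observers.remove(observer)

    def publish(self, item):
        # One subject, many observers: every subscriber gets the same update.
        for observer in self._observers:
            observer.update(self.name, item)


class Grandma:
    """An observer: anything with an update() method would do."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def update(self, feed_name, item):
        self.inbox.append((feed_name, item))


mozelle = Feed("Mozelle Patterson")
grandmas = [
    Grandma("Little Red Riding Hood's Grandma"),
    Grandma("Sleeping Beauty's Grandma"),
    Grandma("Anansi's Grandma"),
]
for grandma in grandmas:
    mozelle.subscribe(grandma)

mozelle.publish("Saturday show")
for grandma in grandmas:
    print(grandma.name, grandma.inbox)
```

The subject only ever calls `update()`, which is all it needs to know about whoever is listening.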

Push/Pull
There are two types of Subject/Observer relationship. In the “Push” relationship, the Subject will send the observer everything it needs on notify. For example, if Mozelle’s RSS feed contained all of the highlights from the show with pictures, etc., this would be a push. If the feed only contained the title of the show, and Grandma had to click through to view the show highlights, then it would be a pull relationship.
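To make the difference concrete, here is a small Python sketch. Everything in it (`ShowFeed`, the `push` flag, `highlights_for`) is a hypothetical name I picked for the example. In push mode the notification carries the highlights; in pull mode it carries only the title and the observer goes back to the subject for the rest.

```python
class ShowFeed:
    """Subject that can notify in push mode or pull mode."""

    def __init__(self, push):
        self.push = push
        self._observers = []
        self._highlights = {}  # title -> highlights text

    def subscribe(self, observer):
        self._observers.append(observer)

    def highlights_for(self, title):
        # Pull-style observers call this after being notified.
        return self._highlights[title]

    def publish(self, title, highlights):
        self._highlights[title] = highlights
        for observer in self._observers:
            if self.push:
                # Push: send everything the observer needs.
                observer.update(self, title, highlights)
            else:
                # Pull: send only the title.
                observer.update(self, title)


class Grandma:
    def __init__(self):
        self.read = []

    def update(self, feed, title, highlights=None):
        if highlights is None:
            # The "click through": ask the subject for the details.
            highlights = feed.highlights_for(title)
        self.read.append((title, highlights))


push_grandma, pull_grandma = Grandma(), Grandma()
push_feed, pull_feed = ShowFeed(push=True), ShowFeed(push=False)
push_feed.subscribe(push_grandma)
pull_feed.subscribe(pull_grandma)

push_feed.publish("Saturday show", "Choir, sermon, pictures")
pull_feed.publish("Saturday show", "Choir, sermon, pictures")
print(push_grandma.read == pull_grandma.read)
```

Both grandmas end up with the same content. The only thing that changes is who does the fetching.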

Loose Coupling
This pattern is a great example of loose coupling, because the subject doesn’t know that much about its observers. The observers operate freely and independently of their updates from the Subject.

Design Pattern Series

I’m taking Design Patterns this Summer. At this point, I’ve coded enough Java to be able to say I can code Java. However, I also know that my code typically is not, ahem, polished. Take, for example, this past week. We’ve been going over publish/subscribe. Some may call it the “Observer” pattern. I call it the distributed message queue system I made for distributed systems class a year ago. Hindsight is always 20/20, right?

Design Patterns is all about DRY (don’t repeat yourself), that is, avoiding repetition. This is something that I’m going through even with the Automated Test Framework that I’m writing at work. It’s all Unix shell scripts. Since shell scripting was created when programming pretty much meant that you were programming the OS, the syntax itself wasn’t particularly intended to be OO (insert e-vil grin here)…but that doesn’t mean that it can’t be. At this point, I’ve got scripts everywhere, and they are settling into a pattern. I chose test patterns and test harness patterns as my research topic for the class. I’m waiting to see if, in the course of my research, I find myself looking in the test patterns mirror at my own automated test framework.

In the meantime, I will be making a few posts this Summer about Design Patterns. I’ll post my research assignment too. I’m so happy that my teacher is using Head First Design Patterns. I know that THE Design Patterns book is Design Patterns: Elements of Reusable Object-Oriented Software, typically known as the “Gang of Four” book, but the information in Head First books is always so much easier to remember. They remind me of The Best Calculus Teacher Ever, Dr. Kathleen Hall, who was considered somewhat unorthodox in using colors, YES! colors for describing calculus concepts on the blackboard. Every time I read a Head First book or find myself taking the derivative of something, I think fondly of Dr. Hall and her classes.

Testing in a Throwaway Culture


When was the last time that a prized possession of yours broke before its time? Did it make you angry and disappointed? Were you surprised or were you half-expecting it to break?

Craftsmanship is a word we no longer associate with many of the things that come into our possession. This was brought to my attention recently when I had to buy a new motor for my very pricey KitchenAid, Architect II dishwasher. As software quality professionals, we are all on the other side of this. How many tests were you able to run? How well did you really vet that app? Did you understand the app? How much of your testing went according to plan, however much planning you had? Did the plan really matter anyway?

Last week, a good friend of mine wrote about his frustration at not having enough time to execute tests because of other test activities such as shaking out requirements, managing others, etc. I kind of know how he feels because, as the test army of 1, I am responsible for many of the same activities. I’ve done all sorts of reports and activities that will pad my resume as a QA resource, but, in the end, this is not why I do the job that I do.

Here is a post from Chris McMahon’s blog that is, in contrast, ALL ABOUT why I am very content as a technical QA. The utter hack-itude of the exploits described in this post is exactly the domain of the tester I try to be every day. But then, I have the bug reports to fill out, the test planning to create and the inevitable smoothing over of dev ego. These things slowly but surely chip away at my day. My friend’s blog is a description of how, for more senior test professionals, it becomes their whole job, and my friend isn’t the only tester I’ve noticed lately bemoaning the strategy tasks that take up their time (you know who you are).

We live in a throwaway culture where breadth is valued way over and above depth, and it seems, to me, that this can heavily influence our careers, sometimes for better and sometimes for worse. This includes software development AND QA. I’ve worked in this type of environment, not as a tester but as a CM. I noticed that for every role, CM, tester or dev, as soon as people became technical experts at what they were doing, they were expected to start managing whether they wanted to or not, whether they made a good manager or not. Am I right or am I right? What’s missing here is an association of depth with value, both on the technical side and the management side.

What does this mean, specifically, for testing? What does it mean to be an expert craftsman in testing? Does it mean that I can switchblade an app with heuristics, any time, any place? Or does it mean that I will find a way to make some assessment of quality if given the most mountainous of systems to test in extremely adverse conditions? My personal goal is to work hard at both. I use test management activities mainly as a way to manage DRY (don’t repeat yourself) and to get on with the tests. It’s almost as if there is a sliding scale with test execution at one end and management of test activity at the other end. This seems a rather one-dimensional approach, and careers are not one-dimensional.

When you are asked to stop testing so much and to start managing more, what will you say? Are you ready to give up depth as a tester and increase your breadth as a manager? Is this really a one-dimensional issue? For some, and maybe even for me at some point, this can be a great decision. In some places, maybe there is less of a tradeoff than what I’ve seen. For some, participating more in the management process might mean better quality for an entire team. If the entire team improves, maybe the software will break less. If the team is testing KitchenAid dishwashers, maybe the dishwashers will break less, and I won’t have buyer’s remorse for my fancy kitchen appliances.

Love it? Hate it? Comments are always welcome.

The Game is Afoot: Abstract Accepted for PNSQC


Yesterday, my inbox had a lovely note from the PNSQC folks saying that my abstract is accepted. Immediate freaking out and some really bad cube-dancing commenced. I was already committed to turning out a really great thesis over this summer, and now I’ll be turning out a really great thesis that people will possibly read! The conference folks still have to approve my final paper, so my presentation will be pending their approval. I was looking a little more thoroughly at the conference web-site. It turns out that even if your paper is accepted, your peers will GRADE your prezo with a red, yellow or green card. Jeez!

All freaking out aside, what a great validation of my topic. Any tips for presenting or paper writing? God, I love Portland!

Picasso Ate My Metrics Paper: Visualizing Software Metrics with Treemaps

This past semester, one of the classes I took was a class about Software Metrics. I was required to write a paper, so I wrote about visualizing software metrics. All in all, it was a pretty intense semester. I’ve been settling on a master’s thesis topic, which you can read a little about in my previous post. I’ve written lots of Processing code and been reading through Edward Tufte’s books. I attended his seminar and gave my own seminar, at work, about Data Visualization.

I guess my artistic side broke through this past semester and demanded my full attention. Reading through my posts, you would never know that there was a time in my life when I did lots of painting and drawing. The painting below is a reproduction of a Picasso I painted for my mom during this period. Last week, I finally broke down and ordered the Adobe Design Premium suite, which includes Illustrator and Flash. Yay for educational discounts!

[Image: picassodemom]

This semester has been all about my artistic impulses and my obsession with technology having a full-on, stay-awake-late, grab-the-bull-by-the-paintbrush collision. The days when I was writing about visualization and software for job, school and pleasure all at the same time made me smile and think, “it’s good to be me today.” I know that most people don’t get even 1 day of that and it means that I am finally, after 9 years of higher education, going the right way with my studies.

I’m posting my paper here. It seemed prudent, before delving into research on visualizing test data, to see what was already out there. What I found was an emphasis on craziness, a disregard for human-computer interaction and any principles of data visualization. The glaring exception was the body of work on Treemaps. If you look at the references of the original paper on Treemaps (I can’t post it because of ACM), you will find a reference to The Visual Display of Quantitative Information. For this reason, I’ve pretty much stuck with Treemaps in my research.

Are treemaps useful for visualizing the quality of a software system? I will be working over the next few months to answer that question, and smiling a lot in the process.

Submitted an Abstract to PNSQC

I’m posting the abstract I just submitted to PNSQC. It’s also the abstract of the thesis I’m writing for my Masters. I’ve submitted a poster to the Grace Hopper Conference, but never before have I submitted a full-on paper requiring a full-on presentation. I chose PNSQC for 2 reasons: the focus is more on the practical side, unlike some of the ACM conferences, and the conference is in Portland, Oregon. God, I love Portland.

Anyway, here is what I submitted:

Visualizing Software Quality

Moving quality forward will require better methods of assessing quality more quickly for large software systems. Assessing how much to test a software application is consistently a challenge for software testers especially when requirements are less than clear and deadlines are constrained.

For my graduate research and my job as a software tester, I have been looking at how visualization can benefit software testing. In assessing the quality of large-scale software systems, data visualization can be used as an aid. Visualizations can show complexity in a system, coverage of system or unit tests, where tests are passing vs. failing and which areas of a system contain the most frequent and severe defects.

In order to create visualizations for testing with a high level of utility and trustworthiness, I studied the principles of good data visualizations vs. visualizations with compromised integrity. Reading about these led me to change some of the graphs that I had been using for my QA assessment and to adopt newer types of visualizations such as treemaps to show me where I should be testing and which areas of source code are more likely to have defects.

This paper will describe the principles of visualization I have been using, the visualizations I have created and how they are used as well as anecdotal evidence of their effectiveness for testing.

First Attempt at Visualizing Tests and Defects

This post is about a visualization I created to show test execution status with related defects using data obtained from HP Quality Center. I’m using Excel to create this chart and deliberately stayed away from “fancy tricks.” If you want to recreate this, some steps will be different if you don’t use Quality Center or Excel. In that case, you get to figure it out.

My weekly status meeting drove me to create this chart. Every week, I sit in this meeting with some pretty important people. Whenever I’m in a testing phase, I have to show what it is I’ve been working on. Previously, I’ve used the report templates from Quality Center, but they really are crap. Wait, that’s not big enough…They really are C-R-A-P. Not only is it difficult to jostle the correct data into place, but they are ugly and none of my superiors actually trusts the Quality Center visual. Since I’ve been doing all this reading about data viz, I realized that I could easily create something much better using Excel in less time.

There are several steps to creating any visualization, and I’ll break down the creation of this chart into the following steps. They are from Ben Fry’s excellent book, Visualizing Data: Exploring and Explaining Data with the Processing Environment.
Acquire – getting your data
Parse – structuring and sorting your data
Filter – remove stuff you don’t need from your data
Mine – apply mathemagic in the form of statistics, data mining, whatever to show patterns in your data
Represent – choose the type of graph or chart you will be using
Refine – polish your chart so that it has clarity and makes people want to look at it
Interact – add ways for the user to explore your chart

Acquire
Looking at your test set in Quality Center’s test lab, right click, and QC will show you an option at the bottom for “save as.” Click this and choose Excel sheet. HTML would work too, since Excel can parse through QC’s HTML.

Parse
For this chart, I like to create 3 groupings: passed, failed and no run. You can have other groupings, but you probably want to make sure that the “other” groupings are together. In Excel, I sort the data by status into the 3 groups. If you are using a test set from the past, you might only have 2 groupings.
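If you’d rather script the grouping than sort by hand in Excel, a sketch along these lines might do it. The rows and the “Blocked” status are invented for the example, and I’m assuming the export has already been read into a list of dictionaries with “Test Name” and “Status” keys.

```python
# Rank the three main groups; anything else shares one rank so the
# "other" statuses stay together at the end.
STATUS_RANK = {"Passed": 0, "Failed": 1, "No Run": 2}
OTHER_RANK = len(STATUS_RANK)


def sort_by_status(rows):
    """Return rows grouped as Passed, Failed, No Run, then everything else."""
    # sorted() is stable, so tests keep their original order within a group.
    return sorted(rows, key=lambda row: STATUS_RANK.get(row["Status"], OTHER_RANK))


# Hypothetical test set.
rows = [
    {"Test Name": "Login rejects bad password", "Status": "Failed"},
    {"Test Name": "Export report", "Status": "No Run"},
    {"Test Name": "Login accepts good password", "Status": "Passed"},
    {"Test Name": "Import legacy file", "Status": "Blocked"},
    {"Test Name": "Logout clears session", "Status": "Passed"},
]

for row in sort_by_status(rows):
    print(f'{row["Status"]:<8} {row["Test Name"]}')
```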

Filter
This will change depending on how large your team is and how the work is divided. Since I’m the only tester, we all know who executed the tests in my chart. I remove the columns for Attachment, Planned Host, Responsible Tester, Execution Date, Time and Subject. This leaves only 2 columns, Test Name and Status.
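The same filtering could be sketched in Python as keeping only the columns you want. The column names come from my export; the sample values are made up, and your export may use different headers.

```python
KEEP = ("Test Name", "Status")


def keep_columns(rows, keep=KEEP):
    """Drop every column except the ones named in keep."""
    return [{column: row[column] for column in keep} for row in rows]


# One hypothetical exported row with all the columns I throw away.
exported = [
    {
        "Test Name": "Login rejects bad password",
        "Status": "Failed",
        "Attachment": "",
        "Planned Host": "",
        "Responsible Tester": "me",
        "Execution Date": "2009-03-02",
        "Time": "10:15",
        "Subject": "Login",
    },
]

print(keep_columns(exported))
```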

Mine
No mining in this one, it’s just straight up data.

Represent
This chart will be using a stacked bar format which looks similar to a stemplot, but isn’t really a stemplot.

Refine
Here’s where the magic is really happening. See that column for status? Add a column between “Test Name” and “Status.” Each cell in the new column gets a color according to that test’s status: Passed=green, Failed=red, No Run and anything else is gray. If you use the basic colors in Excel, your colors will be too bright. In Excel 2003 you can change this by navigating to Tools>Options>Colors. Here you can choose some less saturated versions of red and green. You’ll still need the very bright red, so don’t save another color over it.

Use fill color to change the colors in the empty column next to the status column. Now you can see each test and you can see how much is passed and how much is failed. Since the status is now indicated by color, you can get rid of the column with the status as text. If you’d rather keep it, you could have one column with the fill and the font color set the same. This would mask the text of the status inside the cell.
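The color rule itself is simple enough to write down as a lookup. Here is a minimal sketch; the hex values are just my guesses at “less saturated” red and green, not the exact colors in my chart.

```python
# Assumed, illustrative colors: muted green and red, plus a neutral gray.
MUTED_GREEN = "#7FB77E"
MUTED_RED = "#C97064"
GRAY = "#BFBFBF"

STATUS_COLORS = {"Passed": MUTED_GREEN, "Failed": MUTED_RED}


def fill_color(status):
    """Passed is green, Failed is red, No Run and anything else is gray."""
    return STATUS_COLORS.get(status, GRAY)


for status in ("Passed", "Failed", "No Run", "Blocked"):
    print(status, fill_color(status))
```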

Gridlines can be very distracting, so clear all of them. To do this, navigate to Tools>Options>View>Window options and clear the Gridlines check box. Doing this should immediately relax your eyes when you look at the chart.

To make it obvious which test has which status, right justify the column for test name.

If you want to show defects on your chart, you can do this using color. Remember where I said not to save over the bright red? If you have defects that you need to show on the chart, change the color of the related test case to bright red and add the I.D. and title of the defect to the column to the right of the status. This works when you typically have 1 defect per test case. If you have more, I would just color that test bright red, and have a list of the defects elsewhere. Showing defects this way highlights the need for accurate and concise defect descriptions. I was reading in How We Test Software at Microsoft that their testers work very hard at creating good defect descriptions. In fact, a friend of mine had a great post that received excellent comments about this. Here is where the defect description can make a big difference. Ditto the test names. Everyone in the status meeting sees these and asks questions.

Edward Tufte would probably say that I should print this chart out on a really big piece of paper, but he doesn’t work in my group. My status meeting is paperless, so this will be displayed on a big fancy conference room flat screen, ‘cause that’s how we roll. Actually, I’m not joking about the rolling part. The person who leads the meeting has a tendency to compulsively scroll up and down, so I designed my chart to be displayed within the width of his laptop screen. That’s why the summary is out to the side. Also, I made the text of the title and the summary much larger than the font of the individual tests. I’m using Trebuchet MS for the font.

Interact
At the most basic level, this chart allows the user to interact with the chart by focusing on the smaller, individual tests if they choose to do that. They can also look at the defect information if they would rather. As an improvement, I might figure out how to add some information for each test when it is moused over.

This chart probably won’t scale if you have multiple testers, multiple projects or copious amounts of tests and defects. I know that’s most testers, but I really made this chart specifically for my needs. This shows the very busy and important people what they need to see in a way that they can trust.

As always, comments are welcome. Love it, hate it? Let me know. I’m still learning about this stuff and welcome the feedback.

Integrity in Data Visualization: Part 2 of 2

In the second part of this series about data visualization (Part 1 is here), I will show how, according to Edward Tufte, information visualizations can trick the viewer. Since I read through this chapter in his book, The Visual Display of Quantitative Information, I’ve noticed compromises of graphical integrity in the most surprising of places. Some of these are included in my post as examples.

Currently, I’m writing a paper for my metrics class on software visualization. After the class, it will be expanded to include how software visualization can work for testers. Part of knowing how to use visualization for any project is understanding what constitutes a good or bad visualization. I’m guessing that if you found my blog you might be a tester, and in that case, understanding the basics of data visualization will help you understand where I’m going with some of my upcoming posts. Outside of testing, understanding complex visualization is a skill we all need to have because we live in an age of data.

Labeling should be used extensively to dispel any ambiguities in the data. Explanations should be included for the data. Events happening within the data should be labeled.

For this example, I’d like to use a graphic that was in a post on TechCrunch as I was in the process of writing this post. It was posted by Vic Gundotra who is VP of Engineering for Mobile and Developer Products at Google. You can read his full post here. I’m calling out a couple of his graphics because I was pretty shocked that he would post these. I’ve seen some of his presentations on YouTube, and they were awesome presentations. Using graphics as bad as what he posted on TechCrunch only diminishes an otherwise strong message and will make me think twice about any visuals he presents in the future.

Here is the first of Vic Gundotra’s graphics. Notice how he does not label the totals on the graph, but separately at the bottom. There is no way a viewer can compare the data he’s representing.

In time-series graphics involving money, monetary units should be standardized to account for inflation.

Ok, this might be a controversial example but please read the explanation before flaming me. I freaking love this award-winning interactive graphic about movies created by the storied New York Times graphics department…BUT I have a bone to pick with it. The graphic shows movies from 1986 to 2008 and it does not account for inflation. There are some other things going on with this graphic as well, but since it’s about movies and not our budget crisis, I give its lack of adjustment for inflation a “meh.” It just goes to show that nothing is perfect.
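For what it’s worth, adjusting for inflation usually comes down to scaling each amount by a price index. Here is a minimal sketch; the index values and the dollar amount are invented for illustration, so plug in a real index (such as a consumer price index) if you try this.

```python
def adjust_for_inflation(nominal, index_then, index_base):
    """Express a past amount in base-year money using a price index."""
    return nominal * index_base / index_then


# Made-up index values: prices doubled between "then" and the base year.
index_then = 110.0
index_base = 220.0

# A hypothetical 100 million gross from the earlier year, in base-year money.
print(adjust_for_inflation(100.0, index_then, index_base))
```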

All graphics must contain a context for the data they represent.

Here’s another Vic Gundotra graphic. Notice how there is no total for the number of users represented, yet Gundotra is trying to say that 20 times more are using the T-Mobile G1. By not including this number, Gundotra is not providing an accurate picture of how many people are using either. It could be 20 people using the G1 or it could be 20,000. There’s no way to know. The fact that he didn’t include this number says, to me, that maybe the number of people is embarrassingly small, but that’s another post for another blog.

Numbers represented graphically should be proportional to the numeric quantities they represent.

This is all about scale in graphics. If you are looking at a graphic, the pieces you are looking at should be to scale. Tufte actually has a formula for what he calls “The Lie Factor.” This link has a couple of illustrations and also shows how this formula is calculated.
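As I understand it, the Lie Factor is the size of the effect shown in the graphic divided by the size of the effect in the data, where each “size of effect” is a relative change. Here is a small sketch of that calculation. The numbers are the ones I recall from Tufte’s fuel-economy example (a line growing from 0.6 to 5.3 inches to represent a change from 18 to 27.5 miles per gallon), so double-check them against the book.

```python
def relative_change(first, second):
    """Size of effect: change between two values relative to the first."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data."""
    return relative_change(graphic_first, graphic_second) / relative_change(
        data_first, data_second
    )


# Line length in inches vs. miles per gallon, as recalled from Tufte.
factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(round(factor, 1))  # 14.8
```

A graphic that is true to scale comes out at 1. The further the result drifts from 1, the more the picture exaggerates or understates the data.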

Number of dimensions carrying information should not exceed dimensions in the data.

You know all of those 3D pie chart and bar graph templates in the Microsoft Excel chart wizard? Don’t use them anymore, and yes, I’ve used them myself in the not-too-distant past. They qualify as “chart junk” from Tufte’s perspective.

Variation should be shown for data only and not the design of the graphic.

I looked but couldn’t find a good example of this on the web. (If you see one let me know.) One of the graphs that Tufte uses to illustrate this point shows a bar graph where the bars for years that are deemed, “more relevant,” are popped out in a separate larger section using a really heinous 3d effect. It’s on page 63.

As always, comments are welcome.

Bug Titles: Points to Remember

When creating a defect report, the title can be very important as it is sometimes the only part of a defect the developer will genuinely process. Not only that, but in Quality Center, the title of a defect is what gets emailed around. In his book, How We Test Software at Microsoft, Alan Page has some pointers regarding what to title a defect report.

He reiterates the importance of a good title and says that, when scanned as a list, bug titles can form an overall picture of a system’s defects. Some of the particulars in creating a good title include limiting the number of characters to about 80. That’s a little more than half of what you get in Twitter. Apart from that, you need to walk the fine line of being descriptive, but not overly descriptive.

His example of a good bug title is, “Program crash in settings dialog box under low-memory conditions.”
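If you wanted to check titles automatically, something like this sketch would catch the length problem. The function name and messages are my own invention, and the 80-character limit is the “about 80” guideline above, not a hard rule from any tool.

```python
MAX_TITLE_LENGTH = 80  # the "about 80" guideline


def title_problems(title, max_length=MAX_TITLE_LENGTH):
    """Return a list of complaints about a bug title; empty means it passes."""
    problems = []
    stripped = title.strip()
    if not stripped:
        problems.append("title is empty")
    if len(stripped) > max_length:
        problems.append(
            f"title is {len(stripped)} characters; aim for about {max_length}"
        )
    return problems


good = "Program crash in settings dialog box under low-memory conditions."
print(title_problems(good))
print(title_problems("x" * 120))
```

Length is the only thing a script like this can judge. Whether the title is descriptive enough still takes a human.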

The description is for any notes that don’t fit in the title. So, if you are including actual and expected behavior, that would go in the description and not the title.

What is Data Visualization: Part 1 of 2 Characteristics of Excellent Visualizations

In this post, I will be answering the question, “what is data visualization” and writing about some of Edward Tufte’s principles for “excellent” data visualizations. This can be an aid in creating better graphs or in looking at graphs. In subsequent posts, I will relate these fundamental principles to visualizations for use in software testing.

In his first book, The Visual Display of Quantitative Information, Tufte outlines several principles for use in the creation and interpretation of quantitative graphics. If you get the chance, I highly recommend flipping through it. If you have questions about the statistics concepts, you might want to look at Head First Statistics by Dawn Griffiths. I’ve been hitting this book up regularly, especially for the metrics class I’m currently taking.

In the comments of my post “Exploring Data Visualization,” Eric asked me, “what is data visualization?” When I say data visualization I’m talking about a graphical depiction of statistical information that tells a story. These depictions can be simple or more complex, and they all have a point they are trying to make. According to Edward Tufte an excellent visualization expresses “complex ideas communicated with clarity, precision and efficiency,” (13).

To illustrate this, have a look at one of my favorite interactive web graphics, “A Year of Heavy Losses,” from The New York Times. It illustrates the change in market capitalization of banks from 2007 to 2008. Be sure you click on the square at the top left to see the change. You can see not only the number of banks dwindling, but also their capitalization in the market. You can also mouse over each bank to see more granular data.

According to Tufte, these are some characteristics of excellent visualizations:
1. Lots of numbers packed into a tiny space
2. Data represented is not distorted
3. Extremely large data sets have coherency
4. Comparison between different pieces of data is easy
5. Data is revealed at a micro level and at a macro level
6. The data’s purpose is clear
7. Integration between the statistical and verbal descriptions of the data is tight

Here is an illustration Tufte uses as an example:

It is a French train schedule from the 1880s. Take some time to look at it and understand it, then look back at the characteristics I have just listed. Did you notice how the cities on the left are not listed at regular intervals? This is because Marey spaced them apart proportionately to their actual distance from each other. Since he did that, when you look at the slope of the lines, you are not only seeing arrival and departure times, but also the relative speed with which the train will get you from one place to the next. If you depend on trains to get you from one place to the next, this can be very important information.
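The slope idea is just distance divided by time. Here is a tiny sketch with made-up legs (the distances and times are not read off Marey’s chart): with stations spaced by real distance on one axis and time on the other, a steeper line means a faster train.

```python
def average_speed_kmh(distance_km, depart_hour, arrive_hour):
    """Slope of the line on the chart: distance covered per hour."""
    return distance_km / (arrive_hour - depart_hour)


# Two hypothetical trains covering the same 100 km stretch.
express = average_speed_kmh(100.0, 6.0, 8.0)   # leaves 6:00, arrives 8:00
local = average_speed_kmh(100.0, 6.0, 10.0)    # leaves 6:00, arrives 10:00

print(express, local)
print("express is steeper" if express > local else "local is steeper")
```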

This graphic also illustrates the concept of multivariate data which, according to Tufte, is also a quality of excellent visualizations. I’m going to break out what’s in the train illustration into univariate, bivariate and multivariate data. If I miss something, just add a comment.

Let’s start with the concept behind this illustration. It’s depicting arrival times and departure times of trains in France. It shows the route the trains take, and the relative speed with which they make it from one station to the next.

Univariate data shows the frequency/probability of one variable.
Some univariate data from this graphic: the number of trains arriving or departing a station. The number of trains arriving at stations at any one time. The number of arrivals at a station each day. The number of departures from Chagny station each day. Each of the variables I have described is a frequency (Head First Statistics 609).

Bivariate data shows 2 values for an observation.
Bivariate data from this graphic:
(x) Time of day
(y) Number of trains arriving/departing at Chagny station

For this observation you need two variables (Head First Statistics 610).

Multivariate data shows multiple values for an observation.
If we take the observation from the bivariate data example and add stations, the observation becomes multivariate and is what you see in Marey’s illustration.
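Here is how the three levels might look as counts in Python. The arrival records are invented (a station name and an hour of the day), so this only illustrates the idea and is not data from Marey’s chart.

```python
from collections import Counter

# Hypothetical arrival records: (station, hour of day).
arrivals = [
    ("Chagny", 6),
    ("Chagny", 6),
    ("Chagny", 14),
    ("Dijon", 6),
    ("Dijon", 9),
]

# Univariate: one variable, counted. How many arrivals per station?
per_station = Counter(station for station, _hour in arrivals)

# Bivariate: time of day (x) against number of arrivals at Chagny (y).
chagny_by_hour = Counter(hour for station, hour in arrivals if station == "Chagny")

# Multivariate: add the stations back in, so each count has a station and an hour.
by_station_and_hour = Counter(arrivals)

print(per_station)
print(chagny_by_hour)
print(by_station_and_hour)
```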

I’ve just covered a lot of material and I hope it gives you a good idea of what data visualization and the field of information visualization are all about. In my next post, I’ll be covering the ways in which graphs can lie. I’ve seen this happen at work and just completed a reading assignment for school where it was also an issue. These are complicated topics that software engineers should understand if they are to use visualization in ways such as a tester’s heads-up display.

Questions and comments are always welcome.