A phrase I hear a lot around Mozilla is “continuous deployment.” I hear there’s this product Mozilla makes that’s competing with some other product that has rapid release cycles. So, yeah, we’re working on continuous deployment.
I’ve noticed that a main resource around our office for information about continuous deployment is this video from Etsy. Hearing, “We’re moving to continuous deployment,” is nothing new for me. This is the second job I’ve had where it’s been a major focus. Since I’ve heard of the Flickr version, I decided to watch this Etsy video.
Picture yourself at your computer about to hit the big button and deploy a feature you’ve been working on. You are fairly confident that nothing catastrophic will happen, but you don’t know. (I’m writing this from a dev perspective, but even if you’re a tester…come on…you never know, even if you’ve tested the hell out of something.) The talk frequently refers to this as “the fear,” or sometimes “THE FEAR.”
“Fear for startups is the biggest no-no.”
“Fear is what keeps you from deleting your database.”
“Fear doesn’t go with creative work.”
This rings true for me because I frequently deploy Selenium tests for addons.mozilla.org. My teammates and I have talked about “THE FEAR.” We have strategies for coping with it, such as holding one’s breath, saying a prayer or running the 90+ tests one more time. When Etsy talks about “The Fear” I know exactly what they mean.
Etsy’s video fascinates me because of how they have conquered “The Fear.” It’s been on my mind every day since I watched the video. What’s the special-continuous-deployment-sekrit-sauce-that-makes-everything-all-better?
Etsy combats “the fear” with visibility. You see, at Etsy, EVERYTHING IS GRAPHED ALL THE TIME.
Here are some of the things they mentioned graphing in the video:
How many visitors are using this thing?
Can we deploy that to 100%?
Did we make it faster?
Did I just break something?
How long is it taking to generate a page?
How many users are logged in?
How is the bandwidth?
What’s the database load?
What’s the requests per second?
If you look at the graphs, they are simple bar or line graphs. They are not exceptionally fancy, but they are numerous, and the maintenance admittedly takes work. They are not, however, maintained by specialists working in a silo. The graphs are created by the engineers themselves. Here are some numbers:
20,000 lines/second is their log traffic, at times
16,000 is the number of “metrics” they have organized through dashboards
25 engineers committing code to dashboards
I doubt that when Etsy decided to start graphing everything they woke up one day with 25 dashboards. It sounded very much like they put the tools in the developers’ hands and lovingly nudged them along.
This is a serious commitment to data.
Data doesn’t just happen. It takes a persistent effort to include log messages in your code. It takes servers and databases capable of handling the traffic created by the log messages and staff to maintain them. It takes investing in huge monitors all around the office and giving people the bandwidth to figure out how to work with the data & graphics stack. Most importantly, it takes trust so that employees are allowed to see the data without making them jump through hoops.
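To make “include log messages in your code” a little more concrete, here is a minimal sketch of what one log line per user action could look like. This is not Etsy’s code. The logger name `events`, the `log_event` helper and the field names are all hypothetical, and it only uses Python’s standard library.

```python
import json
import logging
import time

# Hypothetical logger name; nothing in the talk prescribes it.
event_log = logging.getLogger("events")


def log_event(name, **fields):
    """Emit one JSON log line for a user action and return that line."""
    record = {"event": name, "ts": time.time(), **fields}
    line = json.dumps(record, sort_keys=True)
    event_log.info(line)
    return line


if __name__ == "__main__":
    # Print bare messages to the console so each line is valid JSON.
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    log_event("download", addon_id=1234)        # made-up field
    log_event("fullscreen", page="/details")    # made-up field
```

One JSON object per line is just one reasonable choice here; it keeps the messages easy to parse later when someone wants to turn them into a graph.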
So how can a team move closer to the graphing part of continuous deployment?
According to Etsy:
- Give people access to production data — without making them wait months for a special password or even log in every time.
- Make the data real time instead of daily. When I say access, I mean feeds. This goes well beyond a spreadsheet.
- Create copious amounts of log messages. If someone clicks a link, goes full screen or downloads something…log it.
- Once you have the data, make graphs for features before you release them.
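As a rough illustration of going from log messages to something graph-able, the sketch below counts one kind of event per minute and prints a crude text bar graph. It assumes events arrive as `(unix_timestamp, event_name)` pairs, which is my own simplification and not something from the video. The function names are hypothetical too.

```python
from collections import Counter


def counts_per_minute(events, name):
    """Count occurrences of one event per minute bucket.

    `events` is an iterable of (unix_timestamp, event_name) pairs.
    Returns a dict mapping each minute's start (unix seconds) to a count.
    """
    buckets = Counter()
    for ts, event_name in events:
        if event_name == name:
            buckets[int(ts // 60) * 60] += 1
    return dict(sorted(buckets.items()))


def text_graph(series):
    """Render a series as a crude bar graph, one row per bucket."""
    return "\n".join(
        f"{minute:>6} {'#' * count}" for minute, count in series.items()
    )


if __name__ == "__main__":
    sample = [(0, "download"), (30, "download"), (61, "click"), (75, "download")]
    print(text_graph(counts_per_minute(sample, "download")))
```

A real setup would feed a dashboard instead of printing hashes, but the shape is the same: filter, bucket by time, draw. Building this for a feature before it ships means the graph is already there when you hit the button.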
I love data, but will be the first to admit that it is not pretty. The plain truth about data is that it takes patience because combing through and refining it can be tedious, monotonous work. It is very easy to buy a bunch of monitors and put them on a wall showing an inst-o-matic graph that came with your bug tracker (I’ve seen this done. O hai, expensive wallpaper!). It takes more time to ask deeper, meaningful questions. It takes even more time to filter the data into something graph-able. After that, you have to find the right way to share it. Note that even if you do all of this and the data successfully tells a story, you’ll have to spend time dealing with, “And why did you use those colors?” What was I saying? Oh yes, data is not pretty.
Now that I’m working every day with tests I visualized a couple of years ago, I’m continuing my quest for deeper questions about tests. In my context, the tests are the Selenium tests I work with day in and day out, so besides coming to grips with “THE FEAR,” I’ve also been thinking about “THE FAIL.” But wait! That’s another blog post.
If you want to read more about Etsy’s graphs and data, they have written their own post about it.