The Problems With Acceptance Testing
February 26, 2010
Gojko Adzic wrote to ask about my experiences with acceptance testing. For those who don't know, I was the coordinator of the Fit project for a while and co-author of the C# version. Fit is Ward Cunningham's brainchild and (as far as I know) inspired a lot of the current Agile acceptance testing tools, such as Selenium, Cucumber, and FitNesse. FitNesse, in particular, started out as a fork of Fit.
When Ward first introduced me to Fit, I was very excited about the possibilities. I embraced it wholeheartedly (as evidenced by my participation in the project) and introduced it to my consulting clients. Over the years, my enthusiasm dimmed. It just didn't work as well in practice as it was supposed to. Here's what I had to say to Gojko:
My experience with Fit and other agile acceptance testing tools is that they cost more than they're worth. There's a lot of value in getting concrete examples from real customers and business experts; not so much value in running those examples through "natural language" tools like Fit.
The intended benefit of these tools is that customers write the examples themselves, improving communication between customers and programmers. In practice, I found that customers (a) weren't interested in doing so, and (b) often couldn't understand, and didn't trust, tests written by others. Typically, responsibility for the tests gets handed off to testers, which defeats the whole point.
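For readers who've never seen Fit, here's a minimal sketch along the lines of the classic division example from the Fit documentation. The customer writes a table (HTML in Fit, wiki markup in FitNesse); the programmer writes a fixture class that Fit binds to the table's column headers by name:

```java
// The table, as the customer would write it (shown here as a comment):
//
//   eg.Division
//   numerator | denominator | quotient()
//   10        | 2           | 5
//   12.6      | 3           | 4.2
//
// Plain headers are inputs; "quotient()" marks a computed column
// that Fit checks against the expected value in each row.

package eg;

import fit.ColumnFixture;

// The programmer-written fixture. Fit uses reflection to set the
// public fields from each row, then calls quotient() and compares
// the result to the table's expected value.
public class Division extends ColumnFixture {
    public double numerator;
    public double denominator;

    public double quotient() {
        return numerator / denominator;
    }
}
```

When the table runs, Fit colors each expected cell green or red depending on whether the fixture's answer matched. Note that the table and the fixture are coupled purely by matching names.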
Furthermore, acceptance testing tools are almost invariably used to create end-to-end integration tests, which are slow and brittle. Fit works best for targeted tests that describe the domain, but that's not how it's used in practice. Also, tools like Fit don't work with refactoring tools: the tables reference fixture fields and methods by plain strings, so renaming anything breaks every table that mentions it, and nothing updates the tables automatically. Once you have a lot of tests, they become a real maintenance burden.
These two problems--that customers don't participate, which eliminates the purpose of acceptance testing, and that the tests create a significant maintenance burden--mean that acceptance testing isn't worth the cost. I no longer use it or recommend it.
Instead, I involve business experts (on-site customers) closely throughout the iteration. I do have them create concrete examples, but only when faced with particularly complex topics, and never with the intention that they be run by a tool like Fit. Instead, the programmers use those examples to inform their test-driven development, which may or may not involve creating automated tests from the examples, at the programmers' discretion.
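To make that concrete, here's a hypothetical sketch (the domain, numbers, and class names are all invented): suppose the on-site customer offers the example "orders of $50 or more ship free; a $49.99 order pays a flat $5." Rather than feeding that example to a tool like Fit, the programmer folds it directly into ordinary test-driven development:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical customer example captured as plain JUnit tests:
// "orders of $50 or more ship free; a $49.99 order pays a flat $5."
public class ShippingChargeTest {
    @Test
    public void ordersOfFiftyDollarsOrMoreShipFree() {
        assertEquals(0.00, ShippingCharge.forOrderTotal(50.00), 0.001);
    }

    @Test
    public void ordersUnderFiftyDollarsPayFlatRate() {
        assertEquals(5.00, ShippingCharge.forOrderTotal(49.99), 0.001);
    }
}

// Invented production code, just enough to make the sketch run.
class ShippingCharge {
    static double forOrderTotal(double orderTotal) {
        return orderTotal >= 50.00 ? 0.00 : 5.00;
    }
}
```

The example keeps its value as a communication tool, but it lives in the programmers' test suite, where refactoring tools understand it and it runs at unit-test speed.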
I also have the on-site customers conduct ad-hoc reviews of the software as programmers complete it. They pair with a programmer, look at what's been created, and ask for changes, which the programmer implements on the spot.
Further Reading
If you'd like to know more, I highly recommend listening to Scott Hanselman's "Is Fit Dead?" podcast, in which Scott interviewed Ward and me together. We talked about our goals for Fit, what actually happened, and the passing of the torch. I'm very happy with how this interview turned out.
That podcast kicked off an excellent discussion about Fit on Twitter. Eric Lefevre-Ardant wrote a great summary.
Gojko Adzic, George Dinwiddie, and Ron Jeffries each posted thoughtful reactions to this essay on their blogs.
I've posted a follow-up essay describing Alternatives to Acceptance Testing.