source link: https://qntm.org/test

Things I Have Learned About Software Testing

2011-08-29 by qntm

Firstly, it's a thing. I didn't even realise this at first.

I was bred on a diet of console videogames and I knew in my soul that playing games was the best thing ever. Being paid to play games was obviously even better than the best thing ever. Being a professional videogame tester was, logically, the greatest profession possible. You get to play great games before anybody else, right? And you get paid for it! There was no lightning flash when I realised that being a professional videogame tester means playing great games before they are actually great (before, in fact, they can even be described as "games", or "playable"). It was just a realisation that I eventually came around to. If you need a lightning bolt, Here's Your Reality Program. Videogame testing is tiresome and unfun. Any videogame becomes tiresome and unfun to test after ten months.

Later, when I started coding as a hobby, my typical development process was to write some code, run the program and see if it failed in the expected way, write some more code, run it again, and finally write the last bit of code and see if the program succeeded. I'd never built a computer program that was too large for me to contain its entire complexity in my head. If something went wrong, I hunted down the bug and fixed it and ran the program again and it worked. It didn't occur to me that:

  1. a piece of software can be so large that a glance at its output isn't a firm assurance that it's working
  2. a piece of software can be so large that it can't be exhaustively tested
  3. a piece of software can be so complex that it has defects that the developers can't immediately fix, or even immediately understand
  4. a piece of software can have so many defects that there isn't time to fix all of them.

And here's a second lightning bolt (and this actually was a real revelation for me). Manual testing via the user interface is the tip of the iceberg. Testing is in general too complex, too deep, too large a task and too fiddly to be performed manually (except where absolutely necessary: GUI testing, usability testing, translation/localisation/internationalisation). It can and must be automated as far as possible (but no further). Automated testing provides an assurance that the tests are being performed correctly and that the results - pass or fail - are reliable. It lets large collections of subtly distinct behaviours be checked quickly and without uncertainty.
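
To make that last point concrete, here is a minimal sketch of what checking a pile of subtly distinct behaviours looks like once it's automated. The function and its cases are invented for this example, not taken from the article; the shape is what matters: every case is written down once, and the framework reports each one's pass or fail separately, every time it runs.

    import pytest

    def format_file_size(size_in_bytes: int) -> str:
        """Stand-in function under test, invented for this example:
        formats a byte count for display, e.g. 1536 -> '1.5 KB'."""
        units = ["B", "KB", "MB", "GB", "TB"]
        size = float(size_in_bytes)
        for unit in units:
            if size < 1024 or unit == units[-1]:
                return f"{int(size)} {unit}" if unit == "B" else f"{size:.1f} {unit}"
            size /= 1024

    # Each case is a subtly different behaviour: zero, boundaries, rounding,
    # very large inputs. Checking these by hand after every build would be
    # slow and error-prone; here they run in milliseconds.
    @pytest.mark.parametrize("size_in_bytes, expected", [
        (0, "0 B"),
        (1023, "1023 B"),
        (1024, "1.0 KB"),
        (1536, "1.5 KB"),
        (1024 * 1024, "1.0 MB"),
        (1024 ** 4, "1.0 TB"),
    ])
    def test_format_file_size(size_in_bytes, expected):
        assert format_file_size(size_in_bytes) == expected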

There can be a lack of appreciation for the software tester.

The preconception that software testing doesn't actually exist as a separate role/undertaking from software development - or sometimes, the lack of any conception that software testing exists as a concrete concept - is real. I occasionally hear stories about organisations which don't treat testing as significant (these are usually quite small organisations; it's difficult to become larger without testing the thing you're releasing). More frequent is a situation where testers are kept in a separate box from the developers, with testing treated as a totally separate concern from development, to be performed after all development is nominally completed. Even if testing is taken seriously, testers can be regarded as second-class citizens to developers. (Although, spare a thought for the documentation people, who typically exist at rung three on this ladder.) This is predominantly due to the following single, basic, unfortunate fact:

Software without testing is still valuable. Tests without software are worthless.

Specifically, good software is still good software, even in the absence of robust testing to prove it. A robust suite of automated tests, however, is of no value until there exists software to test. At the end of the day, if you have to delete the latest build of the software or the latest build of the test material, only one of them leaves you with an avenue of possible income the following morning.

I don't work in that kind of place; I work in a place where about half the budget and about half the people are allocated to testing, and where testing is a mammoth undertaking considered, by every level of management, to be core to the release process. It's nice, and I pity those in less fortunate circumstances.

The purpose of software testing is to provide an answer to exactly one question:

Can we release it?

There are three Answers to the Big Single Question: "No, because it hasn't been tested properly yet"; "No, because it has been tested and found wanting"; and "Yes".

In other words, testing creates nothing concrete. It just provides a recommendation. Creating and running tests means I'm not contributing to the user-facing output of the department. There's less pride in my work. There's less of a feeling of accomplishment when a million people end up using the thing that I made sure was good than there would be if they were using the thing that I made. It's even worse when they're using something that, I know for a fact, isn't as good as I wanted it to be.

This is not so much of a problem when people respect the Answer. That comes partially from a sound culture of testing (check - see above) and partially from personal authority: the ability to consistently produce robust tests and reliable metrics, and to uncover well-hidden defects (also check).

There are times when it feels like I'm holding everybody back by uncovering defects in other people's work. Everything I say slows the project's work down even further. This is what I think people are thinking: "We'd be done by now if you'd stop complaining about nitpicks and broken things! So what if the thing can't handle a directory name with a space in it? Stop being such a perfectionist." The software tester is the goalkeeper of software engineering: the only reason anybody pays attention to me is when something has gone wrong.

However, this is a problem of attitude. It's also a bad choice of metaphor.

Unlike a goalkeeper, a software tester holds absolutely no responsibility for bad figures; only for incomplete or unreliable testing. So the thing keels over in a highly specific and unlikely edge case? Who introduced the code which sank that build? Developers. Who's gotta fix it? Developers. Whom am I doing a favour by finding this stuff out early? Everybody in the entire company. I'll describe a famous graph. It measures the dollar cost of fixing a defect depending on the stage of development at which it is discovered. The figure increases by a factor of ten at each stage, from unit test, through functional test, system test and ship test, up to "user encounters issue in the field, raises helpdesk issue, cancels contract, switches to competitor's product". Good software is still good software, even in the absence of testing; but bad software is also bad software, whether tests have found the badness or not. The defect was always there. If I found it and raised it early, I did my job. I covered the project's collective backside.
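
The shape of that graph is easy enough to sketch. Assuming, as the graph does, a tenfold increase at each stage, and picking an arbitrary ten-dollar baseline purely for illustration:

    # Rough illustration of the defect-cost curve described above: a tenfold
    # increase at each stage, from an arbitrary $10 baseline at unit test.
    stages = ["unit test", "functional test", "system test", "ship test", "in the field"]
    for i, stage in enumerate(stages):
        print(f"{stage:>15}: ${10 * 10 ** i:,}")

    # unit test: $10, functional test: $100, system test: $1,000,
    # ship test: $10,000, in the field: $100,000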

Robust testing separates the adults from the children. It separates software on which we can build a business from software we can't give away.

Software testing begins at design time. I want to be at every development-related meeting from conception onwards. Why? One: the earlier testing begins, the more I can get done. Purely test-driven development is impractical or even impossible in many scenarios, but some tests (and certainly many test plans) can be constructed before a line of code is written. And frequently a single line in the feature specification, like "and we'll email the result to the user", can double the amount of testing required.
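
To illustrate (the feature and all the names here are hypothetical, not taken from any real project): that one line about emailing the result already fans out into several distinct cases, and every one of them can be drafted before the sending code exists.

    import pytest

    def email_result_to_user(address: str, result: str) -> bool:
        """Hypothetical feature from the paragraph above. Deliberately not yet
        implemented: the point is that its tests can be drafted first."""
        raise NotImplementedError

    # Drafted before the feature is written. Until the code exists, every one
    # of these fails, which is exactly what a test-first workflow expects.
    @pytest.mark.parametrize("address, result, expected", [
        ("user@example.com", "42", True),    # happy path
        ("", "42", False),                   # no address supplied
        ("not-an-email", "42", False),       # malformed address
        ("user@example.com", "", False),     # nothing to send
    ])
    def test_email_result_to_user(address, result, expected):
        assert email_result_to_user(address, result) == expected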

Two: the better I understand what we're trying to do, the more robust the tests will be. Developers can't be trusted to clearly explain what was decided after the fact. That's just Chinese Whispers. Developers change their minds about what they're going to do while they're doing it. They don't document these changes and frequently don't inform anybody about them - unless there's somebody at their elbow trying to keep up and providing continuous feedback as to how broken their code is. And developers think about software in terms of how they're going to build it. Testers think about software in terms of how they're going to break it. Both viewpoints are critical, but typical users more closely resemble the second picture.

Three: I can advise ways to make the software easier to test.
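
A typical suggestion of that kind (my example, not something from this article) is to let tests substitute the awkward parts - the mail server, the clock, the filesystem - instead of hard-wiring them into the code:

    # Sketch of a testability suggestion: take the mail transport as a
    # parameter rather than reaching out to a real SMTP server, so tests can
    # pass in a fake. All names here are illustrative.

    class FakeMailer:
        """Test double that records messages instead of sending them."""
        def __init__(self):
            self.sent = []

        def send(self, address: str, body: str) -> None:
            self.sent.append((address, body))

    def email_result(mailer, address: str, result: str) -> None:
        # Production code passes a real SMTP-backed mailer; tests pass FakeMailer.
        mailer.send(address, f"Your result: {result}")

    def test_email_result_records_message():
        mailer = FakeMailer()
        email_result(mailer, "user@example.com", "42")
        assert mailer.sent == [("user@example.com", "Your result: 42")]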

Testing cannot be boxed out separately from development. They are inextricable from one another; they must proceed simultaneously.

Software testing is hard.

There is a serious technical challenge here. For starters, a battery of automated tests can be as large as, or larger than, the software under test (this is perfectly permissible as long as the complexity of the test infrastructure is substantially lower). It can even be complex enough to require automated sanity tests of its own. It almost certainly has to be as portable as the software is. In my department that amounts to about 60 distinct hardware/OS environments, 80% of which I had never heard of before joining the company.
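
As a small example of both points (invented for illustration, not drawn from the department described here): the helpers a suite leans on deserve sanity tests of their own, and platform-specific checks have to declare which environments they actually apply to.

    import sys
    import pytest

    def normalise_newlines(text: str) -> str:
        """Helper a hypothetical suite uses to compare output captured on
        different platforms."""
        return text.replace("\r\n", "\n")

    # A sanity test for the test infrastructure itself: if this helper is
    # wrong, every verdict that depends on it becomes unreliable.
    def test_normalise_newlines():
        assert normalise_newlines("a\r\nb\r\n") == "a\nb\n"
        assert normalise_newlines("a\nb") == "a\nb"

    # Portability: declare which environments a check applies to, rather than
    # letting it fail confusingly on platforms it was never meant to cover.
    @pytest.mark.skipif(sys.platform == "win32", reason="POSIX path semantics only")
    def test_posix_specific_behaviour():
        assert "/".join(["a", "b"]) == "a/b"  # placeholder for a genuinely POSIX-specific check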

Some behaviours of a piece of software can be extremely difficult to reach and test, and some can be, for practical purposes, unreachable and untestable. Even when this isn't the case, there is never, ever enough time to test everything.

Nor is there ever enough time to fix everything. Our project is so big that a certain (albeit impressively low) number of defects is permissible in a final release. This goes against every instinct I have. It's software, damn it. It's mathematics that runs on a machine. There's no excuse for shooting lower than perfection.

I have no problem with defective builds. That's the natural state of a build. Largely defect-free releases are statistical flukes. But sometimes the latest build has so much wrong with it that I can't even run all of the tests that need to be run in order to find everything that's wrong with it. And knowing that the build is defective is cause for concern, but not knowing how defective the build is is alarming.

And when unknowably-defective builds seem to appear on a regular basis, and when defects are being uncovered in perfectly mundane no-brainer features, I start to wonder what else is broken that I won't ever have time to test.

