
John Fremlin's blog: Bad unit tests impart a false sense of security

source link: http://john.freml.in/bad-unit-tests

Posted 2016-06-21 10:45:00 GMT

Testing improves software. So much so that a lack of unit tests is called technical debt, and blanket statements from celebrated engineers, like "Any programmer not writing unit tests for their code in 2007 should be considered a pariah", are uncontroversial. When a defect is noticed in software it's easy to say it could have been found by better testing, and often it's simple to add a test that would catch its recurrence. Done well, tests can be very helpful. However, they can also be harmful: in particular, when they cause people to be overly confident about their understanding of the consequences of a change.

A good test
— covers the code that runs in production
— tests behaviour that actually matters
— does not fail for spurious reasons or when code is refactored
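
As a hypothetical sketch of the contrast (the function and tests below are invented for illustration, not taken from any real project), compare a test that pins down behaviour callers rely on with one that fails for spurious reasons:

```python
import unittest

def summarise(orders):
    """Hypothetical production function: count and total of an order list."""
    return {"count": len(orders), "total": sum(orders)}

class GoodTest(unittest.TestCase):
    # Covers the behaviour production callers actually rely on; survives
    # any refactoring that keeps the contract.
    def test_count_and_total(self):
        self.assertEqual(summarise([10, 20]), {"count": 2, "total": 30})

class BrittleTest(unittest.TestCase):
    # Fails for spurious reasons: the exact repr string depends on dict key
    # order, an incidental detail no caller relies on, so merely reordering
    # the dict literal during a refactor breaks this test.
    def test_repr(self):
        self.assertEqual(str(summarise([10, 20])),
                         "{'count': 2, 'total': 30}")

if __name__ == "__main__":
    unittest.main()
```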

For example, I made a change to the date parsing function in Wine. Here, adding a unit test to record the externally defined behaviour is uncontroversial.
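
The Wine change itself is in C; as a rough Python analogue of that kind of test (the function name here is invented), the expected value comes from the specification's own example rather than from running the implementation, so the test records externally defined behaviour:

```python
import unittest
from datetime import datetime

def parse_http_date(text):
    """Parse an RFC 1123 date such as "Sun, 06 Nov 1994 08:49:37 GMT"."""
    return datetime.strptime(text, "%a, %d %b %Y %H:%M:%S %Z")

class HttpDateTest(unittest.TestCase):
    # The expected value is taken from the RFC, not from the code, so the
    # test cannot simply enshrine whatever the implementation happens to do.
    def test_rfc1123_example(self):
        self.assertEqual(parse_http_date("Sun, 06 Nov 1994 08:49:37 GMT"),
                         datetime(1994, 11, 6, 8, 49, 37))

if __name__ == "__main__":
    unittest.main()
```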

Tests do take time. The Microsoft paper suggests that they add about 15-35% to development time. If correctness is not a priority (and it can be reasonable for it not to be), then adding automatic tests can be a bad use of resources: the chance of the project surviving might be low and depend only on a demo, so taking on technical debt is actually the right choice. More importantly, tests take time from other people: if some subjective and unimportant behaviour is enshrined in a test, the poor people who come later to modify the code will suffer. This is especially true for engineers who aren't confident making sweeping refactorings, for whom adding or removing a parameter from an internal function turns into a tiresome project. The glib answer is not to accept contributions from these people, but that's really sad: it means rejecting people from diverse backgrounds with specialised skills (just not fluent coding) who would otherwise contribute meaningfully.

Unit tests in particular can enshrine a sort of circular thinking: a test is defined as the observed behaviour of a function, without asking whether that behaviour is the right behaviour. For example, this change I made to Pandas involved changing more test code than real code that people will use. That balance of effort means less time is spent improving the behaviour itself.
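
A hypothetical sketch of that circularity (the function and its bug are invented, not taken from the Pandas change):

```python
def sample_variance(values):
    """Hypothetical helper: meant to return the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n   # bug: should divide by n - 1

# Circular test: the expected value below was produced by running
# sample_variance() itself and pasting its output back in. The test
# passes, quietly enshrining the wrong denominator, and whoever later
# fixes the bug must also argue with the test.
def test_sample_variance():
    assert sample_variance([1.0, 2.0, 3.0]) == 2.0 / 3.0
```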

In my experience, the worst effect of automatic tests is the shortcut they give to engineers: the belief that a change is correct if the tests pass. Without tests, it's obvious that one must think hard about the correctness of a change and try to validate it; with tests, it's easy to rationalise skipping that validation. In this way, bugs are shipped to production that would have been easy to catch just by running the software once in a setting closer to production.

It's hard to write a good test, and so much easier to write a bad test that is tautologically correct and avoids all behaviour relevant to production. These bad tests are easy to skip over in code review, as they're typically boring to read, but they give a warm fuzzy feeling that things are being tested when they're not. Rather than counting raw test coverage as a metric, a better measure would be test coverage of the real code paths that run in production; unfortunately, the two are not the same thing. False confidence from irrelevant tests measurably reduces reliability.
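
That metric can be made concrete. A minimal sketch, assuming you already have the set of lines executed by the test suite (from a coverage tool) and per-line execution counts from production (from a sampling profiler); the data shapes and names here are invented:

```python
def production_weighted_coverage(test_covered, production_hits):
    """Fraction of production line executions that also ran under test.

    test_covered: set of (filename, line) pairs executed by the test suite.
    production_hits: dict mapping (filename, line) to production execution count.
    """
    total = sum(production_hits.values())
    tested = sum(count for line, count in production_hits.items()
                 if line in test_covered)
    return tested / total if total else 0.0

# A suite can score high on raw coverage yet low here, if it exercises
# cold paths while missing the hot ones that production actually runs.
tests = {("app.py", 10), ("app.py", 11), ("util.py", 5)}
prod = {("app.py", 10): 5, ("app.py", 42): 95}   # line 42 is hot and untested
print(production_weighted_coverage(tests, prod))  # -> 0.05
```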

