
Solving Flaky Tests in RSpec

 2 years ago
source link: https://flexport.engineering/solving-flaky-tests-in-rspec-9ceadedeaf0e

Increasing the reliability of Flexport’s test suite with the new Quarantine gem.

Flaky tests are an unavoidable nuisance at every company, and Flexport is no exception. With the recent growth of our Engineering Team, our monolith’s test suite has ballooned from 14k to 16k tests in just the past four months. Throughout this growth, our master branch success rate has dropped to 70%, with flaky tests responsible for 50% of all failures. These tests are flaky for many reasons, ranging from non-deterministic assertions, such as expect(db_query).to eq([1, 2, 3]) on a query whose result order is not guaranteed, to shared mutable data between tests. With the enormous influx of code being committed every day, our test suite quickly grew out of control and we needed a solution.
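To make the ordering problem concrete, here is a minimal plain-Ruby sketch. The db_query method below is a hypothetical stand-in for a query without a guaranteed row order, and the two helper methods are illustrative names, not part of RSpec. In RSpec itself, an order-insensitive matcher such as match_array or contain_exactly plays the role of the second helper.

```ruby
# Hypothetical stand-in for a database query with no guaranteed ordering:
# the same rows come back every time, but in an arbitrary order.
def db_query(rng = Random.new)
  [1, 2, 3].shuffle(random: rng)
end

# Order-sensitive comparison, the plain-Ruby equivalent of
# expect(rows).to eq(expected). It passes only when the order happens to match.
def order_sensitive_match?(rows, expected)
  rows == expected
end

# Order-insensitive comparison: passes whenever the same elements are
# present, regardless of the order in which they were returned.
def order_insensitive_match?(rows, expected)
  rows.sort == expected.sort
end

rows = db_query
puts "order-sensitive:   #{order_sensitive_match?(rows, [1, 2, 3])}"
puts "order-insensitive: #{order_insensitive_match?(rows, [1, 2, 3])}"
```

Run repeatedly, the first line flips between true and false while the second stays true, which is exactly the pattern a flaky test shows in CI.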

Illustrations by Bailey McGinn

Why are flaky tests a problem?

Flaky tests caused many of our engineers to lose faith in the reliability of our test suite. Many engineers were unsure whether tests were failing because of their changes or because of flaky tests, resulting in countless hours spent rebuilding and debugging code unrelated to their feature. This was an enormous source of frustration for our engineers and directly impacted our engineering velocity.

Another headache caused by flaky tests was our abysmal master pass rate. Each failing master build required exhausting manual investigation and diagnosis from the infrastructure team, wasting valuable development time. In some cases, engineers might have dismissed a failing test in a build as flaky and disabled it, only to realize later that it was a legitimate failure that went on to cause production errors. This may have become a common occurrence, since it is only human nature to ignore alarms when there is a history of false signals coming from a system (Google Testing Blog, 2016). As a result, flaky tests have serious implications in terms of time and resources and can directly impact production reliability.

Flexport’s solution

To combat flaky tests, we created Quarantine, an open-source Ruby gem that maintains a list of flaky tests to be skipped at runtime. Before test execution, Quarantine downloads the list of all known flaky tests and prevents them from being run. During test execution, Quarantine automatically retries failing tests. If a test passes after previously failing in the same build, it is marked as flaky and added to the list of quarantined tests. The gem aims to automate the flaky test workflow and create a quicker feedback loop for keeping the test suite in a pristine state.
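The workflow described above can be sketched in plain Ruby. This is a minimal illustration of the skip, retry, and quarantine logic only; the QuarantineSketch class, its method names, and the max_retries parameter are hypothetical and are not the gem's actual API. The list of known flaky tests is passed in directly rather than downloaded.

```ruby
require "set"

# Hypothetical sketch of the quarantine workflow; NOT the Quarantine gem's API.
class QuarantineSketch
  attr_reader :quarantined, :results

  # quarantined_ids: IDs of tests already known to be flaky. In the real
  # workflow this list is downloaded before the run; here it is passed in.
  def initialize(quarantined_ids, max_retries: 1)
    @quarantined = Set.new(quarantined_ids)
    @max_retries = max_retries
    @results = {}
  end

  # Runs one test. The block is the test body and returns true on a pass.
  def run(test_id)
    # Known flaky tests are skipped without ever executing the block.
    return @results[test_id] = :skipped if @quarantined.include?(test_id)

    # A pass on the first attempt needs no further handling.
    return @results[test_id] = :passed if yield

    # The first attempt failed, so retry. A pass on a retry means the test
    # both failed and passed in the same build, so it is quarantined.
    @max_retries.times do
      if yield
        @quarantined << test_id
        return @results[test_id] = :flaky
      end
    end

    # Every attempt failed: treat it as a legitimate failure.
    @results[test_id] = :failed
  end
end

runner = QuarantineSketch.new(["spec/known_flaky_spec.rb"])
attempts = 0
runner.run("spec/known_flaky_spec.rb") { true }
runner.run("spec/intermittent_spec.rb") { (attempts += 1) > 1 }
runner.run("spec/broken_spec.rb") { false }
puts runner.results.inspect
```

Note that a test which fails on every attempt is reported as a failure rather than quarantined, so consistent breakage still surfaces to the engineer who caused it.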

How has Quarantine changed our test suite?

Overall, we believe Quarantine has had a positive impact on our development lifecycle. So far, we have quarantined over 60 flaky tests, and our master build success rate has improved from 70% to 95%. Metrics aside, Quarantine has had the following impact on engineering velocity and developer experience.

  • Now when tests fail during CI builds, engineers can be fairly certain it’s their code causing breaking changes and not flaky tests
  • With less noise in failing builds, it is much quicker to investigate failing master builds, which helps us recognize legitimate failures
  • Quicker turn-around time between flaky test detection, disabling, and resolution
  • A centralized location to view flaky tests, instead of xit calls scattered randomly across the code base
Master build statistics (April 8 to April 15)

Managing Quarantine and Possible Pitfalls

At first, the notion of automatically skipping a set of tests may seem scary. If mismanaged, quarantining tests can potentially lead to disabling important tests, and in turn, fatal production errors. To mitigate this risk, you can disable Quarantine in development branches to ensure builds get the original code coverage. At Flexport, we are considering a variety of options ranging from giving tests a grace period before being quarantined to adding immediate alerting to the team owning the flaky test. In the end, it is important to evaluate the maturity of your test suite and determine what is more important for your code base: test suite stability or code coverage.
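Two of the mitigations mentioned above, restricting Quarantine to certain branches and giving tests a grace period, could be expressed as simple policy checks. The sketch below is only one possible interpretation: the method names, the choice of "master" as the only branch where skipping is active, and the 24-hour time-based grace period are all assumptions for illustration, not Flexport's implementation or the gem's configuration options.

```ruby
# Hypothetical policy helpers; names and default values are assumptions.

# Skip quarantined tests only on branches where stability matters most.
# Development branches run every test and keep the original coverage.
def quarantine_enabled?(branch, protected_branches: ["master"])
  protected_branches.include?(branch)
end

# One reading of a "grace period": quarantine a test only once it has
# remained unfixed for grace_seconds after first being detected as flaky.
def past_grace_period?(first_flaky_at, now: Time.now, grace_seconds: 24 * 60 * 60)
  now - first_flaky_at >= grace_seconds
end

puts quarantine_enabled?("master")
puts quarantine_enabled?("feature/new-checkout")
puts past_grace_period?(Time.now - 3600)
```

A count-based grace period (for example, quarantining only after several flaky observations) would be an equally reasonable design; the time-based version is shown only because it is the shortest to write down.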

Looking Forward

Currently, we are in the process of adding a variety of extra tools to help make the Quarantine experience all the more seamless. This includes:

  • Automatic Jira ticket creation
  • Slack alerting
  • Un-quarantining tests on Jira ticket completion
  • Greater configurability on quarantine options

We are also curious to hear how other teams have approached flaky tests. Feel free to reach out here or on GitHub with your ideas and feedback on our approach. Similarly, if you would like Quarantine to support your Ruby stack, or want to contribute to the gem, don’t be shy and fork our repository!

