Testing

Three Key Elements to Incorporate into Your Flaky Test Remediation Approach

August 29, 2023
5 min read

Table of Contents

1. Deploy best practice strategies
2. Align your process & resources
3. Understand the causes of flaky tests
Conclusion & next steps

Flaky tests pose substantial challenges due to their unpredictable and inconsistent nature. Addressing them effectively requires a multi-faceted approach that combines strategy, process and resource alignment, and a deep understanding of what causes tests to be flaky. This post walks you through that approach.

Note! This post is part of a three-part series. If you’re not sure it’s worth remediating flaky tests, read Part 1: Seven Reasons You Should Not Ignore Flaky Tests. To understand the keys to identifying and tracking flaky tests, read Part 2: 5 Ways to Use Gradle Enterprise to Identify and Manage Flaky Tests. Now, let’s explore my multi-faceted approach to fixing flaky tests.

1. Deploy best practice strategies

Once you have identified which of your tests are flaky, you can use one of these strategies to mitigate the problems they cause.

Screenshot of Gradle Enterprise test failures dashboard

1. Quarantine Flaky Tests: Isolate flaky tests to prevent them from disrupting the development process and distracting developers from genuine failures. Once quarantined, these tests can be analyzed separately, freeing developers to focus on legitimate failures.

2. Improve Error Reporting: You may need more information to find the cause of a test failure. You can get it by adding assertions, checking preconditions, and logging more details about the test environment and state.

3. Retry with Care: While retrying can be a useful tool in identifying flaky tests, it’s not a strategy for solving the problem. Retrying until a test passes masks the intermittency and wastes resources in CI and locally.

4. Commit to Fixing Flaky Tests: Once you’ve tracked down the flaky tests, and perhaps improved the error reporting and quarantined them so they no longer interfere with your team’s productivity, the goal should be to fix the tests themselves.
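To illustrate the difference between retrying to identify flakiness and retrying to hide it, here is a minimal sketch in plain Java. The `FlakyClassifier` class and its `Outcome` enum are hypothetical names invented for this example and are not part of Gradle Enterprise or any test framework. Instead of stopping at the first pass, it always runs every attempt and reports mixed results as flaky.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: reruns a check to classify it, not to force a pass. */
final class FlakyClassifier {

    enum Outcome { PASSED, FAILED, FLAKY }

    private FlakyClassifier() {}

    /**
     * Runs the check exactly `attempts` times.
     * All passes -> PASSED, all failures -> FAILED, a mix -> FLAKY.
     */
    static Outcome classify(BooleanSupplier check, int attempts) {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be at least 1");
        }
        int passes = 0;
        for (int i = 0; i < attempts; i++) {
            boolean passed;
            try {
                passed = check.getAsBoolean();
            } catch (RuntimeException | AssertionError e) {
                // A thrown exception or failed assertion counts as a failed attempt.
                passed = false;
            }
            if (passed) {
                passes++;
            }
        }
        if (passes == attempts) {
            return Outcome.PASSED;
        }
        return passes == 0 ? Outcome.FAILED : Outcome.FLAKY;
    }
}
```

A check that passes on some attempts and fails on others is reported as `FLAKY` rather than quietly counted as a pass, so the intermittency stays visible and can be tracked.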

You can also read about how the Gradle Build Tool team handles flaky tests.

2. Align your process & resources

If you don’t want to rely on a few individuals with the discipline and determination to fix your flaky tests, you’ll need to implement some process changes to make sure time is allocated to fixing the problems.

When a developer commits a change that breaks a test, the developer or developers who worked on that change usually start working on fixing that test. This is a well-accepted approach to fixing breaking changes, and it applies equally to tests that start to fail intermittently.

If your application already has a number of flaky tests that aren’t owned by a developer, you may want to schedule regular Flaky Test Days. These dedicated sessions not only aim to decrease the number of flaky tests in your test suite, they also emphasize the importance of addressing test flakiness, and foster a culture of collective responsibility toward improving test reliability.

3. Understand the causes of flaky tests

The causes of test intermittency are varied and nuanced, as discussed by Dave Farley in his video, 5 Reasons Automated Tests Fail, and collated in a research paper on the impact and causes of intermittent tests. Each test may be a unique case, but you may also find that one cause of intermittency affects multiple tests.

Here are some common causes of test intermittency. Note that these categories can overlap, but considering each failure from one of these angles may lead to identifying a fix for the failure.

1. Concurrency, Asynchronous Programming, and Waiting: Asynchronous and concurrent programming pose specific challenges to testing. Tests often have to wait for events to happen before taking the next steps or may run into race conditions in either the test code or production code.

There may be environmental factors in these failures too, since tests may time out more frequently if the test environment is under a high load.

Screenshot of the time taken to run the test over time
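One common way to make waiting less sensitive to load is to poll for a condition with a generous timeout instead of sleeping for a fixed time. The following is a minimal sketch under that assumption; `Waits` and `waitUntil` are hypothetical names for this example, and libraries exist that offer the same idea with more features.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: wait for a condition instead of using a fixed sleep. */
final class Waits {

    private Waits() {}

    /**
     * Polls the condition until it is true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000;
            if (remainingMillis <= 0) {
                return false;
            }
            try {
                // Sleep for the poll interval, but never past the deadline.
                Thread.sleep(Math.max(1, Math.min(pollInterval.toMillis(), remainingMillis)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }
}
```

A test would then call something like `Waits.waitUntil(queue::isEmpty, Duration.ofSeconds(5), Duration.ofMillis(20))`, where `queue` is whatever the test is observing. The test returns as soon as the condition holds on a fast machine, and the timeout only comes into play when the environment is slow.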

2. Environment, Network, and Resources: Variations in testing environments or network conditions, as well as insufficient compute resources, can result in inconsistent test behavior.

Gradle Enterprise can help you identify some of these issues—it will show details about the environment the tests ran in so you can compare the test results from different environments.

Screenshot of build scan's infrastructure page

3. Integration Points: Tests depending on external systems or services (integration points) may be flaky due to the unpredictable nature of these dependencies. This includes other services from inside your organisation, as well as third-party libraries and APIs or systems that are external to your organisation.

Integration tests against external systems are valuable since they can tell you if your assumptions about the system are correct. However, tests that are designed to run against systems that can change without warning should be kept separate from the main test suite due to their inherently uncertain behaviour.

The main test suite, meanwhile, should protect itself from these integration points by mocking or stubbing their expected behaviour.

Screenshot of results of integration test running
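As a rough illustration of stubbing an integration point, the sketch below hides an external service behind a small interface so the main test suite can supply a fixed answer. `RateProvider` and `PriceConverter` are hypothetical names invented for this example. A real project might use a mocking library instead of a hand-written stub.

```java
import java.util.Optional;

/** Hypothetical port to an external exchange-rate service. */
interface RateProvider {
    Optional<Double> rateFor(String currency);
}

/** Production code depends on the interface, not on the remote service itself. */
final class PriceConverter {

    private final RateProvider rates;

    PriceConverter(RateProvider rates) {
        this.rates = rates;
    }

    double convert(double amount, String currency) {
        double rate = rates.rateFor(currency)
                .orElseThrow(() -> new IllegalStateException("No rate available for " + currency));
        return amount * rate;
    }
}
```

In the main test suite, a stub such as `new PriceConverter(currency -> Optional.of(2.0))` returns the same rate on every run, so the test does not fail when the real service is slow or unavailable. Tests against the real service can then live in a separate suite.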

4. Setup/Teardown and Test Data: Test results can only be predictable if the start state of the test and end state of the test are also predictable. If the tests rely on shared state, shared data, or shared resources (like a database), this can be a contributor to intermittency in the tests. It’s key to make sure the tests run in isolation so they don’t impact the data from other tests.

Even when the data is isolated from other tests, you may still run into unpredictable results if your test or production code depends on values that are randomly generated or derived from the current date and time. You may want to inject a custom provider of random values or date/time into your production code so that you can control these values from the test.

Screenshot of a test failure probably caused by two tests using the same resources
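For date and time, one way to inject a provider is the standard `java.time.Clock` class. The sketch below assumes a hypothetical `TrialPeriod` class, invented for this example, that takes its clock as a constructor argument. The same idea applies to random values, for instance by passing in a `java.util.Random` created with a fixed seed.

```java
import java.time.Clock;
import java.time.LocalDate;

/** Hypothetical production class that asks an injected Clock for "today". */
final class TrialPeriod {

    private final Clock clock;
    private final LocalDate endsOn;

    TrialPeriod(Clock clock, LocalDate endsOn) {
        this.clock = clock;
        this.endsOn = endsOn;
    }

    /** Expired once the clock's current date is after the end date. */
    boolean isExpired() {
        return LocalDate.now(clock).isAfter(endsOn);
    }
}
```

Production code can pass `Clock.systemDefaultZone()`, while a test passes `Clock.fixed(Instant.parse("2023-08-29T12:00:00Z"), ZoneOffset.UTC)` so that "today" is the same on every run, regardless of when or where the test executes.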

5. System Behavior: While it’s easy to assume it’s some problem with the environment or test data that’s causing an intermittent failure, sometimes the problems lie in the production code of your application.

For example, test environments can sometimes trigger genuine race conditions in concurrent code or uncover bugs in third-party libraries. These can sometimes be the most difficult issues to identify but are arguably the most important reason to address flaky tests.
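To make the idea of a genuine race condition concrete, here is a small sketch of the kind of production bug an intermittent test might be exposing. `HitCounter` is a hypothetical class invented for this example. It keeps one plain `int` counter, where concurrent increments can be lost, next to an `AtomicInteger`, where they cannot.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical counter with a racy field and a thread-safe field side by side. */
final class HitCounter {

    private int unsafeHits = 0;                                  // plain field: updates can be lost
    private final AtomicInteger safeHits = new AtomicInteger();  // atomic: no lost updates

    void recordUnsafe() { unsafeHits++; }
    void recordSafe() { safeHits.incrementAndGet(); }

    int unsafeHits() { return unsafeHits; }
    int safeHits() { return safeHits.get(); }

    /** Increments both counters from several threads and waits for them to finish. */
    static HitCounter hammer(int threads, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    counter.recordUnsafe();
                    counter.recordSafe();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return counter;
    }
}
```

A test asserting that `unsafeHits()` equals `threads * hitsPerThread` may pass on a lightly loaded machine and fail on a busy CI agent. In that situation the flakiness points at a real bug in the production code, and the fix belongs there rather than in the test.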

Conclusion & next steps

Efficient management of flaky tests is a combination of strategic actions, process changes, and a deep understanding of root causes.

By weaving these elements together, your team can effectively navigate the challenges posed by flaky tests, ensuring the delivery of high-quality, reliable software.

Trisha Gee

Author

Engineer, author, keynote speaker, developer champion, catalyst. Developer Advocate @ Gradle.
