Reaper 3.0 for Apache Cassandra is available

March 18, 2022

The K8ssandra team is pleased to announce the release of Reaper 3.1. Let’s dive into the features and improvements that 3.0 recently introduced (along with some notable removals) and how the newest update to 3.1 builds on that.

JDK11 support

Starting with 3.1.0, Reaper can compile and run with JDK 11. JDK 8 is still supported at runtime.

Storage backends

Over the years, we regularly discussed dropping support for Postgres and H2 with The Last Pickle (TLP) team, now part of DataStax, the organization leading the open-source development of Reaper. Despite our lack of expertise in Postgres, the effort required to maintain support for these storage backends was moderate as long as Reaper’s architecture was simple. However, complexity grew with more deployment options, culminating with the addition of the sidecar mode.

Some features require different consensus strategies depending on the backend, which sometimes led to implementations that worked well with one backend and were buggy with others.

In order to allow building new features faster, while providing a consistent experience for all users, we decided to drop the Postgres and H2 backends in 3.0.

Apache Cassandra and the managed DataStax Astra DB are now the only production storage backends for Reaper. The free tier of Astra DB will be more than sufficient for most deployments.

Reaper does not generally require high availability – even complete data loss has mild consequences. Where Astra is not an option, a single Cassandra server can be started on the instance that hosts Reaper, or an existing cluster can be used as a backend data store.

Adaptive Repairs and Schedules

One of the pain points we observed when people start using Reaper is understanding the segment orchestration and knowing how the default timeout impacts the execution of repairs.

Repair is a complex choreography of operations in a distributed system. As such, and especially in the days when Reaper was created, the process could get blocked for several reasons and required a manual restart. The smart folks who designed Reaper at Spotify decided to put a timeout on segments to deal with such blockage: a segment that exceeds the timeout is terminated and rescheduled.

Problems arise when segments are too big (or have too much entropy) to process within the default 30-minute timeout, despite not being blocked. They are repeatedly terminated and recreated, and the repair appears to make no progress.

Reaper handled this poorly for two main reasons:

Each retry used the same timeout, so an oversized segment could fail forever
Nothing obvious was reported to explain what was failing and how to fix the situation

We fixed the former by using a longer timeout on subsequent retries, which is a simple trick to make repairs more “adaptive”. If the segments are too big, they’ll eventually pass after a few retries. It’s a good first step to improve the experience, but it’s not enough for scheduled repairs as they could end up with the same repeated failures for each run.
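
The retry behaviour described above can be sketched as follows. This is only an illustration: the article says the timeout gets longer on subsequent retries but does not give the formula, so the linear growth used here, and the names segment_timeout and attempts_needed, are assumptions rather than Reaper's actual implementation.

```python
from datetime import timedelta
from typing import Optional

# Default segment timeout mentioned in the article.
BASE_TIMEOUT = timedelta(minutes=30)


def segment_timeout(fail_count: int, base: timedelta = BASE_TIMEOUT) -> timedelta:
    """Timeout for the next attempt of a segment that already failed fail_count times.

    Assumption: the timeout grows linearly with the number of failed attempts.
    """
    if fail_count < 0:
        raise ValueError("fail_count must be non-negative")
    return base * (fail_count + 1)


def attempts_needed(required: timedelta,
                    base: timedelta = BASE_TIMEOUT,
                    max_attempts: int = 10) -> Optional[int]:
    """Number of attempts before a segment needing `required` time fits in its timeout.

    Returns None if it still does not fit after max_attempts attempts.
    """
    for fail_count in range(max_attempts):
        if segment_timeout(fail_count, base) >= required:
            return fail_count + 1
    return None
```

Under this assumed growth, a segment that needs 75 minutes fails with the 30 and 60 minute timeouts and passes on its third attempt, which is the "eventually pass after a few retries" behaviour described above.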

This is where we introduce adaptive schedules, which use feedback from past repair runs to adjust either the number of segments or the timeout for the next repair run.

Adaptive schedules will be updated at the end of each repair if the run metrics justify it. The schedule can get a different number of segments or a higher segment timeout depending on the latest run.

The rules are the following:

If more than 20% of the segments were extended, the number of segments on the schedule will be raised by 20%.
If fewer than 20% of the segments were extended (but at least one), the timeout will be set to twice the current timeout.
If no segment was extended and the maximum segment duration is below 5 minutes, the number of segments will be reduced by 10%, with a minimum of 16 segments per node.
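
As a rough illustration, the three rules can be written as a single function. The Schedule type, the field names, the rounding direction, and the handling of exactly 20% extended segments (treated here like the second rule) are assumptions made for this sketch; the article does not specify them.

```python
from dataclasses import dataclass, replace

MIN_SEGMENTS_PER_NODE = 16  # floor stated in the third rule


@dataclass(frozen=True)
class Schedule:
    """Hypothetical stand-in for the adaptive settings of a repair schedule."""
    segments_per_node: int
    timeout_minutes: int


def adapt(schedule: Schedule, total_segments: int,
          extended_segments: int, max_segment_minutes: float) -> Schedule:
    """Return the schedule settings to use for the next run."""
    if total_segments <= 0:
        raise ValueError("total_segments must be positive")

    # Rule 1: more than 20% of segments extended -> 20% more segments.
    # Integer arithmetic avoids floating point surprises; rounds up (assumption).
    if extended_segments * 5 > total_segments:
        raised = (schedule.segments_per_node * 12 + 9) // 10
        return replace(schedule, segments_per_node=raised)

    # Rule 2: at least one segment extended, but not more than 20% -> double the timeout.
    if extended_segments > 0:
        return replace(schedule, timeout_minutes=schedule.timeout_minutes * 2)

    # Rule 3: nothing extended and every segment under 5 minutes -> 10% fewer segments,
    # never going below the minimum (rounds down, assumption).
    if max_segment_minutes < 5:
        reduced = max(MIN_SEGMENTS_PER_NODE, schedule.segments_per_node * 9 // 10)
        if reduced < schedule.segments_per_node:
            return replace(schedule, segments_per_node=reduced)

    # Otherwise the run metrics do not justify a change.
    return schedule
```

For example, a schedule with 64 segments per node and a 30 minute timeout would move to 77 segments if a quarter of the segments were extended, to a 60 minute timeout if a tenth were, and to 57 segments if none were and the slowest segment took 3 minutes.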

This feature is disabled by default and is configurable on a per schedule basis. The timeout can now be set differently for each schedule, from the UI or the REST API, instead of having to change the Reaper config file and restart the process.

Incremental Repair Triggers

As we celebrate the long-awaited improvements in incremental repairs brought by Cassandra 4.0, it is time to embrace them with more appropriate triggers. One metric that incremental repair makes available is the percentage of repaired data per table. When running against too much unrepaired data, incremental repair can put a lot of pressure on a cluster due to the heavy anti-compaction process.

The best practice is to run it on a regular basis so that the amount of unrepaired data is kept low. Since your throughput may vary from one table/keyspace to the other, it can be challenging to set the right interval for your incremental repair schedules.

Reaper 3.0 introduced a new trigger for the incremental schedules, which is a threshold of unrepaired data. This allows creating schedules that will start a new run as soon as, for example, 10% of the data for at least one table from the keyspace is unrepaired.

Those triggers are complementary to the interval in days, which could still be necessary for low traffic keyspaces that need to be repaired to secure tombstones.
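
A minimal sketch of how the two triggers can combine is shown below. The function name, its parameters, and the idea of passing per-table percentages as a mapping are hypothetical and only illustrate the decision described above; they are not Reaper's API.

```python
from datetime import datetime, timedelta
from typing import Mapping, Optional


def should_start_run(unrepaired_percent_by_table: Mapping[str, float],
                     threshold_percent: Optional[float],
                     last_run: Optional[datetime],
                     interval: Optional[timedelta],
                     now: datetime) -> bool:
    """Decide whether an incremental repair schedule should start a new run.

    A run starts if at least one table of the keyspace has reached the
    unrepaired threshold, or if the interval since the last run has elapsed.
    """
    threshold_hit = threshold_percent is not None and any(
        percent >= threshold_percent
        for percent in unrepaired_percent_by_table.values()
    )
    interval_hit = interval is not None and (
        last_run is None or now - last_run >= interval
    )
    return threshold_hit or interval_hit
```

With a 10% threshold and a 7 day interval, a busy keyspace is repaired as soon as one of its tables crosses the threshold, while a low traffic keyspace that never reaches it is still repaired once a week.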

Percent unrepaired triggers — Figure 2: Setting interval for incremental repairs.

These new features will allow you to safely optimize tombstone deletions by enabling the only_purge_repaired_tombstones compaction subproperty in Cassandra, which lets you reduce gc_grace_seconds down to three hours without the concern that deleted data will reappear.
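
For illustration, the helper below renders the kind of CQL statement that applies both settings to a table. The helper name, the keyspace and table names in the usage note, and the choice of SizeTieredCompactionStrategy as the default are assumptions for this sketch. Keep in mind that setting the compaction map replaces the table's existing compaction options, so any other subproperties already in use would need to be included as well.

```python
THREE_HOURS_IN_SECONDS = 3 * 60 * 60  # 10800


def purge_repaired_tombstones_cql(keyspace: str, table: str,
                                  compaction_class: str = "SizeTieredCompactionStrategy",
                                  gc_grace_seconds: int = THREE_HOURS_IN_SECONDS) -> str:
    """Render an ALTER TABLE statement (hypothetical helper, not part of Reaper).

    It enables only_purge_repaired_tombstones and lowers gc_grace_seconds.
    """
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH compaction = {{'class': '{compaction_class}', "
        f"'only_purge_repaired_tombstones': 'true'}} "
        f"AND gc_grace_seconds = {gc_grace_seconds};"
    )
```

Calling purge_repaired_tombstones_cql("my_ks", "my_table") yields a statement that can be reviewed and then run through cqlsh against the table in question.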

Schedules can be edited

That may sound like an obvious feature but previous versions of Reaper didn’t allow for editing of an existing schedule. This led to an annoying procedure where you had to delete the schedule (which isn’t made easy by Reaper either) and recreate it with the new settings.

Version 3.0 fixed that embarrassing situation and added an edit button to schedules, which allows you to change their mutable settings:

Edit Repair Schedule — Figure 3: Reaper now has the ability to edit the settings for scheduled actions.

CVE fixes

With the release of Reaper 3.1.0, we were able to fix more than 80 reported CVEs by upgrading several dependencies to more current versions:

Dropwizard 2.0.25
Shiro 1.8.0
SnakeYAML 1.29
Netty 4.1.70.Final
Cassandra Java Driver 3.11.0
Jersey 2.33
Prometheus Simple Client 0.12.0

This allows Reaper to be more secure and future proof as it now enables us to migrate from the deprecated dropwizard-cassandra bundle to the officially supported one, along with upgrading the Cassandra driver to the latest 4.x.

More improvements

In order to protect clusters from running mixed incremental and full repairs in older versions of Cassandra, Reaper would disallow the creation of an incremental repair run/schedule if a full repair had been created on the same set of tables in the past (and vice versa).

Now that incremental repair is safe for production use, it is necessary to allow such mixed repair types. In case of conflict, Reaper 3.0 displays a pop-up informing you and allowing you to force create the schedule/run:

Figure 4: Force bypass schedule conflict. Reaper now shows a pop-up that informs you of a conflict and lets you force create the schedule/run.

We’ve also added a special “schema migration mode” for Reaper, which exits once the schema has been created or upgraded. We use this mode in K8ssandra to prevent schema conflicts and to run the schema creation in an init container, which is not subject to the liveness probes that could otherwise terminate the Reaper pod prematurely:

java -jar path/to/reaper.jar schema-migration path/to/cassandra-reaper.yaml

There are many other improvements and we invite all users to check the changelog in the GitHub repo.

Upgrade Now

We encourage all Reaper users to upgrade to 3.1.0, and we recommend that users of Postgres or H2 carefully prepare their migration away from those backends. Note that there is no export/import feature, so schedules will need to be recreated after the migration.

All instructions to download, install, configure, and use Reaper 3.1.0 are available on the Reaper website.

Let us know what you think of Reaper 3.1 by joining us on the K8ssandra Discord or K8ssandra Forum today. For exclusive posts on all things data, follow DataStax on Medium.

Author: Alexander Dejanovski, Software Engineer at DataStax
