Choosing a Cache: CapabilitiesOctober 26, 2021
Today, I'd like to provide some help on how to choose a cache solution. I will organize it into two parts:
- In this post, we will list what features a cache must have and which ones it can optionally provide. Most criteria are general and can be used regardless of the tech stack, while a couple is specific to the JVM.
- In the second part, I'll list providers and verify their respective capabilities
First, let's bust a common myth. Using a cache is not the sign of a badly-designed system per se, though it might be the case. Like in many design decisions, a cache is a trade-off.
My favorite example is an e-commerce shop implemented via a microservice architecture. Each capability is a micro-service:
Now, imagine that the user has items in their cart and clicks to checkout. Server-side, the checkout service sends a request to the pricing service to get a quote for the cart. At this point, we have two requirements:
- Pricing data must be available: if the pricing service is down, the checkout will fail as well as the sale.
- Pricing data must be available fast: if the service is up, but the user waits too long, they may give up. While the term "long" is subjective, a 100ms latency has a definitive impact on sales.
From a scientific point of view, wrong data is terrible. From an e-commerce one, it's better to sell at a slightly outdated price than to lose sales.
Hence, caching is a trade-off where you accept stale data to have them available/fast.
Mandatory cache features
You are probably familiar with the quote, "Don't roll your own cryptography library". It hints that designing such a library may look simple at first glance, but chances are you'll make a significant security mistake if you're not a security expert - and even so. You shouldn't design your own cache either, but for slightly different reasons.
You might think that a cache is just an In-Memory Key-Value Store. It's what a hashtable data structure precisely is. Depending on the language, the structure has a different name: map in Go, dictionaries in Python,
HashMap in Java and Rust, Hash in Ruby, etc. Whatever the stack, we can model a cache with such structures.
As a junior developer, I believed it as well, but I've changed my mind since then. A professional cache provides additional features that a mere hashtable doesn't.
Let's start with a simple feature.
The longer an application stays up, the bigger its cache will potentially grow. Depending on the exact usage, e.g., if one caches many entries with different keys, it can grow even more. An unbounded cache will compete with your application regarding memory usage, up to the point where no memory will be available anymore. That's something you want to avoid!
When a cache has hit its size limit, which entry do we remove when a new entry arrives? Choosing the entry to remove is known as the eviction strategy. A couple of such strategies are pretty widespread:
|Name||Evict the entry...||Requirement|
|Random||... at random|
|FIFO||... that was first added to the cache||Keep hashtable entries in a linked list to support insertion-order traversal|
|LIFO||... that was _last_ added to the cache||Keep hashtable entries in a stack structure to support reverse insertion-order traversal|
|LRU||... that was the least recently accessed||
|LFU||... that is the least frequently accessed||
You can find other possible strategies on Wikipedia.
You might know about the quote:
There are two hard things in computer science:
- Cache invalidation
- Naming things
- And off-by-one errors
It relates to how long the cache considers an entry valid before it removes it. When you add an entry to the cache, you should set the duration after it becomes stale.
A possible implementation is to add a field to each entry: the timestamp when the entry will expire (current time + TTL). A thread may occasionally visit entries and remove the expired ones eagerly. Alternatively, the cache may evict the expired entries lazily when it needs more space.
Other criteria are optional but still worthy of consideration. Here they are, in no particular order.
While configuration is not a feature, it impacts the developer experience. As such, it should be a part of any analysis regarding the choice of a cache.
Some cache may be able to run out-of-the-box with sensible defaults, but others may require explicit configuration. You probably need to configure a couple of parameters in all cases, such as the size limit.
Two options are possible: file-based configuration and programmatic configuration. Of course, a third option is to provide both.
Integration with cache abstractions
The JVM ecosystem has an official Cache API, known as JCache, or JSR 107. It's a specification with an API that describes four annotations, i.e.,
@CacheRemoveAll. Vendors are to implement the specification.
The Spring framework is pretty widespread in the JVM ecosystem. It also provides a caching API. Historically, it predates JCache. While different, the API is very similar to JCache's. Spring offers out-of-the-box integration code for a couple of caches, while a couple of others do provide Spring integration.
I've described several caching patterns in the following talk.
Because it's pretty long, here's a summary:
Generally, people start with Cache-Aside, i.e., the application orchestrates the reads/writes between the cache and the source of truth. However, a cache's true power lies in the more advanced patterns.
Distributed vs. local
Early caches shared the same runtime as the application. Then, architects designed caches that ran in their process. In parallel, you can choose from single-node and distributed caches, caches made out of nodes belonging to the same cluster.
The idea behind a distributed cache is to pool multiple nodes together to appear as a single storage unit. If one needs more storage, one adds more nodes. It's the principle behind horizontal scaling.
While conceptually simple, it opens a lot of new options. For example, you can replicate entries across several nodes so that the failure of a node doesn't mean data loss. Another possible capability is to put entries on specific nodes based on a property of the entry: this capability is known as sharding. This way, finding an entry becomes faster as the cache doesn't need to request each node for data but knows which node the data is located on. Of course, the cluster can provide both replication and sharding.
In addition, with a big enough storage capacity, one can use the cache as an in-memory database. A cache is a key-value store: the usual use case is to retrieve an entry by its key. Historically, databases have had a much larger scope, and provided querying capabilities, i.e.,
SELECT * FROM Foo. Hence, a distributed cache can also offer such capabilities via a dedicated API or a SQL-like syntax.
Once the cache is able to pool memory across nodes, it can pool CPUs as well. At this point, the cache has become a data grid. One can send tasks, which the cluster executes in parallel across its nodes. Most importantly, the cache can ensure the tasks run close to the data they access, eliminating network traffic.
With a distributed cache, the architecture is client-server: the application is the client; the cache is the server. To maximize your investments, you will probably want to share your data across multiple clients. Clients can belong to different languages, Java and JVM languages, but also others: C#, C, C++, Ruby, Python, Go, Rust, Erlang, etc. It's worth checking which bindings the cache offers regarding the languages you're using.
Of course, none of these features are "free". A distributed cache is a distributed system and comes with all their pitfalls. One important criterion to look into is how nodes form a cluster over the network. For example:
- Is there an auto-discovery mechanism?
- If yes, can it be disabled?
- Can you configure more than one cluster on the same network?
- Does it work on Kubernetes?
The goal of a cache is to improve performance, as it's faster to access local in-memory data than data on disk or over the network. If data access takes long, either reads or writes, blocking slows down the whole client code. To solve this issue, caches can provide a non-blocking API.
You need to consider several aspects; the most important one is the API used by the cache. For example,
CompletableFuture requires Java 8. Depending on the stack you're using, you might favor a cache that integrates with RxJava, Project Reactor, Kotlin coroutines, or any combination thereof.
Standard project's health indicators
Besides all criteria mentioned above, I'd advise you to consider indicators that should be part of every product evaluation:
- License: mainly Open Source vs. commercial, but depending on your usage, not every Open Source license is compatible
- Pricing, if commercial
- Project maturity: check the project's inception date. The reasoning is that you can probably rely more on an "old" project than on one created yesterday.
- Number of core committers - the Bus Factor
- Number of non-core committers
- Committers' commit history
- Number of open issues
- Median time to fix them
- If a core committer stopped working on the project
- Documentation: while this term is very generic, good documentation is made of reference material, tutorials, how-to guides and explanations. If you have never stumbled upon these terms before, read this page or watch this video. I only started to get documentation after watching the talk.
- Community: how large is it? How active? How helpful?
- Support: what are the support channels? Stackoverflow? Google Groups? Slack? How often do questions get an answer?
In this post, I described several criteria on which to base your choice of cache provider. I'll attempt to list and compare the most common Open Source cache providers in the JVM ecosystem in the next post.
Many thanks to my colleague Marko Topolnik for his review.
To go further:
- Wikipedia's page on cache
- List of cache replacement policies
- A Guided Tour of Caching Patterns (video)
- Spring's Cache Abstraction
Originally published at A Java Geek on October 24th, 2021