Advanced Analytics for Caching

Sascha ~ June 9, 2020

Tags: Caching, Analytics

In recent months our analytics dashboard has received a fresh new look. More than just the surface has changed, though: the whole stack has been re-engineered from scratch. This new stack allowed us to add several new features, such as more comprehensive metrics and the ability to set alerts. Now we are finally ready to introduce the first advanced feature our new analytics stack can offer: cache introspection.

Cache Introspection

This new feature allows you to introspect a cache to get key-based stats. It can help you answer questions such as “which keys are most valuable to cache?” or “are there sets of keys that occupy space but are never fetched?”. It can even show you the performance of different parts of your cache.

The cache introspection dashboard currently has three views that show hot keys, per-prefix stats, and a log of recent requests.

Hot keys

This view shows you the most hit, missed, and mutated keys in the last few minutes. The most hit keys show you where you are doing a good job at caching and which cached items are important in your application.

The most missed keys can be more interesting. It is common practice in caching to store a value after it has been missed, so in a well-implemented caching strategy no key should be missed frequently. If a key is missed more than once, it was either evicted or never stored. If a key is missed, stored, evicted, and then missed again within a few minutes, your cache is probably not big enough. If a key is missed very frequently, you are probably not storing it at all. That usually points to a bug in your code, and you need to figure out why the key is not being stored.
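The "store the value after it has been missed" practice is often called cache-aside. Below is a minimal sketch of it, assuming a tiny in-memory stand-in for a cache client. `DictCache` and `get_or_compute` are hypothetical names made up for illustration and are not part of any real client library.

```python
from typing import Callable, Dict, Optional


class DictCache:
    """Tiny in-memory stand-in for a real cache client (hypothetical)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        value = self._data.get(key)
        if value is None:
            self.misses += 1  # count misses so the effect is visible
        return value

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def get_or_compute(cache: DictCache, key: str, compute: Callable[[], str]) -> str:
    """Return the cached value, storing it after a miss (cache-aside)."""
    value = cache.get(key)
    if value is None:
        value = compute()      # the expensive work happens only on a miss
        cache.set(key, value)  # without this line the key would miss every time
    return value


cache = DictCache()
for _ in range(3):
    get_or_compute(cache, "user:42:profile", lambda: "rendered profile")
print(cache.misses)  # only the first lookup misses
```

If the `cache.set` call were missing, every lookup would be a miss, which is the pattern a very frequently missed key would show in the hot keys view.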

The most mutated keys can also be interesting. While certain values, such as counters, are expected to change frequently, sometimes the most mutated keys reveal a bug. For example, if a key is obviously not a counter and has no reason to change frequently, it is probably being stored over and over without any need. Finding and removing such unnecessary store requests will conserve your bandwidth and speed up your application.
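To make the three rankings concrete, here is a rough sketch of how most hit, missed, and mutated keys could be tallied from a request log. The `(key, outcome)` record format and the `hot_keys` function are assumptions for illustration only. How the dashboard computes its rankings internally is not described here.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def hot_keys(log: Iterable[Tuple[str, str]], top: int = 3) -> Dict[str, List[Tuple[str, int]]]:
    """Rank keys by number of hits, misses, and mutations.

    `log` holds (key, outcome) pairs where outcome is "hit", "miss", or
    "set"; this record format is invented for the example.
    """
    counters = {"hit": Counter(), "miss": Counter(), "set": Counter()}
    for key, outcome in log:
        if outcome in counters:  # ignore outcomes this sketch does not track
            counters[outcome][key] += 1
    return {outcome: counter.most_common(top) for outcome, counter in counters.items()}


log = [
    ("page:home", "hit"), ("page:home", "hit"),
    ("user:7", "miss"), ("user:7", "miss"), ("user:7", "miss"),
    ("config", "set"), ("config", "set"), ("page:home", "set"),
]
ranking = hot_keys(log)
print(ranking["miss"])  # "user:7" is missed repeatedly, so it is probably never stored
```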

Prefix stats

Since we created our first analytics dashboard years ago, we have always put performance front and center by highlighting the hit rate over time, the most important figure when assessing the performance of a cache. With prefix stats you can now drill deeper into the performance of your cache by analysing the hit rate for different aspects of your caching strategy.

To start seeing prefix stats you need to watch a prefix. To give you a place to start, we show you the prefixes that were most active recently. Watching a prefix starts the data gathering process, and after a while you should see the hit rate for keys with this prefix as well as the weight of the prefix, i.e., the fraction of items and memory it is currently consuming.
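As a rough illustration of the two figures, the sketch below computes a hit rate and a weight for one prefix. It assumes the hit rate is hits divided by all get requests for matching keys, and that the weight is a pair of fractions (items, memory). The input formats and function names are hypothetical and do not reflect how the dashboard gathers its data.

```python
from typing import Dict, Iterable, Tuple


def prefix_hit_rate(requests: Iterable[Tuple[str, bool]], prefix: str) -> float:
    """Hit rate of get requests whose key starts with `prefix`.

    `requests` holds (key, was_hit) pairs; this log format is made up.
    """
    hits = total = 0
    for key, was_hit in requests:
        if key.startswith(prefix):
            total += 1
            hits += int(was_hit)
    return hits / total if total else 0.0


def prefix_weight(item_sizes: Dict[str, int], prefix: str) -> Tuple[float, float]:
    """Return (fraction of items, fraction of memory) held by `prefix`.

    `item_sizes` maps every stored key to its size in bytes.
    """
    total_bytes = sum(item_sizes.values())
    if not item_sizes or total_bytes == 0:
        return 0.0, 0.0
    matching = [size for key, size in item_sizes.items() if key.startswith(prefix)]
    return len(matching) / len(item_sizes), sum(matching) / total_bytes


requests = [("session:a", True), ("session:b", True),
            ("session:c", False), ("template:home", True)]
item_sizes = {"session:a": 100, "session:b": 100,
              "template:home": 800, "template:about": 1000}

print(prefix_hit_rate(requests, "session:"))   # 2 hits out of 3 gets
print(prefix_weight(item_sizes, "session:"))   # half the items, a tenth of the memory
```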

Clicking on a watched prefix will reveal more details and historical data. You can see how the hit rate, weight, and activity have changed over time. This view also allows you to see key stats for the most active keys in this prefix. Last but not least, we also suggest some frequently used sub-prefixes you could add to your watchlist.

All in all, watching prefixes can reveal important aspects of your caching strategy. First and foremost, it highlights the performance of individual parts of your cache, e.g., how template or session caching is performing in relation to the rest of the cache. Along the same lines, it could show you a part of your cache that might benefit from preemptive caching. It also has the potential to expose bugs, e.g., a prefix you cache but never fetch. Finally, you might also discover prefixes that are not worth caching, i.e., a prefix that uses a large fraction of memory but accounts for only a small fraction of items and has a low hit rate.
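The last two checks can be turned into a simple rule of thumb. The sketch below is only one possible reading of them: the `PrefixStats` record, the `review_prefix` function, and both thresholds (`low_hit_rate`, `heavy_ratio`) are assumptions chosen for illustration, not values used or recommended by the dashboard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrefixStats:
    prefix: str
    hit_rate: float         # hits / (hits + misses), between 0.0 and 1.0
    item_fraction: float    # share of all cached items
    memory_fraction: float  # share of all cache memory
    gets: int               # get requests seen for this prefix


def review_prefix(stats: PrefixStats,
                  low_hit_rate: float = 0.2,
                  heavy_ratio: float = 2.0) -> List[str]:
    """Return findings for a watched prefix (thresholds are arbitrary)."""
    findings = []
    if stats.gets == 0 and stats.item_fraction > 0:
        # Items are stored under this prefix but nobody reads them.
        findings.append("stored but never fetched: possible bug")
    elif (stats.hit_rate < low_hit_rate
          and stats.memory_fraction > heavy_ratio * stats.item_fraction):
        # Few, large items that are rarely served from the cache.
        findings.append("large items with a low hit rate: may not be worth caching")
    return findings


print(review_prefix(PrefixStats("report:", 0.05, 0.02, 0.30, 500)))
print(review_prefix(PrefixStats("tmp:", 0.0, 0.10, 0.10, 0)))
print(review_prefix(PrefixStats("session:", 0.95, 0.50, 0.40, 10000)))
```

The right thresholds depend on your workload, so treat the findings as prompts for a closer look at the prefix rather than as verdicts.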

Pro tip: we only suggest the prefixes that were most active recently. However, less active prefixes might also be interesting. For example, looking at the most missed keys might hint at a prefix that, while less active, highlights a bug in your caching implementation.

Recent requests

Lastly, our introspection feature can show some of your recent requests (the last 100 per server). This is mostly useful when you need to test or debug your cache.

Feedback

Our new analytics stack can do much more, so this is just the beginning. We have a few ideas for additional features, but we would like to hear from you: what features would you like? If you have an idea or need analytics beyond what our dashboard offers, let us know. Also, if it is unclear to you how to use the introspection features or what they can do for your cache specifically, contact us.