Revamped Status Page

We are happy to announce some wide-ranging improvements to our status page. While it looks the same on the surface, it has been rewritten from scratch and is now much more tightly integrated with our monitoring infrastructure. For our customers, this has three main consequences.

Fully automated status updates

Status changes are now fully automated. Over the last few months we have carefully tuned the sensitivity of the status page to the point where we are now happy with its accuracy and responsiveness. While we continue to improve the algorithms that monitor our infrastructure, the status page reflects the health of our infrastructure better than ever.

Better differentiated status

We now differentiate between increased latency and reachability for the status of the proxies in our clusters. For convenience we use a simple traffic light approach: green means all proxies in a cluster are up and running and responding quickly enough, yellow means one or more proxies in a cluster are responding more slowly than usual, and red means that our monitoring infrastructure is unable to reach the proxy at all.
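To make the traffic light rule concrete, here is a minimal sketch of how per-proxy checks could be reduced to one color per cluster. It is an illustration only, not our monitoring code: the `ProxyCheck` type, the `cluster_status` function and the `SLOW_LATENCY_MS` threshold are hypothetical names and values, and the sketch assumes a single unreachable proxy is enough to turn a cluster red.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Status(Enum):
    GREEN = "green"    # every proxy reachable and responding quickly enough
    YELLOW = "yellow"  # at least one proxy responding more slowly than usual
    RED = "red"        # at least one proxy cannot be reached at all


@dataclass
class ProxyCheck:
    reachable: bool
    latency_ms: float  # ignored when the proxy is unreachable


# Hypothetical threshold; the real cutoff is not stated in this post.
SLOW_LATENCY_MS = 50.0


def cluster_status(checks: Iterable[ProxyCheck]) -> Status:
    """Reduce per-proxy checks to a single traffic-light color."""
    checks = list(checks)
    # Reachability problems outrank latency problems.
    if any(not c.reachable for c in checks):
        return Status.RED
    if any(c.latency_ms > SLOW_LATENCY_MS for c in checks):
        return Status.YELLOW
    return Status.GREEN
```

Checking reachability first means a cluster with both a slow proxy and an unreachable proxy shows red rather than yellow.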

“Recently Resolved”

Previous support incidents have revealed that customers often visit our status page after an incident has already been resolved. They are then puzzled because all our clusters show green statuses when they have clearly just had trouble connecting to our servers. To avoid this confusion, we now flag such clusters as “Recently Resolved” and highlight them with the color blue.
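The "Recently Resolved" rule can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function name `display_color` and the one-hour `RECENT_WINDOW` are hypothetical, since this post does not say how long a cluster stays flagged.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical window; the post does not say how long the flag is shown.
RECENT_WINDOW = timedelta(hours=1)


def display_color(current: str, last_resolved: Optional[datetime], now: datetime) -> str:
    """Return the color to show for a cluster.

    `current` is the live traffic-light color: "green", "yellow" or "red".
    `last_resolved` is when the most recent incident ended, if there was one.
    """
    if current != "green":
        return current  # an ongoing problem always takes precedence
    if last_resolved is not None and now - last_resolved <= RECENT_WINDOW:
        return "blue"  # healthy now, but flagged as "Recently Resolved"
    return "green"
```

Only a cluster that is currently green can turn blue, so the flag never hides an ongoing problem.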

All these changes are also reflected in the status RSS feed. Please let us know if you think these changes are useful or what other features you would like to see next. We have an ongoing program of improvements in system monitoring and fault diagnosis, but we’re always very happy to hear ideas from customers that would make their experience using MemCachier better!

New Security Features

We have recently added some additional features to MemCachier. In brief, you can now have multiple sets of credentials for a cache, can rotate credentials, and can restrict some capabilities on a per-credential basis. These features are all controlled from the Credentials panel of the MemCachier dashboard for your cache:

[Image: Credentials panel]

The features described in this article are also explained in the MemCachier documentation.

Credentials

Caches may now have multiple sets of username/password credentials. The credentials for a cache are listed in the table in the Credentials panel on your cache’s analytics dashboard. New credentials for a cache are created using the Add new credentials button, and credentials may be deleted using the X button next to the credentials row.

One set of credentials is distinguished as primary, while the rest are secondary credentials. For caches provided through our add-on with a third-party provider (e.g., Heroku or AppHarbor), the primary credentials are linked to the configuration variables stored with the provider. That is, these are the credentials you see when you look at your application’s environment. Secondary credentials can be promoted to primary using the up-arrow button next to the credentials row. When secondary credentials are promoted to primary, the username and password for the credentials being promoted are pushed to the provider’s configuration variables.

In practice, this means that you can rotate the credentials for a MemCachier cache associated with a Heroku app by creating a new set of credentials and promoting them to primary. This will push the new username and password to the MEMCACHIER_USERNAME and MEMCACHIER_PASSWORD configuration variables in your Heroku app, and trigger a restart so your application will pick up the new MemCachier credentials.

This ability to have both a primary and many secondary credentials lets you rotate your credentials with zero downtime!
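Rotation only works smoothly if your application reads its credentials from the environment at startup rather than hard-coding them. A minimal sketch of that pattern follows. The helper name `memcachier_settings` is hypothetical, and the `MEMCACHIER_SERVERS` variable is an assumption, since this post only mentions the username and password variables.

```python
import os
from typing import Mapping, Optional


def memcachier_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect MemCachier connection settings from the environment."""
    env = os.environ if env is None else env
    # MEMCACHIER_SERVERS is an assumption: it is not mentioned in this post.
    servers = [s for s in env.get("MEMCACHIER_SERVERS", "").split(",") if s]
    return {
        "servers": servers,
        # These two are the variables updated when credentials are promoted.
        "username": env["MEMCACHIER_USERNAME"],
        "password": env["MEMCACHIER_PASSWORD"],
    }
```

Because the values are read on every start, the restart triggered by a promotion is enough for the app to pick up the new credentials, and the resulting dictionary can be passed to whichever memcache client you use.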

Capabilities

We’ve further enhanced multiple credentials by adding a security extension to our memcache implementation: capabilities! These control the operations available to a client once they authenticate with the chosen credentials.

Each set of credentials for a cache has associated write and flush capabilities. By default, new credentials have both capabilities, meaning that the credentials can be used to update cache entries and flush the cache via the memcached API.

You can restrict a client to read-only access by switching off the write capability for its credentials, and prevent a client from using the memcached API to flush your cache by switching off the flush capability.

Per-credential capabilities are managed using the checkboxes on each credential row in the Credentials panel: if the checkbox is checked, the credential has the capability; if not, the capability is disabled for that credential.
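Conceptually, the capability check behaves like the sketch below. This is a simplified model for illustration, not our server code: the `Credential` type, the `is_allowed` function and the way commands are grouped under each capability are all assumptions.

```python
from dataclasses import dataclass

# Commands grouped by the capability they require. This grouping is an
# illustrative guess, not MemCachier's actual list.
WRITE_COMMANDS = {"set", "add", "replace", "append", "prepend", "incr", "decr", "delete"}
FLUSH_COMMANDS = {"flush_all"}


@dataclass
class Credential:
    username: str
    can_write: bool = True  # new credentials have both capabilities by default
    can_flush: bool = True


def is_allowed(cred: Credential, command: str) -> bool:
    """Decide whether a client authenticated with `cred` may run `command`."""
    if command in FLUSH_COMMANDS:
        return cred.can_flush
    if command in WRITE_COMMANDS:
        return cred.can_write
    return True  # reads such as "get" need no capability
```

In this model a credential with both boxes unchecked is read-only: `get` succeeds, while `set` and `flush_all` are refused.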

Dashboard SSO rotation

The MemCachier dashboard for your cache is accessed via a persistent unique URL containing a cache-specific single-sign-on hash. For security, you may wish to rotate this cache-specific hash to generate a new unique URL for the cache dashboard. The Rotate SSO secret button in the Credentials panel of the dashboard does this hash rotation. Pressing this button generates a new hash and redirects the dashboard to the resulting new unique URL.

Security @ MemCachier

We value security and hope that you find these new features valuable to improve your own practices. As always, for a guide to MemCachier’s own internal security practices, see our documentation.

Flush Command Logging for Heroku

We are happy to announce a new feature for our Heroku customers. In the past we have had several requests from customers who wanted to know why their caches had been flushed. To help our clients find out how a flush command came about, we now push a log message to the Heroku log whenever a cache is flushed.

The log message contains the hostname of the proxy the flush command was executed on as well as its origin. The origin can be one of the following:

  • memcache client: The flush command originates from a normal memcached client.
  • web interface: This happens when a client clicks “Flush Cache” on the analytics dashboard.
  • admin operation: MemCachier flushed the cache on your behalf. This can occur when you perform operations like switching from one cluster to another.

We hope you find it useful! We’ll be looking to add more information to the Heroku log in the future and would love suggestions and feedback on this.

ASCII Protocol Support

We’re happy to announce the general availability of a new feature on MemCachier: the ability to connect to our servers using the memcache ASCII protocol! Our official documentation covers it here.

Previously, this wasn’t supported as the ASCII protocol provides no standardized way to do authentication. We added our own simple scheme. When you establish a connection, the first command you must send to us is a set with your username as the key and your password as the value. For example:

$ telnet 35865.1e4cfd.us-east-3.ec2.prod.memcachier.com 11211

> set 15F38e 0 0 32
> 52353F9F1C4017CC16FD348B982ED47D
> STORED

The STORED line indicates successful authentication. After this, you can issue all the familiar memcache ASCII protocol commands! You will need to send the set command fairly quickly though, as otherwise we time out the connection.
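If you are writing or adapting a client, the handshake is easy to script. The sketch below builds the authenticating set command and checks for the STORED reply using only Python's standard library. The helper names `build_auth_command` and `authenticate` are our own for this example and not part of any client library, and the sketch assumes the username and password contain no whitespace or control characters.

```python
import socket


def build_auth_command(username: str, password: str) -> bytes:
    """Encode the authenticating `set`: key = username, value = password."""
    value = password.encode("utf-8")
    # Format: set <key> <flags> <exptime> <bytes>, then the data block.
    header = f"set {username} 0 0 {len(value)}\r\n".encode("utf-8")
    return header + value + b"\r\n"


def authenticate(sock: socket.socket, username: str, password: str) -> bool:
    """Send the auth command and report whether the server answered STORED."""
    sock.sendall(build_auth_command(username, password))
    reply = b""
    while not reply.endswith(b"\r\n"):
        chunk = sock.recv(64)
        if not chunk:  # connection closed before a full reply arrived
            return False
        reply += chunk
    return reply == b"STORED\r\n"
```

To use it, open a connection with `socket.create_connection((host, 11211), timeout=5)` and call `authenticate` immediately, before sending any other command.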

Be careful to use the ASCII protocol only over the internal network of the IaaS provider you are hosting MemCachier with (e.g., AWS EC2). Why? The connection is unencrypted, and those networks are secure against eavesdropping, so it works fine there. If you are running over the Internet, you’ll want to use our TLS support to secure the connection.

Beyond letting you run ad-hoc queries yourself using telnet or similar, supporting the ASCII protocol also expands the number of clients and frameworks you can use. Just make sure to modify them to send the set command for authentication on first connection. As an example, we created a Django caching backend that uses the ASCII protocol but works with MemCachier!

New Analytics Dashboard and Cache Migration

We are happy to announce a new design for the analytics dashboard and the ability to move your cache from one cluster to another in emergencies.

 


In August, we had the biggest incident in MemCachier history, which affected two clusters on Amazon in the US-East-1 region simultaneously. During our post-mortem analysis we realized that, in extreme cases like these, a significant number of customers would prefer to move to a different cluster even if it means losing the contents of their cache.

As part of a wide range of improvements to better handle such extreme outages in the future, we’ve built, and decided to expose to customers, the ability to move their cache to a different cluster.

The feature can be found in the redesigned analytics dashboard and is available to all paying customers. When used, the cache will be moved to the least loaded cluster in the same region and data in the old cluster will be flushed. The transition is seamless, meaning the configuration of the memcache client does not change. However, the DNS record change might take up to 3 minutes to propagate. It is important to note that this feature is meant to be used only in emergencies! Please always refer to http://status.memcachier.com to see if the cluster you are on is experiencing problems.

We are also changing how we create DNS records for customers as part of this change. While you used to be able to tell what cluster you were on from the DNS records, now the analytics dashboard will display this information.

Besides serving as a last resort during downtime, this feature is also useful when you create a second cache and want to ensure your two caches are on independent clusters. If the two caches happen to be on the same cluster, simply move one!

Rest assured that we’ve improved, and continue to improve, our ability to handle extreme outages without your involvement or loss of any cached data. But it’s always nice to have a last resort.