Deploy a Ruby on Rails Application on AWS Elastic Beanstalk and scale it with Memcache

Sascha ~ May 6, 2019

Tags: AWS, Elastic Beanstalk, Rails, Tutorial

Want to deploy a Ruby on Rails application on AWS Elastic Beanstalk that is ready to scale? We’ll explore how to set up your Elastic Beanstalk environment, hook it up to a database, deploy your application, and finally how to use Memcache to speed it up.

We’ll walk you through creating the application from start to finish, but you can view the finished product source code here.

Memcache is a technology that improves the performance and scalability of web apps and mobile app backends. The results of complex database queries, expensive calculations, or slow calls to external resources can be stored in Memcache and then retrieved via fast O(1) lookups. Even for small sites, Memcache can make page loads snappy and help future-proof your app.

Prerequisites

Familiarity with Ruby (and ideally some Rails).
Ruby and git installed.
An AWS account. If you haven’t used AWS before, you can set up an account here.
The AWS CLI installed and configured on your computer.
The EB CLI installed on your computer.

Install Rails and Bundler

We install Bundler version 1.16.6 because this is what Elastic Beanstalk uses at the time of writing. If you want to collect a number of different and sometimes cryptic error messages from Elastic Beanstalk, feel free to experiment with different Bundler versions.

$ gem install bundler -v '1.16.6'
$ gem install rails

Create a Rails application for Elastic Beanstalk

Use the rails command to generate your app skeleton:

$ rails new rails_memcache
$ cd rails_memcache/

Run the server with

$ rails server

and visit the page on http://localhost:3000/ to encounter a happy Rails family.

Configure the Database

Before creating an Elastic Beanstalk app, we need to set up the database. In your Gemfile, change the line that reads:

gem 'sqlite3'

to

group :development do
  gem 'sqlite3'
end

group :production do
  gem 'mysql2'
end

To update your Gemfile.lock file, run

$ bundle install --without production

The --without production option will prevent the mysql2 gem from being installed locally.

We also need to configure the production database in the config/database.yml. Replace the production configuration with the following:

production:
  adapter: mysql2
  encoding: utf8
  database: <%= ENV['RDS_DB_NAME'] %>
  username: <%= ENV['RDS_USERNAME'] %>
  password: <%= ENV['RDS_PASSWORD'] %>
  host: <%= ENV['RDS_HOSTNAME'] %>
  port: <%= ENV['RDS_PORT'] %>

Don’t worry about these environment variables as they will be set up automatically once we create the Elastic Beanstalk app.
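To see why the environment variables are all we need, here is a minimal sketch of how the ERB tags in a file like config/database.yml can be resolved: the tags are evaluated first, and the result is parsed as YAML. Rails does something along these lines when it loads the file. The RDS_* values below are placeholders we made up for illustration, not real credentials.

```ruby
require 'erb'
require 'yaml'

# Placeholder values (assumptions for illustration). On Elastic Beanstalk
# these are provided by the environment once the database is attached.
ENV['RDS_DB_NAME']  = 'ebdb'
ENV['RDS_USERNAME'] = 'admin'
ENV['RDS_HOSTNAME'] = 'example.rds.amazonaws.com'
ENV['RDS_PORT']     = '3306'

template = <<~YAML
  production:
    adapter: mysql2
    database: <%= ENV['RDS_DB_NAME'] %>
    username: <%= ENV['RDS_USERNAME'] %>
    host: <%= ENV['RDS_HOSTNAME'] %>
    port: <%= ENV['RDS_PORT'] %>
YAML

# First evaluate the ERB tags, then parse the resulting YAML.
config = YAML.safe_load(ERB.new(template).result)
puts config['production']['host']
```

If one of the variables is missing, the corresponding entry simply ends up empty, which is why the app only connects once the environment has been created.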

Create an Elastic Beanstalk app

First, commit your application skeleton (Note: if you use an older version of Rails you might need to create a Git repository first with git init):

$ git add .
$ git commit -m "Initial rails app."

Second, initialize an Elastic Beanstalk application in your repository:

$ eb init -p 'ruby-2.5-(puma)' rails-memcache --region us-east-1

This will set up a new application called rails-memcache. Feel free to use a different region.

Make sure to use the same Ruby version as in your local development environment, as Elastic Beanstalk is quite finicky about the versions of Ruby and Bundler. To make matters worse, Elastic Beanstalk does not use an up-to-date Bundler, which might cause the following cryptic error:

can't find gem bundler (>= 0.a) with executable bundle (Gem::GemNotFoundException)

The ruby-2.5-(puma) Elastic Beanstalk stack uses Ruby 2.5.5 and Bundler 1.16.6, so this is what we used (down to the patch version). If you prefer, you can also update Bundler on EB.

Third, we’ll create an environment to run our application in:

$ eb create rails-env -db.engine mysql -db.i db.t2.micro

Notice that we’re adding a MySQL database to our EB environment. You’ll be prompted for a username and password for the database. You can set them to whatever you like.

Be careful when choosing your password. AWS does not handle symbols very well (! $ @ etc.), and they can cause some unexpected behavior. Stick to letters and numbers, and make sure it’s at least eight characters long.
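If you’d like to check a candidate password before typing it into the prompt, a small sketch like the following encodes the rule of thumb above. The helper name safe_db_password? is our own invention, and the rule is the one from this tutorial, not an official AWS validation.

```ruby
# Hypothetical helper: letters and digits only, at least eight characters.
def safe_db_password?(password)
  password.match?(/\A[A-Za-z0-9]{8,}\z/)
end

p safe_db_password?('s3cretpassw0rd')  # letters and digits, long enough
p safe_db_password?('p@ss!')           # contains symbols and is too short
```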

This will create an AWS Relational Database Service (RDS) instance that is associated with this application. When you terminate this application, the database instance will be destroyed as well. If you need an RDS instance that is independent of your Elastic Beanstalk application, create one via the AWS RDS interface.

This configuration process will take about five minutes. Go refill your coffee, stretch your legs, and come back later.

Creating the Elastic Beanstalk application will likely throw an error because the SECRET_KEY_BASE variable is missing. So as a final step, set up this environment variable:

$ eb setenv SECRET_KEY_BASE=$(rails secret)
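In case you are curious what kind of value ends up in SECRET_KEY_BASE: as far as we know, rails secret produces a long random hex string, which can be approximated in plain Ruby as sketched below. Treat this as an illustration of the value’s shape rather than a replacement for the command.

```ruby
require 'securerandom'

# 64 random bytes, hex-encoded, which gives a 128-character string.
secret = SecureRandom.hex(64)
puts secret.length
```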

Add contact list functionality

Use the Rails scaffold generator to create an interface for storing and viewing a simple directory of names and email addresses:

$ rails g scaffold contact name:string email:string
$ rake db:migrate

Edit config/routes.rb to set contacts#index as the root route:

root :to => 'contacts#index'

Note: In Rails 3 apps you can now delete public/index.html.

Commit the changes and redeploy to Elastic Beanstalk:

$ git add .
$ git commit -m "Added first model."
$ eb deploy

You should now be able to navigate to your app using eb open to view your list of contacts. Follow the “New Contact” link and create a few records.

Add caching to Rails

Memcache is an in-memory, distributed cache. Its primary API consists of two operations: SET(key, value) and GET(key). Memcache is like a hashmap (or dictionary) that is spread across multiple servers, where operations are still performed in constant time.

The most common use for Memcache is to cache the results of expensive database queries and HTML renders so that these expensive operations don’t need to happen over and over again.
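To make the “hashmap spread across multiple servers” idea concrete, here is a deliberately simplified sketch. The server names are made up, each server is simulated by a plain Ruby Hash, and the server is chosen with a checksum and a modulo. Real clients generally use more robust schemes such as consistent hashing, so take this as an illustration of the principle only.

```ruby
require 'zlib'

# Hypothetical server names, for illustration only.
SERVERS = ['cache-a:11211', 'cache-b:11211', 'cache-c:11211'].freeze

# One Hash per server stands in for that server's memory.
STORES = SERVERS.map { |server| [server, {}] }.to_h

# Pick a server from the key alone: a checksum and a modulo, no searching.
def server_for(key)
  SERVERS[Zlib.crc32(key) % SERVERS.size]
end

def set(key, value)
  STORES[server_for(key)][key] = value
end

def get(key)
  STORES[server_for(key)][key]
end

set('Contact.all', ['Alice', 'Bob'])
p get('Contact.all')
p get('missing')
```

Because the key alone determines the server, both SET and GET stay constant-time no matter how many servers there are.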

Set up Memcache

To use Memcache in Rails, you first need to provision an actual Memcached cache. You can easily get one for free from MemCachier. MemCachier provides easy-to-use, performant caches that are compatible with the popular memcached protocol. It allows you to just use a cache without having to set up and maintain actual Memcached servers yourself.

There are three config variables you’ll need for your application to be able to connect to your cache: MEMCACHIER_SERVERS, MEMCACHIER_USERNAME, and MEMCACHIER_PASSWORD. You’ll need to add these variables to EB.

$ eb setenv MEMCACHIER_USERNAME=<username> MEMCACHIER_PASSWORD=<password> MEMCACHIER_SERVERS=<servers>

We can confirm that they’ve been set by running:

$ eb printenv

To interact with your cache, modify your Gemfile to include dalli, a Memcache client library:

gem 'dalli'

Then run

$ bundle install --without production

to install the added gems and update your Gemfile.lock file.

Now configure the default Rails caching to use the cache store provided by dalli by modifying config/environments/production.rb to include:

config.cache_store = :mem_cache_store,
                    (ENV["MEMCACHIER_SERVERS"] || "").split(","),
                    {:username => ENV["MEMCACHIER_USERNAME"],
                     :password => ENV["MEMCACHIER_PASSWORD"],
                     :failover => true,
                     :socket_timeout => 1.5,
                     :socket_failure_delay => 0.2,
                     :down_retry_delay => 60
                    }
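The expression that builds the server list is worth a closer look. The sketch below wraps it in a helper (memcache_servers is a name we made up) to show that a comma-separated value becomes an array of servers, while a missing variable yields an empty list instead of raising an error. The host names are placeholders.

```ruby
# Hypothetical helper mirroring the expression used in the configuration.
def memcache_servers(value)
  (value || "").split(",")
end

p memcache_servers("a.example.com:11211,b.example.com:11211")
p memcache_servers(nil)
```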

To make it easier to see how this example works, temporarily turn off Rails’ built-in caching (we will re-enable it later):

config.action_controller.perform_caching = false

Cache expensive operations

Memcache is often used to cache expensive operations such as computations, external API calls, or database queries. This simple example doesn’t include any expensive operations, but for the sake of learning, let’s assume that getting all contacts from the database is an expensive query.

The code in your ContactsController looks something like this:

def index
  @contacts = Contact.all
end

Every time /contacts is requested, the index method will execute and a database query to fetch all of the records in the contacts table is run. Let’s cache the results of Contact.all so that a database query isn’t run every time this page is visited.

The Rails.cache.fetch method takes a key argument and a block. If the key is present, then the corresponding value is returned. If not, the block is executed and the value is stored with the given key, then returned.
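The fetch semantics can be sketched in a few lines of plain Ruby. TinyCache below is a hypothetical stand-in for Rails.cache that we wrote for illustration; it keeps everything in a local Hash and counts hits and misses so the behavior is easy to observe.

```ruby
# TinyCache is a hypothetical stand-in for Rails.cache, for illustration only.
class TinyCache
  attr_reader :hits, :misses

  def initialize
    @data = {}
    @hits = 0
    @misses = 0
  end

  # Return the cached value if the key is present; otherwise run the block,
  # store its result under the key, and return it.
  def fetch(key)
    if @data.key?(key)
      @hits += 1
      @data[key]
    else
      @misses += 1
      @data[key] = yield
    end
  end
end

cache = TinyCache.new
queries = 0
2.times { cache.fetch('Contact.all') { queries += 1; ['Alice'] } }
puts "queries=#{queries} hits=#{cache.hits} misses=#{cache.misses}"
```

The block, which plays the role of the expensive query, runs only on the first call; the second call is served from the cache.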

In app/models/contact.rb, add the following method to the Contact class:

def self.all_cached
  Rails.cache.fetch('Contact.all') { all.to_a }
end

In app/controllers/contacts_controller.rb change

@contacts = Contact.all

to

@contacts = Contact.all_cached

Note that we cache all.to_a instead of all. Since Rails 4, Model.all is evaluated lazily, so you need to convert Contact.all into an array with to_a in order to cache the actual contacts rather than an unexecuted query.

Let’s also display some statistics on the index page. Add the following line to the index method in app/controllers/contacts_controller.rb:

@stats = Rails.cache.stats.first.last

And add the following markup to the bottom of app/views/contacts/index.html.erb:

<h1>Cache Stats</h1>

<table>
  <tr>
    <th>Metric</th>
    <th>Value</th>
  </tr>
  <tr>
    <td>Cache hits:</td>
    <td><%= @stats['get_hits'] %></td>
  </tr>
  <tr>
    <td>Cache misses:</td>
    <td><%= @stats['get_misses'] %></td>
  </tr>
  <tr>
    <td>Cache flushes:</td>
    <td><%= @stats['cmd_flush'] %></td>
  </tr>
</table>

Commit the results and deploy to Elastic Beanstalk.

$ git commit -am "Add caching."
$ eb deploy

Refresh the /contacts page and you’ll see “Cache misses: 1”. This is because you attempted to fetch the 'Contact.all' key, but it wasn’t present. Refresh again and you’ll now see “Cache hits: 1”. This time the 'Contact.all' key was present because it was stored during your previous request.

You can see this effect again if you flush the cache in your MemCachier dashboard.

Expiring the cache

Now that Contact.all is cached, what happens when that table changes? Try adding a new contact and returning to the listing page. You’ll see that your new contact isn’t displayed. Since Contact.all is cached, the old value is still being served. You need a way of expiring cache values when something changes. This can be accomplished with callbacks in the Contact model.

Add the following code to the Contact class in app/models/contact.rb:

class Contact < ApplicationRecord
  after_save    :expire_contact_all_cache
  after_destroy :expire_contact_all_cache

  def expire_contact_all_cache
    Rails.cache.delete('Contact.all')
  end

  #...

end

Commit these changes and deploy to Elastic Beanstalk:

$ git commit -am "Expire cache."
$ eb deploy

Now you can see that every time you save (create or update) or destroy a contact, the Contact.all cache key is deleted. Every time you make one of these changes and return to /contacts, you should see the “Cache misses” count get incremented by 1.
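The stale-read problem and its fix can also be reproduced without Rails. In the sketch below, TABLE and CACHE are plain Ruby stand-ins we invented for the contacts table and the Memcache store, and the expire flag decides whether a write deletes the cached key, which is the job the model callback does in the real app.

```ruby
# Plain-Ruby stand-ins (assumptions): TABLE plays the contacts table and
# CACHE plays the Memcache store.
TABLE = ['Alice']
CACHE = {}

def all_cached
  CACHE.fetch('Contact.all') { CACHE['Contact.all'] = TABLE.dup }
end

def add_contact(name, expire:)
  TABLE << name
  CACHE.delete('Contact.all') if expire # what the callback does
end

all_cached                         # warms the cache
add_contact('Bob', expire: false)
stale = all_cached                 # served from the cache, Bob is missing
add_contact('Carol', expire: true)
fresh = all_cached                 # key was deleted, so it is rebuilt
p stale, fresh
```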

Built-in Rails caching

The examples above explain how to fetch and expire caches explicitly. Conveniently, Rails builds in much of this functionality for you. By setting

config.action_controller.perform_caching = true

in config/environments/production.rb, you enable fragment, action, and page caching in Rails.

Here we just briefly introduce these caching techniques. For more details and other techniques such as Russian doll caching, please refer to the Rails Guide on Caching.

Fragment caching

Pages in Rails are generally built from various components. These components can be cached with fragment caching so they do not need to be rebuilt each time the page is requested.

Our /contacts page for example is built from contact components, each showing the name, the email, and 3 actions (show, edit, and destroy). We can cache these fragments by adding the following to @contacts.each loop in app/views/contacts/index.html.erb:

# ...
<% @contacts.each do |contact| %>
  <% cache contact do %>
    # ...
  <% end %>
<% end %>
# ...
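Notice that we never expire these fragments explicitly. Roughly speaking, the cache key of a fragment is derived from the record, including its updated_at timestamp, so an updated contact is simply stored under a new key. The sketch below imitates this with a Struct; the key format is a simplification we made up (Rails also mixes in a digest of the template), and the timestamps are arbitrary.

```ruby
# Hypothetical stand-in for an ActiveRecord model, for illustration only.
Contact = Struct.new(:id, :name, :updated_at)

# Simplified key: Rails also mixes in a digest of the template.
def fragment_key(contact)
  "contacts/#{contact.id}-#{contact.updated_at.to_i}"
end

fragments = {}
render = ->(contact) { fragments[fragment_key(contact)] ||= "<td>#{contact.name}</td>" }

contact = Contact.new(1, 'Alice', Time.at(1_557_100_800))
render.call(contact)

# Updating the record bumps updated_at, so the next render uses a new key
# and the old fragment is simply never read again.
contact.name = 'Alicia'
contact.updated_at = Time.at(1_557_104_400)
puts render.call(contact)
```

The outdated fragment stays in the cache until Memcache evicts it, which is fine because nothing asks for that key anymore.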

Action caching

In addition to fragments, Rails can also cache the whole page with page and action caching. Page caching is more efficient, as it allows requests to bypass the Rails stack completely, but it does not work for pages with before filters, such as authentication. Action caching stores objects and views similarly to page caching, but the response is served by the Rails stack.

To use action caching you need to add the actionpack-action_caching gem to your Gemfile and run bundle install:

gem 'actionpack-action_caching'

To cache the results of the show action, for example, add the following line in app/controllers/contacts_controller.rb:

class ContactsController < ApplicationController
  caches_action :show
  # ...
end

For proper expiration, add the following line in both the update and destroy methods in contacts_controller.rb:

def update
  expire_action :action => :show
  # ...
end

def destroy
  expire_action :action => :show
  # ...
end

Note that even if you use action caching, fragment caching remains important. If a page expires, fragment caching makes sure the whole page does not have to be rebuilt from scratch but can reuse already cached fragments. This technique is similar to Russian doll caching.

Other caching techniques

There are more ways to use caching in a Rails application, such as for session storage.

To use your cache for session storage, create (Rails 5) or edit (Rails 3 and 4) the file config/initializers/session_store.rb to contain:

# Be sure to restart your server when you modify this file.
Rails.application.config.session_store :cache_store, key: '_memcache-example_session'

Clean up

Once you’re done with this tutorial and don’t want to use it anymore, you can clean up your EB instance by using:

$ eb terminate

This will clean up all of the AWS resources.