Deploy a Flask application on AWS Elastic Beanstalk and scale it with Memcache

Want to deploy a Flask application on Elastic Beanstalk that is ready to scale? We’ll explore how to set up your Elastic Beanstalk environment, hook it up to a database, deploy your application, and finally use Memcache to speed it up.

Memcache is a technology that improves the performance and scalability of web apps and mobile app backends. You should consider using Memcache when your pages are loading too slowly or your app is having scalability issues. Even for small sites, Memcache can make page loads snappy and help future-proof your app.

The sample app in this guide can be found here.

Prerequisites

Before you complete the steps in this guide, make sure you have all of the following:

  • Familiarity with Python (and ideally some Flask)
  • An AWS account. If you haven’t used AWS before, you can set up an account here.
  • The AWS CLI installed and configured on your computer.
  • Python, git, and the EB CLI installed on your computer.

Required Versions:

  • Python 3.6
  • Pip 18.0

Since Elastic Beanstalk has specific requirements, if you’re running a different version of Python on your machine, consider using a tool like pyenv.

Create a Flask application for Elastic Beanstalk

Flask is a minimalist framework that doesn’t require an application skeleton. Simply create a Python virtual environment and install Flask like so:

Now that we’ve installed the Flask framework, we can add our app code. Let’s create a task list that allows you to add and remove tasks.

Flask is very flexible in the way you structure your application. Let’s add a minimal skeleton to get started. First, create an app in task_list/__init__.py:

This small sample app will not use the SECRET_KEY, but it’s always a good idea to configure it. Larger projects almost always use it, and it is used by many Flask addons.

Next we’ll create an application.py that runs the task_list app.

Elastic Beanstalk looks specifically for a file named application.py, containing an app labeled application. There are ways to change that but we won’t get into that right now. You can read more about it here.

We also need to set the FLASK_APP environment variable to let Flask know where to find the application. For local development, set all required environment variables in a .env file:

FLASK_APP=application.py
FLASK_ENV=development

To make sure Flask picks up the variables defined in the .env file, install python-dotenv:

Now you can run the app with flask run and visit it at http://127.0.0.1:5000/, but the app doesn’t do anything yet.

Create an Elastic Beanstalk app

Associate your Flask skeleton with a new Elastic Beanstalk app with the following steps:

  1. Run pip freeze and save its output to a file named requirements.txt.

    This file is required in order for Elastic Beanstalk to know what to install during deployment.

  2. Create an .ebextensions folder and add an options.config file:

    We’ll include our environment variables in .ebextensions/options.config:

  3. Initialize a Git repository and commit the skeleton. Start by adding a .gitignore file to make sure you don’t commit files you don’t want to. Paste the following into it:

    venv/
    .env
    
    *.pyc
    __pycache__/
    
    instance/

    Now commit all files to the Git repository:

  4. Create an Elastic Beanstalk repo:

    This will set up a new application called flask-memcache. Then we’ll create an environment to run our application in:

    Notice that we’re adding a MySQL database to our EB environment. You’ll be prompted for a username and password for the database. You can set them to whatever you like.

    Be careful when choosing your password. AWS does not handle symbols very well (! $ @ etc.), and can cause some unexpected behavior. Stick to letters and numbers, and make sure it’s at least eight characters long.

    This will create an AWS Relational Database Service (RDS) instance that is associated with this application. When you terminate this application, the database instance will be destroyed as well. If you need an RDS instance that is independent of your Elastic Beanstalk application, create one via the AWS RDS interface.

    This configuration process will take about five minutes. Go refill your coffee, stretch your legs, and come back later.

We now have an EB environment, but our Flask app is not ready to be deployed to EB yet. We will make a few necessary changes later, but first let’s implement some task list functionality.

Add task list functionality

Let’s add a task list to the app that enables users to view, add, and delete tasks. To accomplish this, we need to:

  • Set up the database
  • Create a Task model
  • Create the view and controller logic

Set up a MySQL database

Since our EB environment already has a MySQL database initialized, we’ll need to add a way to connect to it through our app.

While you may want to use PostgreSQL, there is currently a bug in the EB CLI that prevents you from creating a PostgreSQL instance in us-east-1. We currently have an open ticket, and if anything changes, we’ll update this post.

To use our database, we need a few libraries to manage our database connection, models, and migrations:

Don’t forget to freeze your new requirements.

Now we can configure our database in task_list/__init__.py:
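As a minimal sketch of the URI selection, assuming the PyMySQL driver and a SQLite file under instance/ (the function name database_uri, the driver scheme, and the SQLite file name are illustrative assumptions, not taken from the sample app), the logic could look roughly like this:

```python
import os
from urllib.parse import quote


def database_uri(env=os.environ, sqlite_path="instance/task_list.sqlite"):
    """Return a SQLAlchemy URL built from RDS_* variables, or a SQLite fallback."""
    required = ("RDS_HOSTNAME", "RDS_PORT", "RDS_DB_NAME",
                "RDS_USERNAME", "RDS_PASSWORD")
    if all(name in env for name in required):
        # Assumption: PyMySQL is the installed driver; change the scheme otherwise.
        # Credentials are percent-encoded so symbols cannot break the URL.
        return "mysql+pymysql://{user}:{password}@{host}:{port}/{name}".format(
            user=quote(env["RDS_USERNAME"], safe=""),
            password=quote(env["RDS_PASSWORD"], safe=""),
            host=env["RDS_HOSTNAME"],
            port=env["RDS_PORT"],
            name=env["RDS_DB_NAME"],
        )
    # No RDS variables present: use a local SQLite file (hypothetical path).
    return "sqlite:///" + sqlite_path
```

In task_list/__init__.py, the returned string would be assigned to app.config['SQLALCHEMY_DATABASE_URI'].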

The configuration in task_list/__init__.py creates a db object that is accessible throughout your Flask app. The database is configured via SQLALCHEMY_DATABASE_URI. When the RDS_* environment variables are present (Elastic Beanstalk sets them automatically whenever a database instance is created), the URI is built from them; otherwise, it falls back to a local SQLite database. If you want to run the application locally using the SQLite database, you need to create an instance folder:

The database is now ready to use. Save the changes with:

Note that the snippet above imports database models with from . import models. However, the app doesn’t have any models yet. Let’s change that.

Create the Task model

To create and store tasks, we need to do three things:

  1. Create the Task model in task_list/models.py:

    This gives us a task table with two columns: id and name.

  2. Initialize the database and create migrations:

    The new migration can be found in migrations/versions/e3a0124d6fe7_task_table.py (your filename’s prefix will differ).

  3. Apply the migration to your database:

    In order for EB to run the migrations upon deployment, we’ll have to include another config file telling it to do so.

    Inside .ebextensions/task_list.config include:

    To apply the changes locally, you’ll need to run flask db upgrade from your terminal.

Save your changes so far:

Create the task list application

The actual application consists of a view that is displayed in the front end and a controller that implements the functionality in the back end. Flask facilitates the organization of back-end controllers via blueprints that are registered in the main application.

Create a controller blueprint in task_list/task_list.py:

This controller contains all functionality to:

  • GET all tasks and render the task_list view
  • POST a new task that will then be saved to the database
  • Delete existing tasks

Register the blueprint in task_list/__init__.py:

With the controller set up, we can now add the front end. Flask uses the Jinja templating language, which allows you to add Python-like control flow statements inside {% %} delimiters. For our task list view, we first create a base layout that includes boilerplate code for all views. We then create a template specific to the task list.

  1. Create a base layout in task_list/templates/base.html:

  2. Create a view that extends the base layout in task_list/templates/task_list/index.html:

    The view consists of two cards: one that contains a form to create new tasks, and another that contains a table with existing tasks and a delete button associated with each task.

Our task list is now functional. Save the changes so far with:

We are now ready to configure the app to deploy on EB.

Deploy the task list app on Elastic Beanstalk

Deploying the Flask application on EB is easily done by running the deploy command:

You can now open the application and see if it’s working:

Test the application by adding a few tasks. We now have a functioning task list running on Elastic Beanstalk. With this complete, we can learn how to improve its performance with Memcache.

If you get a 500 error when you open the application, check the logs. They’re located in the EB console in the side menu labeled Logs. Also check your environment variables to make sure they’re set correctly.

Add caching to Flask

Memcache is an in-memory, distributed cache. Its primary API consists of two operations: SET(key, value) and GET(key). Memcache is like a hashmap (or dictionary) that is spread across multiple servers, where operations are still performed in constant time.
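To make the hashmap analogy concrete, here is a toy sketch of a key-value store spread over several "servers". The names pick_server and ToyMemcache are hypothetical, each server is just a dictionary in the same process, and the modulo scheme is a simplification (real clients typically use consistent hashing so that adding a server moves fewer keys).

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one server using a stable hash (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


class ToyMemcache:
    """In-process stand-in: one dict per 'server', not a real network client."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.stores = {name: {} for name in self.servers}

    def set(self, key, value):
        # One hash computation plus one dict write: constant time.
        self.stores[pick_server(key, self.servers)][key] = value

    def get(self, key):
        # The same hash finds the same server, so lookups stay constant time.
        return self.stores[pick_server(key, self.servers)].get(key)
```

Because the server is derived from the key alone, any client that knows the server list can find a value without asking the other servers.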

The most common use for Memcache is to cache expensive database queries and HTML renders so that these expensive operations don’t need to happen over and over again.

Set up Memcache

To use Memcache in Flask, you first need to provision an actual Memcache cache. MemCachier provides a fast and flexible multi-tenant cache system that’s compatible with the protocol used by the popular memcached software. When you create a cache with MemCachier, you’re provided with one or more endpoints that you can connect to using the memcached protocol, accessing your cache just as if you had set up your own memcached server. So head over to https://www.memcachier.com, sign up for an account, and create a free development cache. If you need help getting it set up, follow the directions here.

There are three config vars you’ll need for your application to be able to connect to your cache: MEMCACHIER_SERVERS, MEMCACHIER_USERNAME, and MEMCACHIER_PASSWORD. You can find these on your analytics dashboard. You’ll need to add these variables to EB.

We can confirm that they’ve been set by running:

You should see your MemCachier env variables, as well as all the previous env variables we’ve set.

Then we need to configure the appropriate dependencies. We will use Flask-Caching to use Memcache within Flask.

Since EB does not play nicely with pylibmc, we’ll also need to upgrade pip and install libmemcached using .ebextensions config files.

Inside .ebextensions/upgrade_pip.config, include:

We’ll also need to add the following to .ebextensions/task_list.config:

Now we can configure Memcache for Flask in task_list/__init__.py:
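A minimal sketch of how the Flask-Caching settings could be assembled from the three MemCachier variables follows. The helper name memcachier_cache_config and the fallback to an in-process cache when no servers are configured are assumptions for illustration, and the backend names follow older Flask-Caching releases, so check them against the version you install.

```python
import os


def memcachier_cache_config(env=os.environ):
    """Build a Flask-Caching config dict from MEMCACHIER_* variables."""
    servers = env.get("MEMCACHIER_SERVERS")
    if not servers:
        # Assumption: fall back to a simple in-process cache for local development.
        return {"CACHE_TYPE": "simple"}
    return {
        # Assumption: SASL-authenticated memcached backend (requires pylibmc).
        "CACHE_TYPE": "saslmemcached",
        # MEMCACHIER_SERVERS may list several endpoints separated by commas.
        "CACHE_MEMCACHED_SERVERS": servers.split(","),
        "CACHE_MEMCACHED_USERNAME": env.get("MEMCACHIER_USERNAME"),
        "CACHE_MEMCACHED_PASSWORD": env.get("MEMCACHIER_PASSWORD"),
    }
```

The resulting dictionary would be passed as the config argument when initializing the cache, for example cache.init_app(app, config=memcachier_cache_config()).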

This configures Flask-Caching with MemCachier, which allows you to use your Memcache in a few different ways:

  • Directly access the cache via get, set, delete, and so on
  • Cache results of functions with the memoize decorator
  • Cache entire views with the cached decorator
  • Cache Jinja2 snippets

Cache expensive database queries

Memcache is often used to cache expensive database queries. This simple example doesn’t include any expensive queries, but for the sake of learning, let’s assume that getting all tasks from the database is an expensive operation.

To cache the Task query (tasks = Task.query.all()), we change the controller logic in task_list/task_list.py like so:
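The pattern is to look in the cache first and query the database only on a miss. The sketch below shows that logic in isolation, assuming a cache object with get and set methods like the one Flask-Caching provides; DictCache and load_tasks are hypothetical names, and query_all stands in for Task.query.all().

```python
class DictCache:
    """Hypothetical in-memory stand-in exposing get/set/delete."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def load_tasks(cache, query_all):
    """Return the task list, hitting the database only on a cache miss."""
    tasks = cache.get("all_tasks")
    if tasks is None:
        tasks = query_all()            # stands in for Task.query.all()
        cache.set("all_tasks", tasks)  # remember the result for the next request
    return tasks
```

In the controller, the same three steps (get, query on miss, set) would wrap the existing query, using the shared cache object instead of DictCache.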

Deploy and test this new functionality:

To see what’s going on in your cache, go back to your MemCachier dashboard. The first time you loaded your task list, you should have gotten an increase for the get miss and set commands. Every subsequent reload of the task list should increase get hits (refresh the stats in the dashboard).

Our cache is working, but there is still a major problem. Add a new task and see what happens. No new task appears on the current tasks list! The new task was created in the database, but the app is serving the stale task list from the cache.

Clear stale data

When caching data, it’s important to invalidate that data when the cache becomes stale. In our example, the cached task list becomes stale whenever a new task is added or an existing task is removed. We need to make sure our cache is invalidated whenever one of these two actions is performed.

We achieve this by deleting the all_tasks key whenever we create or delete a task in task_list/task_list.py:
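The idea can be sketched without Flask: every write removes the all_tasks entry so that the next read rebuilds it. TaskStore is a hypothetical stand-in that keeps the "database" and the cache in plain dictionaries; in the real controller the invalidation would be a call such as cache.delete('all_tasks') after the database commit.

```python
class TaskStore:
    """Hypothetical stand-in: a dict as the database and a dict as the cache."""

    def __init__(self):
        self.rows = {}
        self.cache = {}
        self.next_id = 1

    def all_tasks(self):
        # Read path: rebuild the cached list only when the key is missing.
        if "all_tasks" not in self.cache:
            self.cache["all_tasks"] = sorted(self.rows.items())
        return self.cache["all_tasks"]

    def add(self, name):
        self.rows[self.next_id] = name
        self.next_id += 1
        self.cache.pop("all_tasks", None)  # invalidate after creating a task

    def remove(self, task_id):
        self.rows.pop(task_id, None)
        self.cache.pop("all_tasks", None)  # invalidate after deleting a task
```

Deleting the key, rather than updating the cached list in place, keeps the write path simple: the next reader repopulates the cache from the database.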

Deploy the fixed task list:

Now when you add a new task, all the tasks you’ve added since implementing caching will appear.

Use the Memoization decorator

Our caching strategy above (try to obtain a cached value and add a new value to the cache if it’s missing) is so common that Flask-Caching has a decorator for it called memoize. Let’s change the caching code for our database query to use the memoize decorator.

First, we put the task query into its own function called get_all_tasks and decorate it with the memoize decorator. We always call this function to get all tasks.

Second, we replace the deletion of stale data with cache.delete_memoized(get_all_tasks).
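To show what these two steps amount to, here is a toy version of the pattern in plain Python. It is not Flask-Caching's implementation: the memoize and delete_memoized functions below are simplified stand-ins that use a dictionary as the cache, and database is a hypothetical substitute for the task table.

```python
import functools


def memoize(store):
    """Toy decorator: cache results in `store`, keyed by function name and arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = (func.__name__, args)
            if key not in store:
                store[key] = func(*args)  # cache miss: compute and remember
            return store[key]
        return wrapper
    return decorator


def delete_memoized(store, func):
    """Drop every cached result that belongs to `func`."""
    for key in [k for k in store if k[0] == func.__name__]:
        del store[key]


store = {}
database = ["write docs"]  # hypothetical stand-in for the task table


@memoize(store)
def get_all_tasks():
    return list(database)
```

In the app itself, the equivalent calls are @cache.memoize() on get_all_tasks and cache.delete_memoized(get_all_tasks) wherever a task is created or deleted.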

After making these changes, task_list/task_list.py should look as follows:

Deploy the memoized cache list and make sure the functionality has not changed:

Because the get_all_tasks function doesn’t take any arguments, you can also decorate it with @cache.cached(key_prefix='get_all_tasks') instead of @cache.memoize(). This is slightly more efficient.

Cache Jinja2 snippets

With the help of Flask-Caching, you can cache Jinja snippets in Flask. This is similar to fragment caching in Ruby on Rails, or caching rendered partials in Laravel. If you have complex Jinja snippets in your application, it’s a good idea to cache them, because rendering HTML can be a CPU-intensive task.

Do not cache snippets that include forms with CSRF tokens.

To cache a rendered set of task entries, we use a {% cache timeout key %} statement in task_list/templates/task_list/index.html:

Here the timeout is None and the key is a list of strings that will be concatenated. As long as task IDs are never reused, this is all there is to caching rendered snippets. The MySQL database we use on EB does not reuse IDs, so we’re all set.

If you use a database that does reuse IDs (such as SQLite), you need to delete the fragment when its respective task is deleted. You can do this by adding the following code to the task deletion logic:

Let’s see the effect of caching the Jinja snippets in our application:

You should now observe an additional get hit for each task in your list whenever you reload the page (except the first reload).

Cache entire views

We can go one step further and cache entire views instead of snippets. This should be done with care, because it can result in unintended side effects if a view frequently changes or contains forms for user input. In our task list example, both of these conditions are true because the task list changes each time a task is added or deleted, and the view contains forms to add and delete a task.

You can cache the task list view with the @cache.cached() decorator in task_list/task_list.py:

The @cache.cached() decorator must be directly above the definition of the index() function (i.e., below the @bp.route() decorator).

Since we only want to cache the result of the index() function when we GET the view, we exclude the POST request with the unless parameter. We could also have separated the GET and POST routes into two different functions.

Because the view changes whenever we add or remove a task, we need to delete the cached view whenever this happens. By default, the @cache.cached() decorator uses a key of the form 'view/' + request.path, which in our case is 'view//'. Delete this key in the create and delete logic in task_list/task_list.py just after deleting the cached query:
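The double slash in 'view//' is easy to mistype, so it can help to derive the key instead of hard-coding it. A small sketch, assuming the default key prefix is left unchanged (default_view_cache_key is a hypothetical helper, not part of Flask-Caching):

```python
def default_view_cache_key(path):
    """Build the key described above: 'view/' followed by the request path."""
    return "view/" + path
```

For the task list served at the root path, default_view_cache_key('/') yields the 'view//' key that has to be deleted after each create or delete.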

To see the effect of view caching, deploy your application:

On the first refresh, you should see the get hit counter increase according to the number of tasks you have, as well as an additional get miss and set, which correspond to the view that is now cached. Any subsequent reload will increase the get hit counter by just one, because the entire view is retrieved with a single get command.

Note that view caching does not obsolete the caching of expensive operations or Jinja snippets. It is good practice to cache smaller operations within cached larger operations, or smaller Jinja snippets within larger Jinja snippets. This technique (called Russian doll caching) helps with performance if a larger operation, snippet, or view is removed from the cache, because the building blocks do not have to be recreated from scratch.

Using Memcache for session storage

Memcache works well for storing information for short-lived sessions that time out. However, because Memcache is a cache and therefore not persistent, long-lived sessions are better suited to permanent storage options, such as your database.

To store sessions in Memcache, you need Flask-Session:

Then, configure Flask-Session in task_list/__init__.py:

Our task list app does not have any use for sessions but you can now use sessions in your app like so:

Clean up

Once you’re done with this tutorial and don’t want to use it anymore, you can clean up your EB instance by using:

This will clean up all of the AWS resources.

Further reading & resources