MemCachier with Django, Docker and AWS Elastic Container Service
In this tutorial we’re going to go a little beyond the usual “getting started” Django tutorials and look at some of the things you might want to do to deploy applications to production.
We’re going to do the following:
- Start with a very simple demonstration Django application. This will be an image thumbnailer, where you give an image URL and a desired thumbnail size and the application downloads the image from the given URL, generates a thumbnail and returns the result to you as an image.
- Dockerize our application, using Gunicorn as a production-ready web server.
- Deploy the application to Amazon’s Elastic Container Service, which will give us load balancing and container health checking and resilience.
- Add a MemCachier cache to our application to speed it up. (This will involve thinking about managing secrets in ECS, as well as showing how deploying new versions of containers works there.)
The MemCachier content of this tutorial is a relatively small fraction of what we’re going to do, but the idea is to demonstrate a more realistic use case than the usual toy tutorials (our demonstration application is still a toy, but it’s a toy that definitely benefits from caching). There are a number of things you have to consider when doing this sort of deployment that aren’t too obvious, and it’s useful to see the whole setup process from beginning to end.
Prerequisites
We’ll be working in Linux, although the distribution you use shouldn’t matter too much (I use Arch Linux, but Ubuntu or Debian or more or less any other “normal” distribution should be fine too).
You’ll need to have the following things installed (follow the links for installation instructions):
- Some standard things: Python 3, Git, a web browser;
- Docker;
- NPM, Node.js, Firefox and a Java runtime (for some optional performance testing we’re going to do).
You’ll also need an AWS account. If you’ve not used AWS before, you can set up an account here. You get a pile of free resources to use for the first year after sign-up, and if you follow along with what we do in this tutorial, you shouldn’t get charged for anything (if you already have an AWS account, you might get charged a little for the load balancer we’re going to use).
All the code for the tutorial is in a GitHub repository here. Clone the repo and we can get started!
Part 1: A simple Django application
There are three versions of our Django application at the top level of
the tutorial repository. In Part 1, we’ll work on the basic
development version of the application, found in the
thumbnailer-basic
directory.
Setting up and a quick look at our application
We’ll manage Python dependencies with a virtual environment. To install the necessary Python dependencies, do the following:
$ git clone git@github.com:memcachier/django-docker-ecs-tutorial.git
$ cd django-docker-ecs-tutorial/thumbnailer-basic
$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
Once this is done, you can run the application using Django’s development server:
$ cd thumbnailer
$ python manage.py runserver
If you then point your browser at http://localhost:8000
, you’ll see
the UI for our application. Paste in the URL to an image somewhere on
the internet, select a thumbnail size, press the “Thumbnail it!”
button and you’ll get a result page with some information about the
image and a thumbnail.
Application structure
The contents of the thumbnailer-basic
directory are a normal Django
application (plus a couple of extras for deployment that we’ll talk
about later). Here’s what we have (things in parentheses are standard
Django things we won’t talk about):
.
├── requirements.txt                  Python dependencies
└── thumbnailer                       Django project directory
    ├── main                          Single Django app
    │   ├── (apps.py)
    │   ├── forms.py                  URL+size form definition
    │   ├── (__init__.py)
    │   ├── templates
    │   │   └── main
    │   │       ├── index.html        Input form template
    │   │       └── result.html       Thumbnail result template
    │   ├── utils.py                  Thumbnailing worker function
    │   └── views.py                  Single input+result view
    ├── (manage.py)
    └── thumbnailer
        ├── (__init__.py)
        ├── (settings.py)
        ├── (urls.py)
        └── (wsgi.py)
We have a top-level thumbnailer
directory, containing global
configuration in the thumbnailer
sub-directory, and a single main
Django app that does all the work.
There are a couple of things that most Django applications have that we don’t need here: we don’t have any models, so we don’t need a database, and we don’t use any authorization. That means that our settings file is simplified a little. For the purposes of this tutorial, we’re going to be running everything in debug mode, although you obviously shouldn’t do that in production for real applications.
Our application has a single URL (“/
”) and a single class-based form
view to implement it. Here’s what’s in views.py
:
from django.http import HttpResponseBadRequest
from django.shortcuts import render
from django.views.generic.edit import FormView
import requests
import time
from .forms import ThumbnailForm
from .utils import make_thumbnail
class ThumbnailView(FormView):
    template_name = 'main/index.html'
    form_class = ThumbnailForm
    success_url = '/'

    def form_valid(self, form):
        start = time.time()
        try:
            # Extract original image URL and request thumbnail size
            # from form data.
            url = form.cleaned_data['image_url']
            sz = int(form.cleaned_data['thumbnail_size'])

            # Generate thumbnail data URL.
            data, orig_size = make_thumbnail(url, sz)

            # Render result view.
            duration = time.time() - start
            return render(self.request, 'main/result.html',
                          {'url': url, 'orig_size': orig_size,
                           'duration': int(duration * 1000),
                           'thumbnail_data': data})
        except requests.exceptions.RequestException:
            return HttpResponseBadRequest("<p>Can't open image URL!</p>")
        except IOError:
            return HttpResponseBadRequest("<p>Can't process input image!</p>")
This view renders a form with inputs for a URL and a thumbnail size:
from django import forms
class ThumbnailForm(forms.Form):
    image_url = forms.URLField(label='Image URL')
    thumbnail_size = forms.ChoiceField(
        label='Thumbnail size (largest dim.)',
        choices=[(64, '64'), (128, '128'), (256, '256')])
When valid form input is POSTed, we pass the URL and thumbnail size to
a worker function (defined in utils.py
), then render the result
using the main/result.html
template (which, as well as the thumbnail
image, shows some other information, including the request processing
time).
To keep things as simple as possible, the resulting thumbnail image is
rendered using a data URI for its image content. The
function that does the work of generating the thumbnail downloads the
original image from the supplied image URL, creating a Pillow image
from it, then uses the Image.thumbnail
method to resize it. The
generated thumbnail image is then extracted as a PNG and
Base64-encoded to make a data URI:
import requests
import base64
from PIL import Image
from io import BytesIO
def make_thumbnail(url, sz):
    # Download original image URL.
    r = requests.get(url)

    # Create PIL image from in-memory stream.
    img = Image.open(BytesIO(r.content))

    # Generate our thumbnail image.
    orig_size = img.size
    img.thumbnail((sz, sz))

    # Using another in-memory stream...
    with BytesIO() as fp:
        # Extract the image as a PNG.
        img.save(fp, format='PNG')

        # Base-64 encode the image data (and convert to a Python
        # string with "decode").
        b64 = base64.b64encode(fp.getbuffer()).decode()

    # Make the final data URI, showing that it's Base-64 encoded
    # PNG data.
    data = 'data:image/png;base64,' + b64
    return data, orig_size
All of the image processing is done in memory, using BytesIO
streams
to move data around. The request to the original image URL is done
synchronously using the requests
package, which means that the
server thread servicing this request will be blocked while the image
is downloaded. The time taken to download the original image is the
main contribution to the request service time, which is displayed in
the results page when the thumbnail is rendered. With my internet
connection, for most smallish images, the total request service time
varies between about 500 ms and about 1500 ms, depending on how my
internet connection is behaving and depending on how fast the server
hosting the original image is. This sort of thing is a good candidate
for caching, as we’ll see later!
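To make the caching opportunity concrete, here’s a minimal sketch of memoizing thumbnail results per process. The slow_thumbnail function is a hypothetical stand-in for make_thumbnail (it just simulates a slow download rather than fetching anything); a shared cache like the MemCachier one we add later extends the same idea across processes and servers.

```python
import time
from functools import lru_cache

# Hypothetical stand-in for make_thumbnail: simulate a slow image
# download so the effect of caching is visible.
def slow_thumbnail(url, sz):
    time.sleep(0.05)  # pretend this is the network fetch
    return 'data:image/png;base64,...', (640, 480)

# Memoize results per (url, sz) within this process; a shared cache
# such as MemCachier plays the same role across servers.
@lru_cache(maxsize=128)
def cached_thumbnail(url, sz):
    return slow_thumbnail(url, sz)

def timed(url, sz):
    start = time.time()
    result = cached_thumbnail(url, sz)
    return result, time.time() - start

result1, cold = timed('http://example.com/a.png', 128)  # pays the fetch cost
result2, warm = timed('http://example.com/a.png', 128)  # served from cache
```

The second call returns the identical result almost instantly, which is exactly the win we’re after for repeated thumbnail requests.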
Performance measurement
Later on, we’re going to want to see how much of a difference caching
makes to the performance of our thumbnailing application. I’ve written
some Javascript code to do this, using webdriver.io
, which
is a test framework that uses Selenium to drive web
browsers. It’s normally used for end-to-end testing of web
applications, but it can be slightly abused for this kind of
benchmarking too. It’s just about the simplest way to use Selenium
that I’ve found. The code is in the benchmarking
directory in the
repository.
The benchmarking code is a standard Node.js application. You can
install the Javascript dependencies by doing npm install
in the
benchmarking
directory, then the benchmarks are run by doing npm test
(optionally giving a URL to connect to in the TEST_URL
environment variable, which defaults to http://localhost:8000
to
connect to the Django development server).
When the benchmarks are run, they perform 100 thumbnail requests via
automation of 10 instances of Firefox, choosing image URLs randomly
from a small list of online images. The results are written to a
results.json
file, and contain information about the image URL and
size requested, the server that processed the request and the time
taken both on the server and in total for the request.
We’ll use this code to compare performance of the non-caching and caching code deployed to ECS later on.
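A few lines of Python are enough to summarise a results file like this. The record fields below are illustrative (the exact field names in results.json may differ from this sketch):

```python
import json
import statistics

# Hypothetical benchmark records in roughly the shape described above.
results = [
    {'url': 'http://example.com/a.png', 'size': 128,
     'server': '10.0.0.12', 'total_ms': 910},
    {'url': 'http://example.com/b.png', 'size': 64,
     'server': '10.0.0.7', 'total_ms': 1480},
    {'url': 'http://example.com/a.png', 'size': 128,
     'server': '10.0.0.12', 'total_ms': 210},
]

# Summarise total request times across all benchmark requests.
totals = sorted(r['total_ms'] for r in results)
summary = {
    'requests': len(totals),
    'median_ms': statistics.median(totals),
    'worst_ms': totals[-1],
}
print(json.dumps(summary))
```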
Part 2: Deploying to Amazon ECS
Getting our application working in a development setup is only half
the battle. We need to come up with a good way to deploy our
application on production infrastructure. That means thinking about
performance, resilience and monitoring. First of all, we shouldn’t use
Django’s development server for serving our application in production.
There are a number of better options, and we’re going to use Gunicorn,
a high-performance Python server suitable for production deployments.
Once we have our application working with Gunicorn, we’ll make some
small changes needed for operation behind an AWS load balancer
(basically just making sure that Django’s ALLOWED_HOSTS
variable is
set correctly). Then we’ll package the application up as a Docker
container (we’ll “dockerize” it). This container can then be deployed
to Amazon ECS (although we’ll also demonstrate it running locally
first).
The code used in this part of the tutorial is in
thumbnailer-ecs-no-caching
directory of the tutorial repository.
Running with Gunicorn
Python web applications usually support an interface called WSGI (Web
Server Gateway Interface), which defines a standard interface for web
servers to serve web applications. Django supports WSGI and provides a
WSGI application stub in the wsgi.py
file it generates as part of
setting up a new project.
WSGI applications can be served by any web server that supports the WSGI interface: we’re going to use Gunicorn which is simple to set up and has pretty good performance.
Why not just use the Django development server in production anyway? Well, it’s single-threaded and mostly optimised for quick restarts when we change our code during the development process. A production web server will have multiple server threads (or processes, depending on exactly how it’s set up) and will be optimised just for serving HTTP requests as quickly as possible.
To run Gunicorn, we need to make sure it’s installed (using pip install
then putting the relevant versioned dependencies in the
requirements.txt
file), then run the gunicorn
executable with
command-line arguments to configure the server and pass it our WSGI
application.
The command we’re going to use is
gunicorn --bind=0.0.0.0:8000 \
--workers=3 --worker-class=gevent \
--access-logfile '-' --error-logfile '-' --capture-output \
thumbnailer.wsgi
which we’ll put into a startup script (start.sh
) in the
thumbnailer
directory and make it executable via chmod +x start.sh
.
The command-line arguments we use here do the following:
- bind: which address and port to listen on;
- workers: the number of worker processes to spawn;
- worker-class: the type of worker to use – see below;
- access-logfile, error-logfile, capture-output: arguments to force Gunicorn to log all errors and accesses to standard error and to pass Django output to the same place.
Finally, we pass the name of the Python module containing the WSGI application.
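The same options can also live in a Gunicorn configuration file instead of on the command line, which is easier to keep under version control. This is a hypothetical gunicorn.conf.py equivalent of the flags above (run with gunicorn -c gunicorn.conf.py thumbnailer.wsgi); the setting names are Gunicorn’s standard ones:

```python
# gunicorn.conf.py -- configuration-file form of the command-line
# flags used in start.sh.
bind = '0.0.0.0:8000'
workers = 3
worker_class = 'gevent'
accesslog = '-'        # access log to standard output
errorlog = '-'         # error log to standard error
capture_output = True  # forward the app's stdout/stderr into the error log
```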
The worker-class
argument specifies that we want to use asynchronous
worker processes based on the gevent
library. Since each request to
the thumbnailer triggers an HTTP download from the image URL we pass
in, requests can take arbitrarily long to process. As the Gunicorn
documentation says:
The default synchronous workers assume that your application is resource-bound in terms of CPU and network bandwidth. Generally this means that your application shouldn’t do anything that takes an undefined amount of time. An example of something that takes an undefined amount of time is a request to the internet. At some point the external network will fail in such a way that clients will pile up on your servers. So, in this sense, any web application which makes outgoing requests to APIs will benefit from an asynchronous worker.
(To use the gevent
worker type, we also need to install the gevent
and greenlet
packages and add them to requirements.txt
.)
Once this is all set up, Gunicorn can be run just by running the
start.sh
script. If you do this, you’ll be able to access the
thumbnailer at http://localhost:8000
as normal, and if you look at
the processes running (do something like ps -wweFH
and scroll back
through your terminal to look for the process running the start.sh
script), you’ll see that Gunicorn has started three worker processes
to service requests.
ALLOWED_HOSTS setup
The Django ALLOWED_HOSTS setting is
used to constrain the server names for which a Django application will
service requests, and prevents some kinds of attacks on our web
server. Running in development, we don’t need to worry about this
setting, but for production deployments, we need to set it correctly.
Normally, if you’re hosting a production web application at
http://www.my-cool-company.com
, you would just put
www.my-cool-company.com
in ALLOWED_HOSTS
and everything would be
fine. We don’t have a DNS domain to use for our deployment though, so
we need to do things a little differently. We’re going to be using an
AWS Elastic Load Balancer (ELB) to distribute HTTP requests across a
few instances of our application, and all ELB endpoints have names of
the form *.elb.amazonaws.com
, so we can just put
.elb.amazonaws.com
in ALLOWED_HOSTS
– this isn’t perfectly
secure, but is good enough for a demonstration.
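The matching rule Django applies here is worth understanding. The sketch below is a simplified re-implementation (not Django’s actual code): a pattern with a leading dot matches the domain and any subdomain, otherwise the host must match exactly.

```python
# Simplified re-implementation of the ALLOWED_HOSTS matching rule.
def host_allowed(host, allowed_hosts):
    host = host.lower()
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern.startswith('.'):
            # Leading dot: match the bare domain or any subdomain.
            if host == pattern[1:] or host.endswith(pattern):
                return True
        elif host == pattern:
            return True
    return False

allowed = ['.elb.amazonaws.com']
host_allowed('my-lb-1234.eu-west-1.elb.amazonaws.com', allowed)  # True
host_allowed('evil.example.com', allowed)                        # False
```

This is why the single pattern .elb.amazonaws.com covers whatever endpoint name AWS assigns to our load balancer.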
There is one additional wrinkle here, in that we need to support the
health check requests that the load balancer sends to check that our
application instances are alive. (If these health checks fail often
enough, the load balancer kills and restarts our container instances.)
The problem is that these health check requests are made directly to
the private IP addresses of the EC2 instances hosting our containers,
not to the .elb.amazonaws.com
name (since the health check requests
come from behind the load balancer endpoint). We can deal with this
by adding the private IP address (which will lie in the 10.0.0.0 range
we’re going to use in the virtual private cloud we set up during ECS
configuration) to ALLOWED_HOSTS
.
We can do this by adding the following bit of code to our
settings.py
:
import requests
# ...
ALLOWED_HOSTS = []
try:
    metadata = requests.get('http://169.254.170.2/v2/metadata',
                            timeout=0.1).json()
    ip = metadata['Containers'][0]['Networks'][0]['IPv4Addresses'][0]
    ALLOWED_HOSTS = ['.elb.amazonaws.com', ip]
except requests.exceptions.ConnectionError:
    pass
This works by accessing the task metadata HTTP
endpoint implemented by the container agent that
AWS runs on EC2 instances hosting containers managed by ECS. This
returns a JSON object containing all sorts of useful information,
including the IP address of the container instance. Adding this to
ALLOWED_HOSTS
allows Django to service the health checks from the
load balancer.
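To see what that extraction is doing, here’s a trimmed, hypothetical example of the kind of JSON the metadata endpoint returns, keeping only the fields the settings code actually reads:

```python
# Trimmed, hypothetical sample of the ECS task metadata response.
metadata = {
    'Containers': [
        {'Networks': [
            {'IPv4Addresses': ['10.0.1.23']},
        ]},
    ],
}

# Same extraction as in settings.py above.
ip = metadata['Containers'][0]['Networks'][0]['IPv4Addresses'][0]
ALLOWED_HOSTS = ['.elb.amazonaws.com', ip]
```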
Dockerizing
Once we’ve reorganised our Django code as described above for production operation, we can create a Docker container for it. To do this, we obviously need to have Docker installed and working.
Docker works by building container images from layers using a type
of union filesystem to reduce data transfer requirements when
containers are updated. We define the process of creating a container
using a Dockerfile
which is a simple script that sets up the
container contents. The format of the Dockerfile
is documented
here, but we only need a few features. Here’s the
entirety of our Dockerfile
:
# Build from default Python image (don't use "django" base image: it's
# obsolete).
FROM python:3.6.5
# This is the port that Gunicorn uses, so expose it to the outside
# world.
EXPOSE 8000
# Don't just add the base code directory, to try to cut down on
# rebuild sizes.
RUN mkdir /thumbnailer
WORKDIR /thumbnailer
# Install all the requirements -- doing things in this order should
# reduce the size of redeployments if we don't change dependencies.
ADD requirements.txt /thumbnailer/
RUN pip install -r requirements.txt
# Add the main code directory and point at the start script.
ADD . /thumbnailer
WORKDIR /thumbnailer/thumbnailer
CMD ["./start.sh"]
We start from a base image, which will be downloaded from a well-known Docker repository (the Docker Hub). We use an image with a suitable version of Python pre-installed. These base images are minimal operating system images suitable for use as containers. Any additional software you need in your container, you will need to install yourself.
The main things you do in a Dockerfile
are to expose container ports
to the outside world (we expose port 8000, which is the port that our
start.sh
tells Gunicorn to serve our application on), to import
files from the host file system into the container (here, we use the
ADD
command to do that), to run shell commands within the container
as it’s being set up (using the RUN
command) and to establish an
entry point for the container (i.e. the single process that will run
when the container is launched – we use the CMD
command to do
this).
You don’t really need to know too much about how Docker works to use
it, but it’s useful to know a little bit to help with optimising the
way that container images get built. It’s easy to end up with a
situation where you need to upload large amounts of data to deploy
even trivial changes to your code. A bit of care can avoid that. The
most important thing to know is that Docker treats each line in your
Dockerfile
as a command to create a new container image based on the
result of the previous line. If Docker can satisfy itself that there
have been no changes that affect the results of a line in the
Dockerfile
since the last time that the image was built, then it can
reuse the results of the last build.
The things we need to do to set up our container image here are:
1. Get the base Python image.
2. Expose the port we want to use.
3. Install all the Python package dependencies in our requirements.txt file into the container image.
4. Import all of the code for our Django application into the container image.
5. Set up the Gunicorn start.sh script as the entry point for the container image.
The key thing to notice here is that if we make a change to our code,
but don’t change the Python packages we depend on, then Docker
shouldn’t need to repeat steps 1-3 of this process, just reusing the
results from the last build. That’s why we bring the
requirements.txt
file into the container image with a separate ADD
command early in the Dockerfile
: that gives us everything we need to
install the Python dependencies, and makes a container image at this
point in the build process that does not depend on the rest of our
Python source files. We can change other things about our application
and the Python dependency installation step won’t need to be rerun.
That might not seem so important for the image build process, which is relatively quick anyway, but when it comes to uploading container images for deployment and redeployment after code changes, it makes a very big difference. Because of this step-wise layered way that Docker works, if we can set things up so that only the final steps of the image build process change when we make code changes, only those layers of the resulting container image will need to be uploaded when we redeploy our container image. We’ll see in detail how this works later!
There’s one more thing that helps to optimise container image builds,
which is the .dockerignore
file. This is much like .gitignore
for
Git, giving a list of file patterns that Docker should ignore when
processing ADD
(and similar) commands. We can use it to exclude
virtual environment directories, Python compiled code caches, and so
on.
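For this project, a .dockerignore might look like the following (the entries are illustrative; adjust them to your own layout):

```
# Virtual environment and Python bytecode caches
venv/
__pycache__/
*.pyc

# Version control metadata
.git/
```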
With our Dockerfile
set up, we can build and tag a container image
by saying
$ docker build -t thumbnailer .
in the directory containing the Dockerfile
. This takes a little
while the first time you run it, because it needs to download the base
Python container image. Once it’s done, you can run the docker images
command to see a list of installed container images, and the
thumbnailer
image will be among them. In Docker, images and
containers are usually identified by SHA hashes, but the -t
argument
to the build command above applies a tag to the resulting image,
which is a little easier to deal with.
Once we’ve built the image, we can run it by saying
$ docker run --publish=8001:8000 thumbnailer:latest
This command starts a container from the image we’ve built and
publishes port 8000 from the container (which is where Gunicorn is
serving the thumbnailer application) as port 8001 on the host, so we
can now access the application by pointing a browser at
http://localhost:8001
.
We can take a look at what’s running in the container in another terminal.
First, we do docker ps
to get a list of running containers, resulting in
output that looks something like this:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
81e29426f197 thumbnailer:latest "./start.sh" 11 seconds ago Up 7 seconds 0.0.0.0:8001->8000/tcp gallant_swirles
We can execute another process in a running container using Docker’s
exec
command. If we do this:
$ docker exec -it <container-id> bash
where <container-id>
is the hash of our running container from the
docker ps
output (here, 81e29426f197
), we get a shell running
inside the container (the -it
flags tell Docker that this is an
interactive task and that it should allocate a pseudo-TTY inside the
container, both of which are necessary for an interactive shell). In
the shell running inside the container, we can then do:
root@81e29426f197:/thumbnailer/thumbnailer# ps -eFH
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
root 20 0 0 4988 3588 1 13:03 pts/0 00:00:00 bash
root 29 20 0 9597 3292 3 13:04 pts/0 00:00:00 ps -eFH
root 1 0 0 4929 3096 3 13:02 ? 00:00:00 /bin/bash ./start.sh
root 8 1 0 30780 24992 1 13:02 ? 00:00:00 /usr/local/bin/python /usr/local/bin/gunicorn --bind=0.0.0.0:8000 --workers=3 --worker-class=gevent --access-logfile - --error-logfile - --capture-output thumbnailer.wsgi
root 11 8 0 65717 40336 1 13:02 ? 00:00:00 /usr/local/bin/python /usr/local/bin/gunicorn --bind=0.0.0.0:8000 --workers=3 --worker-class=gevent --access-logfile - --error-logfile - --capture-output thumbnailer.wsg
root 12 8 0 65716 40340 3 13:02 ? 00:00:00 /usr/local/bin/python /usr/local/bin/gunicorn --bind=0.0.0.0:8000 --workers=3 --worker-class=gevent --access-logfile - --error-logfile - --capture-output thumbnailer.wsg
root 13 8 0 65730 40424 3 13:02 ? 00:00:00 /usr/local/bin/python /usr/local/bin/gunicorn --bind=0.0.0.0:8000 --workers=3 --worker-class=gevent --access-logfile - --error-logfile - --capture-output thumbnailer.wsg
(The flags to ps
cause it to display all running processes, showing
their full command lines, and laid out in a process tree.) As you can
see, apart from the shell and the ps
process, the container contains
only the parent Gunicorn process and its three child worker
processes.
And finally, we can stop the container using
$ docker kill <ID>
with the ID of the container.
Now that we have a container image built, we need to put it into a repository that’s accessible to AWS. AWS has its own container registry service (called ECR, Elastic Container Registry) that provides you with a private container repository associated with your AWS account, and for production containers you might want to use this (because it’s private). For this tutorial, we’re going to use the public Docker Hub container registry though, mostly because it will require smaller file uploads – your ECR repository is completely empty to start with, so Docker has to upload all of the fixed operating system container layers as well as the layers that are specific to your application. Using Docker Hub, most of those fixed layers are already cached, so we should only need to upload a smaller amount of data.
You’ll need to create an account at https://hub.docker.com. For this
tutorial, I’ll use my personal account, which has account name
skybluetrades
. Once you’ve created an account and logged in, you’ll
be presented with a list of your repositories (initially empty). Click
on the “Create Repository” button to create a new public repository,
and call it thumbnailer
. (The full name you’ll use to refer to the
repository also includes your Docker Hub user name, so for me, it’s
skybluetrades/thumbnailer
.)
We can now build a container image suitably tagged for uploading to
Docker Hub – in the thumbnailer-ecs-no-caching
directory, do the
following (replacing skybluetrades
with your own Docker Hub
username):
$ docker build -t skybluetrades/thumbnailer .
Then, to upload the container image to Docker Hub, we first log in:
$ docker login -u skybluetrades
again using your own user name, and then supplying the requested password. Now you can push the container image by saying:
$ docker push skybluetrades/thumbnailer
This will take a while, but you will see a number of messages saying
“Mounted from library/python”, showing that many of the layers of our
image are already present on Docker Hub so don’t need to be uploaded.
Once the upload is complete, if you now look under the “Tags” tab in
the repository page on Docker Hub for your thumbnailer
repository,
you’ll see that there is an image tagged “latest
”.
ECS setup
We’re now going to deploy our Dockerized application on Amazon’s Elastic Container Service (ECS). There is quite a lot of setup involved in this, but the payoff is pretty good: we’ll end up with a scalable and load balanced deployment with integrated logging where we can easily deploy new versions of our application at the click of a button.
The first time you log into the AWS Management Console, it can seem a little overwhelming. There are dozens of different services available on the platform, lots of acronyms and specialised vocabulary, and it can be hard to know where to start. I’ve added an appendix to give a quick and minimal explanation to the parts of the AWS infrastructure we’re going to use. Whenever the instructions say “the EC2 console”, “the ECS console”, “the CloudWatch console” and so on, just go the main AWS console page and click on the link for the appropriate service to get to the service-specific console page.
ECS is a technology that melds a number of AWS components to simplify deploying and managing container-based applications. (It might not look like it’s simplifying much as we work through setting it up, but it’s quite a lot more complicated to manage all the AWS resources ECS controls by hand.)
There are three things we need to do:
- We need to create an ECS cluster to host our application. There are two options here: we can launch EC2 instances ourselves and use them to host our application via ECS, or we can use a technology called AWS Fargate to have ECS manage EC2 instances for us. We’re going to use Fargate, as it ends up requiring us to make fewer decisions about things. Fargate is what people often call a “serverless” technology, which doesn’t mean there are no servers, just that someone else manages the servers for us…
- Once we have an ECS cluster, we need to define an ECS task definition based on our container image and launch it into the cluster.
- Finally, we create an ECS service that will run multiple copies of our task behind a load balancer.
We’re going to do all of this setup through the AWS console.
Creating an ECS cluster
We want to use the AWS Fargate technology for our cluster. This
technology is only supported in a subset of AWS regions at the moment,
so we need to work in one of us-east-1
(North Virginia), us-east-2
(Ohio), us-west-2
(Oregon) or eu-west-1
(Ireland). We’ll use
eu-west-1
for the tutorial, and if you’re following along, you
should use either us-east-1
, us-west-2
or eu-west-1
, since these
are the regions where MemCachier caches are available. (When setting
up the ECS cluster, you’ll be able to tell if you’re not using a
suitable region because the “Powered by AWS Fargate” option just won’t
be available.)
- Log in to AWS and switch to one of the regions listed above (using the region menu near the top right of the page).
- Go to the ECS console (by selecting the ECS service from the main AWS console service list), click on “Clusters” in the left sidebar menu, then click on the “Create Cluster” button.
- Select the “Networking only / Powered by AWS Fargate” cluster template (you won’t see this option if you’re trying to use a region that doesn’t support Fargate). Click “Next step”.
- On the “Configure cluster” page, set the cluster name to “thumbnailer-cluster”, enable the “Create VPC” option and accept the defaults for the IP address range and subnets.
- Finally, click the “Create” button to create the cluster.
The cluster creation process uses an AWS service called CloudFormation to manage setting up networking resources. It takes a minute or two for the new VPC and its associated resources to be configured. Once the setup is complete, you can go to the CloudFormation console where you’ll see the resources that were created (looking at this sort of thing can be a good way to learn how to set up AWS resources).
So now we have a cluster, but no tasks or services yet.
Creating an ECS task definition
Before we can create a service to run tasks in our cluster, we need to create a task definition.
- From the ECS console left-hand sidebar, click on “Task Definitions” then click on “Create new Task Definition”.
- Select the “FARGATE” launch type compatibility and click “Next step.”
- Fill in the configuration details:
  - Task Definition Name: thumbnailer-task;
  - Task Role: select the ecsTaskExecutionRole from the dropdown;
  - Network Mode: awsvpc (you can’t change this);
  - Task execution role: ecsTaskExecutionRole;
  - Task memory: 0.5 GB (the minimum);
  - Task CPU: 0.25 vCPU (the minimum).
- Under “Container Definitions”, click “Add container” and fill in the container details:
  - Container name: thumbnailer-container;
  - Image: skybluetrades/thumbnailer:latest (replace skybluetrades with your Docker Hub user name);
  - Soft memory limit: 256 MB;
  - Port mappings: 8000 TCP (you only need to specify the internal container port, since ECS deals with assigning external ports).
- Click the “Create” button to create the task definition.
This all does a few different things: it creates the task
definition, which is what ECS uses to associate our container image
with resources to run the image (we choose options to use more or less
the minimum possible computational resources here); it sets up
security roles that allow ECS to start EC2 instances for you as
required; and it creates a logging group in the AWS CloudWatch logging
service to aggregate log output from all the tasks we run from this
task definition. (You can see this log group by going to the
CloudWatch console and selecting “Logs” from the left-hand sidebar.
The log group for our task definition is called
/ecs/thumbnailer-task
. It doesn’t have anything in it yet, but when
ECS starts tasks, the logs from them will go into individual log
streams within the /ecs/thumbnailer-task
log group, where we can
view them, create rules and alarms based on them, and so on.)
Creating an ECS service
The last step to getting our container image running on ECS is to create an ECS service.
- On the ECS clusters list, click on the `thumbnailer-cluster` entry, then click on “Create” under the “Services” tab.
- Configure the service:
  - Launch type: FARGATE;
  - Task Definition: `thumbnailer-task`;
  - Cluster: `thumbnailer-cluster`;
  - Service name: `thumbnailer-service`;
  - Number of tasks: 3.
- In the “VPC and security groups” section:
  - Select the VPC created as part of the cluster (it should already be selected, but you can check by looking at the CloudFormation stack that ECS created for the cluster);
  - Enable both subnets of the VPC;
  - Under “Security groups”, choose “Edit” to modify the ingress rules for the service security group: change the security group name to `thumbnailer-sg` and change the single existing ingress rule to be a “Custom TCP” rule for port 8000 (the port our container image exposes).
- In the “Load balancing” section, select “Application Load Balancer”. Open the link that appears to the EC2 console in a new browser tab to create a new load balancer. The link will open at the start of the load balancer creation wizard. From there:
  - Click on “Create” under “Application Load Balancer” and fill in the configuration details on the next page:
    - Name: `thumbnailer-lb`;
    - Scheme: internet-facing;
    - Listeners: accept the default HTTP listener;
    - Under “Availability Zones”, select the ECS cluster VPC and enable both subnets (which are in different AZs).
  - Click to go to the next page, then click again (there’s nothing to do on the “Configure Security Settings” page);
  - On the “Configure Security Groups” page, select “Create a new security group”, give the new security group the name `thumbnailer-lb-sg` and make sure it has a single ingress rule for HTTP traffic (port 80);
  - On the “Configure Routing” page, select “New target group”, give the target group the name `thumbnailer-tg`, change the “Target type” to “ip”, and accept all other settings;
  - Skip over the “Register targets” page: ECS will be responsible for registering instances in our load balancer target group as it starts and stops them – we don’t need to do anything here.
- Once the load balancer creation has completed, return to the browser tab with the ECS service creation wizard, press the refresh button next to the “Load balancer name” field and select the load balancer you just created.
- In the “Container to load balance” section, select the `thumbnailer-container` and click on “Add to load balancer”. Choose “80:HTTP” for the “Listener port” and select the target group we created for the load balancer as the “Target group name”. Disable service discovery, accept defaults for everything else and press “Next step”.
- Skip the autoscaling setup page, review the configuration on the final page, then click on “Create Service”.
Once the service creation process is complete, you should be able to
go to the thumbnailer-service
view from the thumbnailer-cluster
page of the ECS console. Open the “Tasks” tab, press the refresh
button a few times, and you should see the service tasks starting.
Once the service tasks have started, you can look at their output in
the logs: go to the CloudWatch console, choose “Logs” from the
left-hand sidebar, then select the /ecs/thumbnailer-task
log group.
There should be one log stream for each of the three tasks we started.
If you click on one of them, you should see first the startup messages
from Gunicorn, then Apache-style access logs for each task. We’ve not
connected to the tasks yet, but you should see requests coming into
the root URL from the ELB health checker, which is the process that
decides whether the instances behind the load balancer are healthy or
not.
We can find the URL for our application under the information for the
thumbnailer-lb
load balancer, which is accessible from the EC2
console (“Load Balancers” from the sidebar). The “DNS name” field
gives a name that can be used to access the application: point your
browser at that and you can do some thumbnailing in our containers
running on ECS.
If you do a few examples via the load balancer DNS name, you’ll see that the server hostname changes from request to request, providing some evidence that we really are load balancing.
What can go wrong…
There are a few things that can go wrong with this setup. A couple are obvious errors, like putting resources that should be in the same VPC on different VPCs, but a bigger potential problem is a little more subtle.
We have two security groups involved in what’s going on here, one for the load balancer (`thumbnailer-lb-sg`) and one for the container (`thumbnailer-sg`). The load balancer is what faces the outside world, and so should have its HTTP port (port 80) open to public traffic. The container security group lives behind the load balancer and is not publicly accessible. It should have an ingress rule for the port used by the container, which is port 8000, not the default HTTP port.
If you get this wrong, nothing will work, because traffic from the load balancer to the containers will be silently refused. The easiest way to diagnose this problem is to look at the load balancer target group’s status page (available from the EC2 console). Under the “Targets” tab, this usually lists the instances that are part of the target group. If no traffic is getting through to the container instances, it will instead say something like “no healthy instances in this target group”. Any time you see this while ECS is claiming that it is successfully starting tasks, you should take a look at the security group rules to make sure that you aren’t silently discarding all traffic from the load balancer to the application containers.
Part 3: Caching
Finally, we get to caching, which is what we mostly do at MemCachier. What we’re going to do in this final part of the tutorial is to create a MemCachier cache on AWS infrastructure in the same region as our application containers, to make the necessary code changes to our application to make use of the cache, to think about how to manage the secrets we need to use to connect to the cache, then to deploy our updated application. We’ll finish off by looking at some benchmarking results to prove that our cache is working.
The code for this part of the tutorial is in the
thumbnailer-ecs-caching
directory of the repository.
Using MemCachier for caching
MemCachier provides a fast and flexible multi-tenant cache system that’s compatible with the protocol used by the popular memcached software. When you create a cache with MemCachier, you’re provided with one or more endpoints that you can connect to using the memcached protocol, accessing your cache just as if you had set up your own memcached server. We’ve tried to make MemCachier as easy as possible to use. Go to https://www.memcachier.com and sign up for an account, then you can create a free development cache:
- On the welcome page after confirming your email address for your new MemCachier account, press the “Create A Cache” button.
- Choose a name for the cache (`thumbnailer-cache`, maybe?).
- Select “Amazon Web Services (EC2)” as the provider for the cache.
- Leave the cache plan as the default, a free 25 MB development cache.
- Select the AWS region where you’ve set up your Django application from the dropdown.
- Press the “Create cache” button.
Once the cache is created, you can retrieve the server endpoint used to connect to the cache, and the username and password used for authentication, from the cache details displayed on the “Your Caches” page. Clicking on the “Analytics dashboard” button for the cache takes you to a page where you can view connection information and other statistics for your cache. (The statistics viewable for development caches are restricted, but production caches show constantly updating information about cache size usage, hit rate and eviction rates.)
Apart from better statistics management, the other main difference between development and paid production caches is that larger production caches provide multiple independent connection endpoints for load balancing and resiliency.
Caching in Django
To make use of a cache in Django, we need to add a cache backend to our `settings.py`. There are a number of options for cache backends, but we’re going to use the `python-binary-memcached` and `django-bmemcached` packages. The reason for choosing these packages is, first of all, that we need a package that supports the binary memcached protocol, which includes SASL authentication. (We do support the ASCII version of the memcached protocol, but authentication there requires a non-standard extension to the protocol.) Second, we don’t really want to use the standard `pylibmc` package because it depends on an external C library. This isn’t a problem as such, but it would complicate building our Docker container image slightly, so we go with a pure Python solution instead.
After including the caching packages in our requirements.txt
, we can
add the following cache setup code to our settings.py
:
if os.environ.get('MEMCACHIER_SERVER') is not None:
    print('Thumbnailer running WITH caching...')
    CACHES = {
        'default': {
            'BACKEND': 'django_bmemcached.memcached.BMemcached',
            'LOCATION': os.environ['MEMCACHIER_SERVER'],
            'OPTIONS': {
                'username': os.environ['MEMCACHIER_USERNAME'],
                'password': os.environ['MEMCACHIER_PASSWORD']
            }
        }
    }
else:
    print('Thumbnailer running WITHOUT caching...')
Here, we extract the cache connection details from environment variables (`MEMCACHIER_SERVER`, `MEMCACHIER_USERNAME` and `MEMCACHIER_PASSWORD`). We’ll see how to get values into those environment variables on ECS in a moment. If we have caching set up (detected by the presence of a valid `MEMCACHIER_SERVER` value), we set Django’s `CACHES` configuration variable up to use the binary memcached protocol backend provided by the `django-bmemcached` package. To do this we need to supply the MemCachier endpoint and login credentials.
That’s all that’s needed to allow Django to connect to the MemCachier cache we created. There are a number of different ways that a Django web application can make use of a cache. The most common approaches are to cache whole rendered pages or fragments of pages, or to cache the results of database queries. Those options are less relevant to our thumbnailer application, so we’re going to demonstrate an approach using application-dependent custom caching. You can do more or less anything with this that you like – for example, I’ve used it in the past for caching the results of complicated permissions policy calculations to avoid recalculations every time permissions need to be checked.
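The custom caching pattern we’ll apply below can be sketched independently of Django. In this toy sketch (everything here is illustrative: a plain dict stands in for the shared cache client, and `expensive_computation` is a placeholder for real work such as thumbnail rendering):

```python
# Toy sketch of the check-then-compute caching pattern: a plain dict
# stands in for the cache client, and expensive_computation is a
# placeholder for real work (e.g. rendering a thumbnail).
cache = {}

def expensive_computation(x):
    return x * x  # stands in for downloading and thumbnailing an image

def cached_result(x):
    key = 'square:' + str(x)
    value = cache.get(key)  # try the cache first
    if value is None:
        value = expensive_computation(x)  # cache miss: do the work...
        cache[key] = value                # ...and remember the result
    return value

print(cached_result(12))  # computed: 144
print(cached_result(12))  # served straight from the cache: 144
```

The Django code below follows the same shape, with `django.core.cache.cache` in place of the dict.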
To make use of the cache for storing the results of our thumbnail
generation, we need to make only a few small changes to the views.py
file in the main
Django app. We add an import line at the top of
the file to get access to the main Django cache:
from django.core.cache import cache
then we modify the call to the thumbnail generation code to check for
the presence of a pre-rendered thumbnail in the cache before calling
the make_thumbnail
function:
# Get thumbnail image data from cache if possible.
key = str(sz) + ':' + url
data = cache.get(key)
if data is None:
    # Generate thumbnail data URL and add to cache.
    data = make_thumbnail(url, sz)
    cache.set(key, data)
We construct a key for the cache entry from the thumbnail image size
and the original image URL. If the rendered thumbnail is in the cache,
we use that data directly. Otherwise we render the thumbnail using the
make_thumbnail
function as before, and store the result in the cache
for later use.
What this means is that, until we fill our cache up, we only ever render any individual thumbnail once and after that always access the rendered thumbnail image data from the cache. Once our cache fills up, MemCachier will evict items in least recently used (LRU) order, but we don’t need to do anything special to manage that.
In more complex applications, we might want to set expiry times on our cached items, or explicitly delete items that we know are no longer valid, but for our thumbnailer, we just assume that image URLs are immutable and the thumbnails we produce are valid forever.
Managing secrets on ECS
Before we deploy our application with caching to ECS, we need to deal
with the issue of secrets. We need some way to get values into the
MEMCACHIER_SERVER
, MEMCACHIER_USERNAME
and MEMCACHIER_PASSWORD
environment variables for our Django application to use to connect to
our MemCachier cache. There are a few different ways to do this, but
we’re going to demonstrate an approach that uses some AWS services,
and that is both secure and more scalable and manageable than any ad
hoc approach we might come up with ourselves.
The basic principle we’re going to follow here is that we want to decouple the management and deployment of code (our container image) from the management and access of secrets (login credentials for MemCachier, in this case, but the same approach applies for database credentials, encryption keys, and any other critical security data).
In general, you should not store security credentials with code: don’t check security keys into your GitHub repositories, don’t store them in container images, and so on. Security credentials should be held in a secure, audited store, ideally with an interface that makes it easy to manage access to credentials and to rotate them if there’s any sort of security leak (or just on a regular schedule, if your company has a policy on that in place).
We’re going to use two AWS services to do this here. The first is the Key Management Service (KMS), which is part of the IAM (Identity and Access Management) system. We’ll use this to create and manage an encryption key we can use to encrypt our secrets. The second is the Parameter Store service provided as part of the EC2 system, which gives us a secure key-value store for secret data that’s accessible via the AWS command-line interface.
Once we have these things set up, we’ll write an IAM policy that gives permissions to use the KMS key and to read parameters from our parameter store, and we’ll attach that policy to the IAM role that runs our ECS tasks. This policy+role approach is a pretty common way to manage access to AWS services for applications running on EC2 and/or ECS.
Creating an encryption key
Let’s start by creating an encryption key:
- Go to the IAM console and select “Encryption keys” from the left sidebar.
- Select the region that you’re using for your ECS deployment in the dropdown menu at the top of the key list that you see. (This user interface is different from the way that regions are treated in most of the rest of AWS, so be careful!) Once you have the right region selected, click on the “Create key” button.
- Give a name for the key (`thumbnailer-key`) and click on “Next step”.
- Click on “Next step” again to skip adding tags to the key.
- On the “Define Key Administrative Permissions” page, give your personal IAM user administrative permissions by filling in the checkbox next to your username. Click on “Next step”.
- On the “Define Key Usage Permissions” page, fill in the checkbox next to your IAM user name, but also fill in the checkbox next to the `ecsTaskExecutionRole` IAM role. Click on “Next step”.
- Finally, review the IAM policy for using the key, and click on “Finish”.
What we’ve done here is to create a key that you can manage using your IAM user account, but that is also usable by the role that runs the tasks in your ECS service. For a real production deployment, you would create an application-specific role for this purpose and use it during the ECS setup process. (This is particularly true if your application needs to access other AWS services to do its job.)
After the new key has been created, click on the key name and make a note of the ARN (Amazon Resource Name) for the new key, which you’ll need in the IAM policy we’ll create in a minute.
Set up secret parameters
Now we can create some secrets in the EC2 Parameter Store:
- Go to the EC2 console and select “Parameter Store” from the left sidebar. (It’s close to the bottom.) Make sure you’re in the right region (this time you use the main region menu, at the top right of the page).
- Click on the “Create Parameter” button.
- Fill in the parameter name (`thumbnailer.memcachier-server`), select “Secure String” for the “Type”, then choose the name of the KMS key you just created to use to encrypt the parameter. Finally copy the MemCachier cache endpoint into the “Value” field, e.g. `mc1.dev.eu.ec2.memcachier.com:11211`. Click on “Create Parameter”.
- Repeat steps 2 and 3 for the MemCachier username (call the parameter `thumbnailer.memcachier-username`) and password (`thumbnailer.memcachier-password`).
If you have the AWS command line tools set up, you can test that the parameters are accessible by doing something like this in a terminal:
aws ssm get-parameters --names thumbnailer.memcachier-server --with-decryption --region eu-west-1
adjusting the region to match whatever region you’re using. (This will only work if you have your AWS credentials set up so that the AWS CLI can run without asking for them – see here for how to set this up.)
Make secrets accessible to ECS task execution role
We now need to set things up so that the code running in our ECS containers can access the secrets we just created. To do this, we create an IAM policy that gives permissions to access the parameters and encryption key, and then add that policy to the IAM role that runs our ECS tasks.
To create the IAM policy:
- Go to the IAM console and click on “Policies” in the left sidebar.
- Click on “Create policy.”
- Click on the “JSON” tab to edit the raw JSON representation of the policy, and paste in the policy body from below, filling in the placeholders as appropriate. Click on “Review policy”.
- If there are no errors in the policy body, you’ll end up on a page where you can give a name to the policy (use `thumbnailer-policy`). Then click on “Create policy”.
Here is the JSON policy body to use – fill in the placeholders for your setup (`<region>` is the AWS region where your ECS setup is, `<key-arn>` is the ARN of the encryption key you created earlier, and `<account-id>` is your numeric AWS account ID, which you can see as the fifth component of the key ARN):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [ "ssm:DescribeParameters" ],
"Resource": "*"
},
{
"Sid": "Stmt1",
"Effect": "Allow",
"Action": [ "ssm:GetParameters" ],
"Resource": [ "arn:aws:ssm:<region>:<account-id>:parameter/thumbnailer.*" ]
},
{
"Sid": "Stmt2",
"Effect": "Allow",
"Action": [ "kms:Decrypt" ],
"Resource": [ "<key-arn>" ]
}
]
}
This policy has three statements in it, giving permissions to perform three different kinds of AWS operations. First, the `ssm:DescribeParameters` action allows an IAM user or role to list all of the parameters defined by the user account they’re associated with (the `*` wildcard in the `Resource` value means “all values”); second, the `ssm:GetParameters` action lets the user/role get the value of any parameter matching the pattern in the `Resource` part of the clause, i.e. any parameter whose name starts with “`thumbnailer.`”, which matches the parameters we defined; finally, the policy allows the user/role to use our encryption key to decrypt stored values – we need this to recover the plaintext parameter values from the versions encrypted with our key.
Now we attach the thumbnailer-policy
policy to the
ecsTaskExecutionRole
IAM role:
- Go to the IAM console and click on “Roles” in the left sidebar.
- Click on `ecsTaskExecutionRole` in the list of roles.
- Under the “Permissions” tab, click on “Attach policy”.
- Select the `thumbnailer-policy` in the policy list. You can either type part of the name into the search box to filter the (long) policy list, or you can select “Customer managed” in the “Filter” dropdown to show only the policies that you’ve created yourself.
- Click “Attach policy”.
The ecsTaskExecutionRole
now has the permissions defined in the
thumbnailer-policy
policy, so that the tasks running in our ECS
service will be able to access the secrets we’ve defined.
Pick up secrets in start-up script
So, at this point:
- we have secrets stored in the EC2 Parameter Store;
- the role we’re using to run our ECS tasks has sufficient permissions to be able to access the parameters;
- our Django code accesses the parameters from environment variables.
The last thing we need to do is to make the link between these, by
extracting the secret parameter values from the EC2 Parameter Store
and putting them into environment variables for Django to access.
We’ll do this by modifying our start.sh
container startup script.
(An alternative approach would be to add some Python code to our
Django application to access the secrets directly, using the
boto3
AWS client library. We’re going to use the AWS CLI to
do it inside our startup script here, just for a demonstration.)
We add the following code to the top of our startup script:
# Determine the AWS region we're running in from the ECS task metadata.
EC2_REGION=$(curl -s http://169.254.170.2/v2/metadata | grep TaskARN | cut -d: -f5)

# Fetch and decrypt a named parameter from the EC2 Parameter Store.
function get_param() {
    aws ssm get-parameters --names $1 --with-decryption --region $EC2_REGION | \
        grep Value | cut -d: -f2- | tr -d '" ,'
}

export MEMCACHIER_SERVER=$(get_param thumbnailer.memcachier-server)
export MEMCACHIER_USERNAME=$(get_param thumbnailer.memcachier-username)
export MEMCACHIER_PASSWORD=$(get_param thumbnailer.memcachier-password)
This uses the ECS task metadata endpoint to determine what AWS region we’re operating in, then uses the AWS command-line interface to get parameter values. The `get-parameters` sub-command is part of the `ssm` (Systems Manager) command, and returns a JSON document that looks like this:
$ aws ssm get-parameters --names thumbnailer.memcachier-server --with-decryption --region eu-west-1
{
"Parameters": [
{
"Name": "thumbnailer.memcachier-server",
"Type": "SecureString",
"Value": "mc1.dev.eu.ec2.memcachier.com:11211",
"Version": 2
}
],
"InvalidParameters": []
}
The get_param
function in our script just pulls the value part out
of this and we assign the results to the environment variables we
need. To make this all work, we add the AWS command line interface to
our requirements.txt
(it’s a Python package) so that it’s included
in our container image.
One thing to note about this is that we don’t need to worry about setting up any sort of AWS credentials for running the AWS command-line interface inside our container. When you run code inside ECS (or on a manually managed EC2 instance), you have an AWS identity of some sort, either implicitly, or as in this case, because of an explicit assignment of an IAM role. The permissions that you have to perform AWS actions are derived from that identity. In the case here, all the operations that we need to achieve are permissioned via the IAM policy that we set up earlier.
Updating our deployment
Now that we’ve updated our code, we can rebuild our container image:
$ docker build -t skybluetrades/thumbnailer .
and push it to the Docker Hub:
$ docker push skybluetrades/thumbnailer
The only part of these steps that will take much time is installing the Python dependencies. The rest should be relatively quick, since most of the content of the container is already built and deployed to the Docker Hub.
We now need to deploy the new container image to ECS, and we need to
force a turnover of the tasks running on ECS to use the new image. To
do this, we go to the ECS console, find our thumbnailer-cluster
cluster and open its cluster page, then select the “Services” tab.
Select the thumbnailer-service
service and click on “Update”. This
reopens the service configuration wizard. We don’t need to make any
changes at all to the configuration. All we need to do is to check the
“Force new deployment” checkbox. Press the “Next step” buttons until
you get to the final review page of the wizard, then hit the “Update
Service” button.
Go to the service page for the thumbnailer-service
service and
you’ll see something interesting start to happen. Switch back and
forth between the “Tasks” and “Deployments” tabs, and hit the refresh
button now and then. You’ll see ECS provisioning and starting new
tasks to replace the old ones, keeping track of the fact that the old
tasks are associated with one deployment and the new tasks with a
second deployment. Given the way we’ve set up the minimum and maximum task count percentages (which you can see on the “Deployments” tab), all of the new tasks will be started before the old tasks are retired. By modifying these percentages, you can trade off between using extra resources during deployments and reducing service availability. In our case, the maximum percentage is 200%, so there is no interruption of service during the deployment.
This push-button deployment and management of task turnover is a major advantage of ECS. If you’ve ever had to deal with that scary moment of changeover during production deployments without this kind of support, you’ll appreciate that it can remove a lot of worry from doing production deployments.
If you take a look in the CloudWatch logs, you should see that the
Django application reports that it is running with caching enabled,
since it was able to pick up the MEMCACHIER_SERVER
environment
variable that we pulled from the EC2 Parameter Store.
So, now, if everything is working, thumbnails generated by our Django code will be cached. Try creating a couple of thumbnails via the web application and look at the times reported to create them. If you repeat a request for some thumbnail size and image URL, you’ll see that the second and subsequent requests are much quicker than the initial request, because the thumbnail data can be pulled straight out of the cache, instead of downloading the original image URL and generating the thumbnail anew.
Another thing to notice if you do this is that our application is still load balanced across three different servers, but the cache is shared between all the instances, so that if one server generates a thumbnail for a new image, and a later request for the same thumbnail is routed to a different server behind the load balancer, the result comes straight out of the cache.
Benchmarking results
We can get a more concrete idea of how much of a difference caching
makes using the Node.js benchmarking code described
earlier. Running this with TEST_URL
set
to point to the load balancer DNS name for our application, we end up
with a results file (results.json
) that contains sections like this:
'https://www.memcachier.com/assets/logo.png:128':
[{server: 'ip-10-0-1-220.eu-west-1.compute.internal', tserver: 527, ttotal: 771},
{server: 'ip-10-0-1-220.eu-west-1.compute.internal', tserver: 9, ttotal: 1445},
{server: 'ip-10-0-1-49.eu-west-1.compute.internal', tserver: 62, ttotal: 294},
{server: 'ip-10-0-1-220.eu-west-1.compute.internal', tserver: 70, ttotal: 286},
{server: 'ip-10-0-0-108.eu-west-1.compute.internal', tserver: 78, ttotal: 277},
{server: 'ip-10-0-1-49.eu-west-1.compute.internal', tserver: 68, ttotal: 271},
{server: 'ip-10-0-1-49.eu-west-1.compute.internal', tserver: 15, ttotal: 208},
{server: 'ip-10-0-1-49.eu-west-1.compute.internal', tserver: 11, ttotal: 225},
{server: 'ip-10-0-0-108.eu-west-1.compute.internal', tserver: 1, ttotal: 201}]
This shows, for one image URL
(https://www.memcachier.com/assets/logo.png
) and thumbnail size
(128), the results of running a number of thumbnailing requests (the
number of requests for each URL and size is random, because of the way
the benchmarking code works). The results for the requests are shown
in the order they happened, and for each request you see the name of
the server that serviced the request, the time (in milliseconds) taken
on the server to generate the thumbnail data, and the total request
time as seen by the client (from the point of pressing the “Thumbnail
it!” button to the point where the result page has stabilised in the
browser).
There are three main things to notice:
- The requests are load balanced across three different servers;
- Thumbnail generation in the first request takes 527 ms, while in subsequent requests, it takes on average about 40 ms, because the thumbnail data is being pulled out of the cache;
- The total request times are quite variable (see especially the second request, the first that used the cached result).
The variability of the total request times is pretty typical of code that runs on cloud services: there are a great many sources of unpredictability and latency between a client and our server code, including the network connection from the client to the cloud service, and sources of latency inside AWS (due to scheduling of our code on the EC2 instances used to run our ECS tasks, network queuing and so on).
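To put a rough number on the speed-up, we can average the server-side times for the cached requests in the sample results above (every request after the initial render):

```python
# Server-side times in ms for the cached requests in the sample
# results above (all requests after the initial 527 ms render).
cached_tserver = [9, 62, 70, 78, 68, 15, 11, 1]
average = sum(cached_tserver) / len(cached_tserver)
print(average)  # 39.25 ms, versus 527 ms for the uncached render
```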
Conclusions
We’ve covered a lot of ground in this tutorial. Some things to take away from it:
- Deploying container-based applications on ECS takes a little bit of setup, but the payoff is good. We get load balancing, log aggregation, easy scalability and easy redeployments.
- Adding caching to Django applications, even when we do custom caching, is pretty easy, and using MemCachier means we don’t need to worry about managing caching infrastructure ourselves.
- There’s quite a big step between running your application in development mode on your own machine and running it in production with resiliency, scalability, load balancing, and so on. It’s a good idea not to underestimate how much work it takes to get things production-ready!
Appendix: Cleaning up
To clean up all of the AWS resources associated with this tutorial, do the following:
- Delete the `thumbnailer-lb` load balancer, the `thumbnailer-tg` load balancer target group and the `thumbnailer-lb-sg` security group (from the EC2 console, follow the links in the left sidebar to the load balancer, target group and security group lists, then select the relevant entities and delete them).
- Delete the `thumbnailer-service` service, the `thumbnailer-sg` security group, and then the `thumbnailer-cluster` cluster (press the “Delete Cluster” button on the cluster page in the ECS console). This will delete the CloudFormation stack associated with the cluster, removing all EC2 instances, CloudWatch log groups, and networking infrastructure associated with the cluster. It might take a couple of attempts to get it to delete everything, because the timeout on the deletion of the CloudFormation stack doesn’t seem quite long enough in some regions. It’s also possible that deleting the CloudFormation stack may fail completely, erroneously saying that the associated VPC still has dependencies. If that happens, you can delete the VPC manually and rerun the ECS cluster deletion to clean up.
- Go to the `thumbnailer-task` task definition in the ECS console and deregister the task definition revision we created. To do this, click on the `thumbnailer-task` name, select `thumbnailer-task:1` in the resulting list and choose “Deregister” from the “Actions” dropdown. Once you’ve done this, the `thumbnailer-task` task definition will no longer appear in the main task definition list.
- Delete the `/ecs/thumbnailer` log group from CloudWatch (follow the “Logs” link in the left sidebar of the CloudWatch console).
- Delete the secret parameters from the AWS Systems Manager Parameter Store and the `thumbnailer-key` encryption key from the Key Management Service (KMS).
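If you prefer to script the cleanup rather than click through the console, most of it can be done with the AWS CLI. The following is a sketch, not a definitive script: it assumes the tutorial’s resource names, valid AWS credentials and a configured region, and note that `aws ecs delete-cluster` does *not* remove the CloudFormation stack the way the console’s “Delete Cluster” button does, so the console route above is more thorough for the cluster step.

```shell
# Load balancer and target group are deleted by ARN, so look the ARNs up first.
LB_ARN=$(aws elbv2 describe-load-balancers --names thumbnailer-lb \
  --query 'LoadBalancers[0].LoadBalancerArn' --output text)
aws elbv2 delete-load-balancer --load-balancer-arn "$LB_ARN"

TG_ARN=$(aws elbv2 describe-target-groups --names thumbnailer-tg \
  --query 'TargetGroups[0].TargetGroupArn' --output text)
aws elbv2 delete-target-group --target-group-arn "$TG_ARN"

# Scale the ECS service down to zero tasks, then delete the service and cluster.
aws ecs update-service --cluster thumbnailer-cluster \
  --service thumbnailer-service --desired-count 0
aws ecs delete-service --cluster thumbnailer-cluster \
  --service thumbnailer-service
aws ecs delete-cluster --cluster thumbnailer-cluster

# Deregister the task definition revision and remove the log group.
aws ecs deregister-task-definition --task-definition thumbnailer-task:1
aws logs delete-log-group --log-group-name /ecs/thumbnailer

# Security groups are looked up by name and deleted by ID; this only works
# once nothing is using them any more.
for SG in thumbnailer-lb-sg thumbnailer-sg; do
  SG_ID=$(aws ec2 describe-security-groups \
    --filters Name=group-name,Values="$SG" \
    --query 'SecurityGroups[0].GroupId' --output text)
  aws ec2 delete-security-group --group-id "$SG_ID"
done
```

Run the pieces one at a time rather than as a single script, since the deletions have to happen in dependency order and some (the load balancer in particular) take a little while to complete.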
You might also want to remove the container images you uploaded to the Docker Hub, which you can do from the Docker Hub web interface.
Appendix: Prerequisites
Docker
There are also operating system packages for Docker on many platforms, but whether they’re up to date is a bit of a lottery. Rolling release distributions (Arch Linux, for example) will be up to date (I’m using Arch and have Docker v18 installed), but if you’re using something like Ubuntu, you might be better off following the installation instructions on the Docker website. You’ll want to install what the Docker website calls the “Community Edition”, which is free.
Once you have it installed, try running `docker info` to check that it’s working.
Benchmarking prerequisites
To run the benchmarking code, you’ll need the following:
- Firefox: available as a normal operating system package;
- NPM and Node.js: available as a normal operating system package or from https://nodejs.org/ (I’m using NPM v6.1.0 and Node.js v10.4.1);
- A relatively recent Java runtime (this is used by the Selenium browser automation system we use to drive Firefox): this should be available as a normal operating system package, but there are often lots of confusing options available. I’ve been using OpenJDK 8, which seems to work fine, and should be available on most platforms.
Appendix: AWS, the absolute basics
In a nutshell, what AWS offers is on-demand computing infrastructure. That includes compute resources, storage, networking, plus a lot of different options for managing everything. These options are divided into services. We’ll be using only a small number of those services in this tutorial. The sections below describe the most essential of these.
Elastic Compute Cloud (EC2)
EC2 is the core of AWS: it allows you to start virtual machine instances running on Amazon hardware in data centres around the world. AWS divides the world into regions (e.g. `us-east-1` in Northern Virginia, `eu-central-1` in Frankfurt, `ap-northeast-1` in Tokyo, and so on), each of which has a number of availability zones (AZs) (physically separate data centres, more or less). You can start a virtual machine running in one of those AZs from any operating system image you like, and once it’s up and running, it looks like any other machine: if it’s set up right, you can SSH into it, run a web app on it, whatever you like.
EC2 instances come in different instance types, which vary in size, cost and specialisation, from small instances for experimentation and throw-away use (e.g. `t2.micro`, which is free for the first year you’re on AWS) up to large instance types optimised for in-memory databases, storage applications, GPU processing, etc. (for example, a `c5d.18xlarge` is a compute-optimised instance type with 72 CPUs and 144 GB of memory).
You can start and manage EC2 instances yourself, but there are several AWS services to help: Elastic Beanstalk and CloudFormation are two of the more popular ones. (The EC2 service also deals with a couple of aspects of managing EC2 instances that don’t really fit into other services. In particular, load balancing is part of the EC2 service, which we’ll use a little.)
Identity and Access Management (IAM)
AWS has a complex and powerful policy-based permission system to control access to AWS resources. The IAM service manages all of this.
Once you have an AWS account, you can create IAM users with their own login credentials and restricted permissions limiting the AWS services and resources that those users are allowed to use (the original credentials you use to set up your AWS account act as “root” credentials that allow you to do anything in your account). IAM provides facilities for managing users in groups, attaching permissions policies to different users or groups, and so on.
One concept that IAM has that may not be familiar is the idea of roles. On Unix-based systems, we tend to use normal user accounts even for users that aren’t real people. For example, we have an `lp` user that runs the printing subsystem, a `docker` user that runs the Docker daemon, and so on. On AWS, instead of using “real” users for these “non-real” cases, we use roles. Like normal users, roles have permissions policies attached to them, controlling what AWS services they’re allowed to use. It’s possible to set EC2 instances up to start in a particular role (using something called an instance profile), which makes it easy to control what AWS services individual applications are allowed to use.
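Under the hood, a role used in an instance profile carries a “trust policy” naming the EC2 service as the principal that’s allowed to assume the role. As an illustration, the little sketch below just constructs and prints that standard trust policy document in Python (no AWS calls are made; this is the stock EC2 trust relationship you’ll see in the IAM console):

```python
import json

# The standard trust policy attached to a role used in an EC2 instance
# profile: it allows the EC2 service to assume the role on behalf of
# instances launched with that profile.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

The role’s permissions policies then say what the role can *do*; the trust policy only says who can *become* the role.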
Virtual Private Cloud (VPC)
AWS provides virtualised networking infrastructure as well as virtualised compute resources. The main component of this that we’ll use is a virtual private cloud (VPC). This is a mechanism that essentially allows you to pretend that the AWS infrastructure that you use is isolated onto its own network, even though the machines you’re using are virtualised slices of larger machines, the network addresses you use are translated into a part of a larger internal AWS network address space, and so on.
When you create a VPC, you give it a range of internal IP addresses to allocate (as a CIDR block, usually in the `10.0.0.0/8` private address range). A VPC then has a number of subnets created in it, each allocated a section of the VPC’s address range. When you launch EC2 instances, you specify which subnet they live in. The idea here is that a VPC is specific to a region, but subnets are specific to availability zones, meaning that you can create a virtual network that spans multiple data centre locations with very little effort: just create a VPC in the region that you want, create a couple of subnets in different AZs, then launch EC2 instances in the different subnets and you have resilience to data centre problems.
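The address arithmetic involved is easy to see with Python’s standard `ipaddress` module. This sketch carves /24 subnets (one per AZ, say) out of a /16 VPC block; the CIDR values are just illustrative, not anything specific to this tutorial:

```python
import ipaddress

# A VPC given a /16 block from the 10.0.0.0/8 private range...
vpc_cidr = ipaddress.ip_network("10.0.0.0/16")

# ...split into /24 subnets, each of which could live in a different AZ.
subnets = list(vpc_cidr.subnets(new_prefix=24))

print(subnets[0])               # 10.0.0.0/24, e.g. a subnet in AZ "a"
print(subnets[1])               # 10.0.1.0/24, e.g. a subnet in AZ "b"
print(subnets[0].num_addresses) # 256 addresses per /24 subnet
```

(In a real VPC, AWS reserves the first four addresses and the last address of each subnet for its own use, so the usable count per /24 is a little lower than 256.)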
Ingress and egress of network traffic from entities in a VPC are governed by security groups, sets of rules saying what types of TCP/IP traffic are allowed in or out of, for example, an EC2 instance. You can restrict incoming traffic by port or origin IP address, for example, so you could open up ports 80 and 443 (HTTP and HTTPS) to the world, while keeping port 22 (SSH) restricted to access from known machines that you control.
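The rule matching a security group performs on incoming traffic can be mimicked in a few lines with `ipaddress`. The rules and source addresses below are made up for illustration (using documentation IP ranges), but the shape — a port plus an allowed CIDR per rule, with traffic admitted if any rule matches — is the same:

```python
import ipaddress

# Hypothetical inbound rules, security-group style: (port, allowed CIDR).
RULES = [
    (80, "0.0.0.0/0"),        # HTTP open to the world
    (443, "0.0.0.0/0"),       # HTTPS open to the world
    (22, "203.0.113.0/24"),   # SSH only from a known address range
]

def inbound_allowed(port: int, source_ip: str) -> bool:
    """Return True if any rule permits traffic to `port` from `source_ip`."""
    src = ipaddress.ip_address(source_ip)
    return any(
        port == rule_port and src in ipaddress.ip_network(cidr)
        for rule_port, cidr in RULES
    )

print(inbound_allowed(443, "198.51.100.7"))  # True: HTTPS is open to all
print(inbound_allowed(22, "198.51.100.7"))   # False: SSH is restricted
print(inbound_allowed(22, "203.0.113.50"))   # True: SSH from the known range
```

One real-world difference worth knowing: AWS security groups are default-deny for ingress and stateful, so return traffic for an allowed connection is permitted automatically without a matching egress rule.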