How to solve some pitfalls when dockerizing

Discover Docker and container technologies and learn how to solve the pitfalls you may face when using them.

This article goes over the steps that led to containerization, starting from physical servers, passing through virtual machines and arriving at containers. It then analyzes the Docker architecture and, in particular, how to solve some of the pitfalls of using this technology.

This article is the result of my experience creating and deploying .NET and C# applications, first via virtual machines and, in the last year, with Docker. The first application I created had to be deployed on virtual machines, and it is hard to imagine how difficult and laborious it was to set up the whole environment. And… did I mention it was my first experience??? Then time passed, and one day at work someone told me about a sort of magic tool that lets you deploy applications in a few steps, just by typing some commands. Obviously I thought he was kidding and making fun of me, but in the end everything was true… and since that day everything has changed!

From the stone age to virtualization

Several years ago, when you wanted to put your applications online, you literally had to build your own machine and set up a dedicated web server on it. Over time, this made the software painfully difficult to move around and to update.

Then virtual machines arrived. A virtual machine (VM) is a virtualization of a physical server: hardware resources are broken down into many parts in order to run what appear to be many separate computers on what is actually a single physical machine. As you can imagine, all this can get very expensive in terms of hardware resources, hosting and time to install and configure each machine, not to mention that every VM runs its own full operating system.

For example, imagine you have to install WordPress locally: you have to go through a lot of steps (web server, PHP, database, configuration) before you have it running, and this is why nowadays it’s better to use containers.

From virtualization to containerization

Containers can be considered a sort of evolution of virtual machines, one that introduces important innovations over the traditional technique.

The concept of containers is slightly different from that of virtual machines: containers share the host operating system kernel and isolate the application processes from the rest of the infrastructure. This approach optimizes resource usage and reduces the need for redundant operating systems.

As you can see from the picture, the main virtualization component, the hypervisor (the software that manages all the virtual machines running on the host machine), has been replaced by a new component, the container engine, which is responsible for running containers.

What is Docker?

Docker is an open platform for developing, shipping, and running applications by using containers.

Docker uses a Client-Server architecture and these are the main components:

  1. docker: the client that sends commands and requests to the server;
  2. dockerd: the daemon (server) that manages containers and images;
  3. Registry (Docker Hub): a public registry where you can find all the public images and where you can also host your private ones.

The client sends the requests typed by the user to the server (picture 2). The docker commands use the Docker API; the Docker daemon listens for Docker API requests and, when it doesn’t have an image locally, it can pull it from Docker Hub.
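As a quick illustration of this flow, here is a minimal sequence of commands (the image and container names are just illustrative) in which the docker client asks the daemon to pull an image from Docker Hub and then to run a container from it:

# the client asks the daemon to pull the image; the daemon fetches it from Docker Hub
docker pull nginx:alpine

# the client asks the daemon to create and start a container from that image
docker run --name web -d -p 8080:80 nginx:alpine

# the client queries the daemon for the running containers
docker ps

The first command goes client → daemon → registry, while the other two are handled entirely by the daemon on the local machine.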

Advantages of containers

There are lots of advantages of using containers, here are the most important ones:

  • containers are exceptionally light: they are only megabytes in size and take just seconds to start;
  • simplified deployment: the deployment unit is a self-consistent and versioned image ready to be run;
  • scalability: a container running on a docker instance can be easily replicated on another one, with the same functionality and consistency;
  • isolation: dependencies or settings within a container will not affect any configuration on your computer, or on any other container that may be running;
  • security: applications running in containers are completely isolated from one another, so no container can see the processes running inside another container, nor interfere with its operations.

Where you should use Docker

As I explained above, the use of Docker brings many advantages in the development and management of an application, but obviously Docker is not the solution to everything. In fact there are lots of situations and scenarios in which it is still preferable to keep using virtual machines. In this section we will get a better idea of when Docker is the best choice.

First of all, if you are dealing with a microservice architecture, containers are the best host for it. Microservices are an architectural style that structures an application as a collection of services. Software systems are generally born as monoliths: single-tiered applications in which the user interface and the data access code are combined into a single program on a single platform. As the application grows, a monolith can become difficult to maintain and deploy. Microservices, instead, break the system down into simpler functions that can be implemented and deployed independently. This is the reason why containers are the best host for microservices: they are autonomous, easy to deploy and efficient.

If you have developers working with different setups, Docker provides a convenient way to have local development environments that closely resemble the production one.

Finally, when your app needs to move through multiple stages of development (dev, test, production), Docker is definitely the best choice.

BUT, and it’s a huge but, all that glitters is not gold.

Potential pitfalls when using Docker

Despite the fact that Docker has so many advantages, you can face some pitfalls when using it:

  • data management: if you want to preserve your data you have to make sure not to write it inside the container, because when the container is removed or recreated the data is gone. There are two solutions to this problem: volumes or bind mounts (see the example after this list). With volumes, the data is stored in a specific part of the host file system, the Docker volumes directory, which is managed by Docker; with bind mounts, you can store it anywhere on the file system of the host machine. So the main difference between them is where the data lives on the host and who manages it;
  • how to properly write a Dockerfile or a Docker Compose file: no one tells you what to write inside a Dockerfile or a Docker Compose file, so you have to do some web research and hope you find what you need;
  • good knowledge of the Linux shell: you need a good knowledge of Linux in general, and of the shell in particular, because Docker and containers are driven from the command line;
  • no UI: you don’t have a graphical interface to interact with; you communicate through the console, which is not intuitive at all;
  • debugging: debugging inside containers is complicated and slow, and it can eat up a lot of time during development.
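To make the volumes versus bind mounts difference concrete, here is a minimal sketch using a MySQL container (the names mydata and /home/user/mysql-data are just illustrative choices):

# named volume: Docker manages the storage location on the host
docker volume create mydata
docker run -d --name db-volume -e MYSQL_ROOT_PASSWORD=secret -v mydata:/var/lib/mysql mysql:5.7

# bind mount: you pick the host directory yourself
docker run -d --name db-bind -e MYSQL_ROOT_PASSWORD=secret -v /home/user/mysql-data:/var/lib/mysql mysql:5.7

In both cases the database files survive the removal of the container; the difference is only where they live on the host and who manages that location.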

Well, let’s say that some of these pitfalls depend on the context you find yourself in and on what you’re developing (see the debugging point), while others, like how to write a Dockerfile, are recurring problems.

Overview

The secret lies in understanding how to build images and how to orchestrate them, but first of all we have to define some important terms.

Dockerfile

A Dockerfile is a text file that defines a Docker image with a simple and concise syntax. A Docker image, in turn, is a read-only template with the instructions for creating a Docker container, which is a running instance of an image. Each instruction in the Dockerfile creates a layer of the image, and when you change the Dockerfile and rebuild the image, only the layers that have changed are rebuilt.

In short: you write a Dockerfile to build an image, and from that image you run containers.

Below we have an example of a Dockerfile.

FROM node:8
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 8080
CMD ["npm", "start"]

Above are some of the main instructions you can find in a Dockerfile:

  1. FROM: the most important instruction, and the one that cannot be omitted, because it initializes a new build and sets the base image to start from.
  2. WORKDIR: it sets (and creates, if missing) the working directory for the instructions that follow it, such as RUN, COPY and ADD.
  3. COPY: it copies files and directories from the build context into the image; here it is used to bring package.json into the image before installing the dependencies.
  4. RUN: this instruction executes a command and generates a new layer containing the changes produced by that command, so it’s recommended to group multiple related commands into a single RUN instruction. In this case the command is npm with install as its parameter, which installs all the packages the application needs to work properly.
  5. EXPOSE: it lets Docker know which ports the container will listen on at runtime. It’s important to know that this instruction doesn’t automatically publish the specified ports: it mainly works as documentation, and you still have to publish the ports (for example with the -p option of docker run) to make them reachable from the host.
  6. CMD: it takes effect when the container starts and doesn’t modify the image or add new layers. This instruction sets the default command to be executed when running the image, and only one CMD instruction takes effect in a Dockerfile (the last one, if you write more than one).
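To turn this Dockerfile into a running container, you build the image and then run it. The image name my-node-app and the port mapping below are just illustrative choices:

# build the image from the Dockerfile in the current directory
docker build -t my-node-app .

# run a container from it, publishing the port declared with EXPOSE
docker run -d -p 8080:8080 my-node-app

The -p 8080:8080 option is what actually publishes the exposed port on the host.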

And now? What should you do with these images? Well, if you need to run an application made of several containers, for example a back end and a front end, you need a tool for defining and running multi-container applications, and that is Docker Compose.

Docker Compose

Docker Compose is an orchestration tool for multi-container applications, with a deliberately minimal set of functionalities.

To use Docker Compose you have to:

  1. define your app environment with a dockerfile (so it can be reproduced anywhere)
  2. write a docker-compose.yml file
  3. execute a special command (docker-compose up).

Now, if we take as an example the WordPress problem that I presented at the beginning, we can easily solve it with Docker Compose.

Here we have the file.

services:
  db:
    image: mysql:5.7
    volumes:
      - db_data:/var/lib/mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: somewordpress
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    ports:
      - "8000:80"
    restart: always
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: wordpress
      WORDPRESS_DB_NAME: wordpress
volumes:
  db_data:

At the beginning of the file you can find all the services you need to make WordPress work: the db service with its data management (the volumes solution I explained before, with the db_data named volume declared at the bottom of the file) and the wordpress service with all its environment variables.

And finally you just have to execute the command docker-compose up!
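A typical session looks like this (the -d flag starts the containers in the background):

# start the db and wordpress containers
docker-compose up -d

# check that both services are running
docker-compose ps

Once the containers are up, WordPress is reachable at http://localhost:8000, and docker-compose down stops and removes everything (add -v if you also want to remove the db_data volume).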

Conclusions

Here we are at the conclusions. I’ll finish by saying that Docker is certainly a very powerful and state-of-the-art tool, but you must know how to use it in the right way, and before that how it works, because otherwise you risk not being able to take advantage of its full potential.

In summary:

  • Virtualization is not enough;
  • now containers are the standard;
  • Dockerfile and Docker Compose are fine, but only for simple scenarios;
  • for everything else there’s the cloud (Kubernetes, Azure Container Instances, OpenWhisk, etc.).

If you find this article interesting and useful you can share it and follow me here :)