Docker Deep Dive, Feb 2018, Nigel Poulton
Application Isolation
Docker is a containerisation platform that allows you to develop, run and deploy applications in an environment isolated from the host environment.
A Docker image contains the binaries and libraries required to run an application and once built, the image does not change.
A Docker container is a running instance of a Docker image. Docker containers rely on the host for their kernel services, so Linux containers require Linux kernel services from the host and Windows containers require Windows kernel services.
The Docker daemon implements the Docker REST API for interaction with the Docker CLI and additionally provides image management and builds, security, networking and orchestration. Container execution is currently handled by containerd and creation of containers is handled by runc - the container runtime.
When you execute a 'container run' command, the Docker CLI converts this into the correct REST command and sends it to the Docker daemon API. The Docker daemon sends the command to containerd, which in turn creates a valid OCI (Open Container Initiative) bundle and passes it to runc. runc creates the container and then exits, leaving a containerd-shim process as the parent of the running container. Because the container runtime is separate from the Docker daemon, containers can be described as daemonless and should be able to run independently of the Docker daemon.
The modular design of Docker allows for components to be used independently in other projects or to be replaced by third-party products.
The Docker CLI is used to manage Docker components such as images, containers, networks and volumes, and can be configured to connect to a Docker daemon running on a remote host.
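One way to point the CLI at a remote daemon is the DOCKER_HOST environment variable. A minimal sketch, assuming a hypothetical host name and port:

```shell
# Point the Docker CLI at a daemon on a remote host (address is a placeholder).
export DOCKER_HOST=tcp://remote-host:2375
# Every docker command in this shell now targets remote-host, e.g.:
# docker info
```

The same effect can be achieved per-command with the '-H' flag instead of the environment variable.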
Full documentation for Docker can be found at https://docs.docker.com
To see information about your current docker installation use:
docker info
docker version
Hello World
To test your installation of Docker you can try:
docker run docker.io/library/hello-world
The hello-world image is stored on Docker Hub, and downloaded when you first try to run the image.
Docker Hub is a Docker Registry, which is used for storing Docker repositories. Each Docker repository can store multiple versions of a Docker image. The Docker images in each repository all have the same name and are distinguished by their associated tag, which is used to denote the version of the image. Official images can be pulled from Docker Hub without specifying the URL (docker.io) or namespace (library) for the repository. So the hello-world image could be pulled using only:
docker pull hello-world
To pull an unofficial image, you will need to supply a namespace, which is linked to the username of the account that owns the image repository. If the repository is not stored on the Docker Hub registry, then you will need to provide the registry URL also.
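For example, pulling an unofficial image and pulling from a third-party registry might look like this (the user name, registry URL and tags are hypothetical):

```shell
# Unofficial image: the namespace (account name) is required
docker pull someuser/hello-world

# Image on a third-party registry: prefix the name with the registry URL
docker pull registry.example.com/someuser/hello-world:1.0
```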
Docker images are built from instructions contained in a Dockerfile. Dockerfiles contain the blueprint for the application that you are building. The instructions in a Dockerfile are specified in uppercase by convention. A Dockerfile for creating an image for Catalyst applications might look like:
FROM perl:latest
RUN cpanm Task::Catalyst && rm -rf ~/.cpanm
The most common instructions are:
- FROM - Used to specify a base image for this image to be built on top of
- RUN - Used to execute a command inside the base image
- WORKDIR - Used to set the working directory for subsequent commands
- COPY - Copies a file or folder into the image. Relative path names are supported
- LABEL - Assigns metadata to the image using key="value" pairs
- CMD - Used to specify a default command to run when the image is started
- VOLUME - Used to expose a folder as a volume for use by other containers.
  'VOLUME ["/app/html"]' exposes '/app/html' as a volume that
  can be mounted at the same path in other containers by using:
  docker container run ... --volumes-from container_name
- ENTRYPOINT - Used to execute a script when a container starts. The ENTRYPOINT instruction does not add a layer to the image - the command is only executed on container start. The ENTRYPOINT script can be tailored for each container by supplying parameters to the script
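As a sketch, a Dockerfile combining most of these instructions might look like the following (the application, label value and paths are made up for illustration):

```dockerfile
FROM perl:latest
LABEL maintainer="you@example.com"
WORKDIR /usr/src/app
# Copy the application source relative to the build context
COPY app .
RUN cpanm --installdeps .
# Expose the static assets as a volume for other containers
VOLUME ["/usr/src/app/html"]
ENTRYPOINT ["perl", "app.pl"]
CMD ["--port", "3000"]
```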
To implement your own version of hello-world, you can create the following Dockerfile:
FROM alpine:latest
CMD echo 'Hello World'
To build the image run the 'docker image build' command. Use the -t flag to tag the image. The final parameter to the 'docker image build' command should be the path or URL of the build context - the directory containing the Dockerfile. If you are running the command from the directory containing the Dockerfile, then the path can be specified as '.'
Each command in the Dockerfile creates a layer in the image or adds metadata to the image. Use 'docker image history' to see how the instructions in a Dockerfile were implemented in the build. Output rows with a size of 0 are instructions that added metadata to the image.
When re-building an image, Docker reads the file from beginning to end, but only builds from the first changed layer.
To build the above 'hello world' Dockerfile execute the following command from the directory containing the Dockerfile:
docker image build -t myhello:1.0 .
The 'docker image' command contains functions for managing images. Use 'docker image --help' for a summary.
Usage:  docker image COMMAND

Manage images

Commands:
  build       Build an image from a Dockerfile
  history     Show the history of an image
  import      Import the contents from a tarball to create a filesystem image
  inspect     Display detailed information on one or more images
  load        Load an image from a tar archive or STDIN
  ls          List images
  prune       Remove unused images
  pull        Pull an image or a repository from a registry
  push        Push an image or a repository to a registry
  rm          Remove one or more images
  save        Save one or more images to a tar archive (streamed to STDOUT by default)
  tag         Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE

Run 'docker image COMMAND --help' for more information on a command.
To get detailed information about an image run 'docker image inspect imagename'
Use 'docker image ls' to see all current images. To remove dangling images use 'docker image prune'. Dangling images are images that remain after an image has been rebuilt using the same tag - the original image has its tags removed when the build creates a new image with the same tags. The 'docker image prune -a' command removes all images that are not currently in use by a running container.
The following steps can be used to login to Docker Hub, retag a local image and push this to your Docker Hub account:
docker login
docker image tag myhello:1.0 username/myhello:latest
docker image rm myhello:1.0
docker push username/myhello:latest
You can then retrieve this image on another computer using:
docker image pull username/myhello
Image builds use the Docker cache to determine if a layer already exists for an instruction. If the instruction results in a cache hit, then the layer is copied from the cache rather than built afresh. Once a cache miss is encountered, Docker builds that layer and all subsequent layers. COPY and ADD instructions include checksums for the files being copied, so changes in the source directory will also result in a cache miss at build time. To reduce build times and redundant layers, try to keep instructions that will result in a cache miss towards the end of the Dockerfile. Provide a '--no-cache=true' parameter to the build command if you want to skip using the cache altogether.
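For instance, placing the COPY of frequently edited source code after the dependency installation keeps the expensive layers cached. A sketch (the paths are illustrative):

```dockerfile
FROM perl:latest
# Rarely changes - cached on most rebuilds
RUN cpanm Task::Catalyst
# Changes often - placed last so only this layer and later ones rebuild
COPY app /usr/src/app
```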
The PATH specified in a 'docker image build' command is processed recursively. If you use a '.' for the path, then all the contents of the directory are passed to the Docker daemon during the build. Use a '.dockerignore' file to exclude directories or the contents of directories during the build:
# pass the directories but don't pass the current contents
lib/*
data/*
log/*
The ARG instruction can be used to specify variables that are supplied at build time. For instance, the following Dockerfile sets a $proxy_server variable which is subsequently used to set some environment variables in the image:
FROM perl:5.10
ARG proxy_server
ENV http_proxy=$proxy_server
ENV ftp_proxy=$proxy_server
ENV https_proxy=$proxy_server
RUN cpanm SOME::Exotic::Module
When the build is run, the value of 'proxy_server' has to be passed using the '--build-arg' option:
docker image build --build-arg proxy_server=http://my.proxy.com:3128 -t image_tag:1.0 .
docker image build --build-arg proxy_server=$http_proxy -t image_tag:1.0 .
The second version of the command passes the host environment variable to the list of build args
The '--build-arg' option can also be passed to docker-compose build or you can specify the values using the 'args' option in your docker-compose.yml:
version: "3.1"
services:
  app:
    build:
      context: './app'
      args:
        proxy_server: $http_proxy
The 'ENV' instruction sets variables for both:
- subsequent instructions in the Dockerfile and
- setting environment variables for containers run from the resultant image
If you just need to set a variable on a single command use 'RUN'. For example:
RUN CATALYST_DEBUG=1 catalyst.pl
The ENV instruction has two forms:
ENV fullname Roger Rabbit
ENV fullname="Roger Rabbit"
The second form requires double quotes to preserve whitespace in the value.
The 'COPY' instruction copies files from a source and adds them to the image at the path specified. 'COPY' supports an optional '--chown' parameter. Multiple sources can be provided in a single command. Where files or directories are specified in the source list, these are interpreted as relative to the build context. Destinations can be specified with an absolute path, or a relative path which is interpreted as relative to the current WORKDIR. All new files are created with a UID and GID of 0 unless the optional '--chown' parameter is specified. Where the source is a directory, the contents of the directory are copied, not the directory itself. If the destination does not exist it will be created in the image.
COPY app /app
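A sketch of the '--chown' parameter combined with multiple sources (the user, group and directory names are illustrative; with multiple sources the destination must end in a slash):

```dockerfile
# Copy the contents of both directories into /app/, owned by an unprivileged user
COPY --chown=appuser:appgroup app lib /app/
```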
The 'CMD' instruction is used to supply a default command to run when a container starts. You can override this command by supplying an alternate command at the end of a 'docker run image' command.
Every Dockerfile should contain at least one 'ENTRYPOINT' or 'CMD' instruction. ENTRYPOINTS can be specified in one of two formats:
- ENTRYPOINT ["executable", "param1", ..., "paramN"]
- ENTRYPOINT command param1, ..., paramN
The first form is referred to as the 'exec' form and is the preferred form. The second form is the 'shell' form, which passes the command to 'sh -c'. The downside of the shell form is that your ENTRYPOINT does not run as PID 1 and therefore will not receive signals from the Docker daemon. With the shell form of ENTRYPOINT, any CMD and 'docker run' command-line arguments are also ignored.
Both ENTRYPOINT and CMD can be specified in a Dockerfile. When using the 'exec' format for ENTRYPOINT and CMD, the command-line options to 'docker container run' will append to the ENTRYPOINT options and override the CMD options. Thus, ENTRYPOINT can be used to set the options that should be fixed for the container, and CMD for the options that you may wish to override at runtime:
ENTRYPOINT ["perl", "script/catalogue_server.pl", "-r"] CMD ["-p 3000"]
The above combination allows you to specify a different value for '-p' in the 'docker run' command:
docker container run catalogue:1.0 -p 3001
The 'USER' instruction sets the username and optionally the user group to use when running the image, and for any subsequent RUN, ENTRYPOINT and CMD instructions.
The 'ADD' instruction behaves the same as 'COPY' but includes two extra features:
- The source can be a URL
- If the source is a file in a recognised compression format, the file is extracted to the destination
To avoid surprises, unless you want to add a URL source or extract a file, prefer COPY over ADD
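For example (the URL and archive name are placeholders):

```dockerfile
# Fetch a file from a URL - remote files are not extracted
ADD https://example.com/data.txt /opt/data/
# Extract a local tar archive from the build context into the destination
ADD assets.tar.gz /opt/assets/
```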
Docker containers are running instances of Docker images.
The 'docker container' command contains functions for managing containers. Use 'docker container --help' for a summary.
Usage:  docker container COMMAND

Manage containers

Commands:
  attach      Attach local standard input, output, and error streams to a running container
  commit      Create a new image from a container's changes
  cp          Copy files/folders between a container and the local filesystem
  create      Create a new container
  diff        Inspect changes to files or directories on a container's filesystem
  exec        Run a command in a running container
  export      Export a container's filesystem as a tar archive
  inspect     Display detailed information on one or more containers
  kill        Kill one or more running containers
  logs        Fetch the logs of a container
  ls          List containers
  pause       Pause all processes within one or more containers
  port        List port mappings or a specific mapping for the container
  prune       Remove all stopped containers
  rename      Rename a container
  restart     Restart one or more containers
  rm          Remove one or more containers
  run         Run a command in a new container
  start       Start one or more stopped containers
  stats       Display a live stream of container(s) resource usage statistics
  stop        Stop one or more running containers
  top         Display the running processes of a container
  unpause     Unpause all processes within one or more containers
  update      Update configuration of one or more containers
  wait        Block until one or more containers stop, then print their exit codes

Run 'docker container COMMAND --help' for more information on a command.
To run a container of the local alpine image and launch a shell, use:
docker container run -it --rm alpine sh
This command runs a container using the alpine:latest image and connects your terminal to a shell running inside the container. Running 'ps' inside the container will confirm that 'sh' is the only running process and has a PID of 1. Typing 'exit' will terminate the shell and stop the container. Alternatively, typing CTRL-P CTRL-Q will disconnect your terminal from the container but leave the container running in the background. Both the stopped container and the background container still use resources on the Docker host, which means that data in the container persists between restarts. You can restart a stopped container using:
docker container start container_name_or_id
You can reconnect to a re-started or background container using a 'docker container exec -it container_name_or_id command'.
The container's data and state is lost only when the container is deleted. The '--rm' flag on the 'docker run' command, deletes the container when the container stops. To delete a disconnected container use:
docker container stop name_of_container docker container rm name_of_container
To delete a stopped container, just use the 'docker container rm' command above
The '-d' option allows you to run the container in the background. To see the output from the container use:
docker container logs container_name
Add the '-f' flag to logs to follow the log as it changes
The '-e' flag allows you to set environment variables that might be required by your application.
docker container run -it -e CATALYST_DEBUG=0 --rm catalogue:1.0 prove -l
The '-v' flag allows you to mount a volume from the host into the container. Volumes are essential when using Docker to develop code or persist data. By using volumes to mount your code into the container, you can change the code on the local file system and the change is picked up by the running container immediately. The volume mount replaces the contents of the mount point that were built into the image with the contents of the mounted host folder.
docker container run -it -p 3000:3000 -d --rm --name catalogue_1 \
    -v /home/user/Catalogue:/usr/src/Catalogue catalogue:1.0
Volumes allow you to persist data even after a container has been deleted.
You can use 'docker container inspect container_name' to inspect a running container to see the parameters used to start the container
You can connect to a running container using 'docker container exec':
docker container exec -it container_name bash
The above example assumes your container has 'bash' installed. The 'docker container exec' command allows you to specify commands to run on a container from the host, and is therefore not limited to connecting to the container. The commands are executed as the root user inside the container. You can pass the '--user' parameter to run the command as another user.
docker container exec container_name perl -V
docker container exec container_name prove -l
docker container exec --user "$(id -u):$(id -g)" \
    container_name mysqldump -u root -p catalogue_db > db_backup.sql
The last command is run as the user from the host system executing the command. Therefore the db_backup.sql file is created as that user. If the container is running with the working directory mounted to a local directory on the host system, then the file will appear on the host system with the current user and group as owner. Without the '--user' flag, the file will be created with the default docker user and group as owner - which is root.
The 'container run' command includes a '--restart' switch which allows you to specify a restart policy for the container. The options are:
- no (the default)
- on-failure
- always
- unless-stopped
The 'always' and 'unless-stopped' policies are almost the same, except that a stopped container will not be restarted when the Docker daemon restarts if the 'unless-stopped' policy was specified. The 'on-failure' policy restarts the container only if it exits with a non-zero exit code.
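For example, a long-running database container might be started with a restart policy like this (reusing the container and image names from the examples in these notes):

```shell
docker container run -d --restart unless-stopped \
    -e MYSQL_ROOT_PASSWORD=my-secret-pw \
    --name catalogue_db mysql:latest
```

With this policy the container survives failures and daemon restarts, but stays down if you stop it deliberately.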
The 'docker network' command can be used to control the network settings used by Docker. By default the Docker service provides three networks, and the bridge network is the default network used for containers. The bridge network corresponds to the docker0 interface listed in the output from ifconfig. When you add your own network, Docker automatically sets up DNS to allow containers to address each other by container name.
docker network ls docker network create --driver bridge mynetwork
The 'bridge' driver is used when configuring networks for containers running on the same host. To create a network to link containers running on different hosts you will need to use the 'overlay' driver.
You can use 'docker network inspect network_name' to inspect a specific network. The output from 'inspect' will show which containers, if any, are running on the specified network.
To specify a network for your containers, add the '--net' parameter to the container run command and specify a name for your container using the '--name' parameter
docker container run -itd -p 3000:3000 --rm --name catalogue \
    -v /home/user/Catalogue:/usr/src/Catalogue --net mynetwork catalogue:1.0
With multiple containers running on the same custom Docker network, you can execute ping to test that one container can address another by name:
docker container exec -itd catalogue ping my_database
Data Persistence
Contrary to popular belief, container data can persist between restarts. If you run a container, add some data, then stop the container, you can restart the same container and see the data. The data is not stored in the image, so if, instead of reconnecting to the stopped container, you start a new container, you will not see the data in the new container.
# Start a new container and add some data
docker container run -it --name env_test ubuntu:18.04 bash
root@d65ecf013b89:/# echo $PATH > env_test
root@d65ecf013b89:/# cat env_test
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
root@d65ecf013b89:/# exit

# Restart the stopped container
docker container start env_test

# Reconnect to the container and check the file created earlier
docker container exec -it env_test bash
root@d65ecf013b89:/# cat env_test
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
root@d65ecf013b89:/#
However, the data will be lost if you remove the container. There are several ways this could happen:
- Running the container with the '--rm' flag set
- Running the 'docker container rm container_name' command
- Running the 'docker system prune' command
To avoid accidental loss of data, use volumes to preserve important data
The '-v' parameter for 'container run' allows you to specify local folders to mount into a container. The volume is mounted read/write by default. We have used this option to mount our application source code into the container from the host file system. This allows us to develop the application without having to constantly rebuild the image.
Docker also provides a 'volume' management command that allows you to configure named volumes that will be managed by docker. Named volumes are used to save the state of a container between restarts. This is particularly useful for configuring containers running databases.
docker volume create catalogue_mysql
docker volume ls
docker volume inspect catalogue_mysql
By default, volumes are stored in '/var/lib/docker/volumes/'. You can then map a volume to the data directory of the application that you are running in a container. MySQL uses /var/lib/mysql for its data directory, so to use the catalogue_mysql volume to persist data for a MySQL container, execute:
docker container run --rm --name catalogue_db \ -e MYSQL_ROOT_PASSWORD=my-secret-pw \ -v catalogue_mysql:/var/lib/mysql mysql:latest
If you wish to restore a database dump to the container before adding new data, you must also mount your local database dumps directory:
docker container run --rm --name catalogue_db \ -e MYSQL_ROOT_PASSWORD=my-secret-pw \ -e MYSQL_DATABASE=catalogue_test \ -v $PWD/sqldumps:/docker-entrypoint-initdb.d \ -v catalogue_mysql:/var/lib/mysql mysql:5.7
Note that we mount the host directory '$PWD/sqldumps' to the container directory '/docker-entrypoint-initdb.d'. Any '.sql' files found in this directory will be imported to the database specified in the environment variable MYSQL_DATABASE and can therefore be used to restore a database dump to the container.
System Maintenance
The 'docker system' command can be used to get information about your current docker installation:
Usage:  docker system COMMAND

Manage Docker

Commands:
  df          Show docker disk usage
  events      Get real time events from the server
  info        Display system-wide information
  prune       Remove unused data

Run 'docker system COMMAND --help' for more information on a command.
Use 'docker system df' to get information on disk usage. The output will also show how much disk space can be reclaimed. Add the '-v' flag for more verbose output. 'docker system info' gives information about the current installation environment. 'docker system prune' can be used to remove dangling images, stopped containers, unused networks and dangling build caches. You can add the '-f' flag if you want to run this command as a cron job.
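A crontab entry for a nightly unattended prune might look like this (the schedule is chosen arbitrarily for illustration):

```shell
# Run at 03:00 every day; -f skips the confirmation prompt
0 3 * * * docker system prune -f
```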
Use 'docker image ls' to see all images, including dangling ones. Dangling images are images that have been replaced by a subsequent build and have their image name and tag set to <none>
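The dangling images can also be listed on their own using a filter:

```shell
# Show only images whose name and tag are <none>
docker image ls --filter dangling=true
```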
The command 'docker container ls -a -q' returns the IDs of all containers. You can use this command to stop all containers:
docker container stop $(docker container ls -a -q)
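The same substitution pattern can be used to remove all containers once they are stopped:

```shell
# Stop everything, then remove the stopped containers
docker container stop $(docker container ls -a -q)
docker container rm $(docker container ls -a -q)
```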