Docker Storage

How does Docker store data on the local file system?

When you install Docker on a system, it creates a folder structure at /var/lib/docker, the storage location of Docker images and containers.

This is where Docker stores all its data by default.

Ubuntu, Fedora, Debian: /var/lib/docker

/var/lib/docker/aufs
- Layer data for the aufs storage driver. The folder is named after the storage driver in use (for example, overlay2).
/var/lib/docker/containers
- All files related to containers.
/var/lib/docker/image
- All files related to images.
/var/lib/docker/volumes
- Any volumes created by Docker containers are created under the volumes folder.

Windows: C:\ProgramData\DockerDesktop
MacOS: ~/Library/Containers/com.docker.docker/Data/vms/0

How does Docker store the files of an image and a container?

To answer that, we need to understand Docker's layered architecture.

# Specify a base image
FROM node:alpine
WORKDIR /backend

# Install some dependencies
COPY ./package.json ./
RUN npm install
COPY ./ ./
EXPOSE 4000

# Default Command
CMD ["npm", "run", "start-local"]

$ docker build -t app .
Sending build context to Docker daemon  114.2kB
Step 1/7 : FROM node:alpine
 ---> b85fc218c00b
Step 2/7 : WORKDIR /backend
 ---> Running in fa56e7dec07c
Removing intermediate container fa56e7dec07c
 ---> f9cfee4944d9
Step 3/7 : COPY ./package.json ./
 ---> 39803fa449c2
Step 4/7 : RUN npm install
 ---> Running in ac07426b77eb
npm WARN deprecated debug@4.1.1: Debug versions >=3.2.0 <3.2.7 || >=4 <4.3.1 have a low-severity ReDos regression when used in a Node.js environment. It is recommended you upgrade to 3.2.7 or 4.3.1. (<https://github.com/visionmedia/debug/issues/797>)
npm WARN deprecated fsevents@2.1.3: Please update to v 2.2.x

> nodemon@2.0.6 postinstall /backend/node_modules/nodemon
> node bin/postinstall || exit 0

Love nodemon? You can now support the project via the open collective:
 > <https://opencollective.com/nodemon/donate>

npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@~2.1.2 (node_modules/chokidar/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@2.1.3: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
npm WARN stock-sever@1.0.0 No repository field.

added 250 packages from 203 contributors and audited 251 packages in 8.139s

15 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

Removing intermediate container ac07426b77eb
 ---> de005edffd01
Step 5/7 : COPY ./ ./
 ---> 27f6067db542
Step 6/7 : EXPOSE 4000
 ---> Running in d409fd472fdc
Removing intermediate container d409fd472fdc
 ---> 0d8aa268e33b
Step 7/7 : CMD ["npm", "run", "start-local"]
 ---> Running in 15375e1e58d7
Removing intermediate container 15375e1e58d7
 ---> c04e3978528d
Successfully built c04e3978528d
Successfully tagged app:latest

Layer 1. Base image (node:alpine)
Layer 2. Create the working directory
Layer 3. Copy package.json
Layer 4. Install packages
Layer 5. Copy the source code
Layer 6. Expose port 4000
Layer 7. Set the default command that runs the server

What if you change the source code?

# Specify a base image
FROM node:alpine
WORKDIR /backend

# Install some dependencies
COPY ./package.json ./
RUN npm install
COPY ./app2 ./
EXPOSE 4000

# Default Command
CMD ["npm", "run", "start-local"]

/app2

Layer 1. reuses the same layer
Layer 2. reuses the same layer
Layer 3. reuses the same layer
Layer 4. reuses the same layer
Layer 5. copies the new source code (rebuilt)
Layer 6. rebuilt, because it sits on top of the changed layer 5
Layer 7. rebuilt, because it sits on top of the changed layer 5

$ docker build -t app2 .
Sending build context to Docker daemon  227.8kB
Step 1/7 : FROM node:alpine
 ---> b85fc218c00b
Step 2/7 : WORKDIR /backend
 ---> Using cache
 ---> f9cfee4944d9
Step 3/7 : COPY ./package.json ./
 ---> Using cache
 ---> 39803fa449c2
Step 4/7 : RUN npm install
 ---> Using cache
 ---> de005edffd01
Step 5/7 : COPY ./app2 ./
 ---> 30f81aea850a
Step 6/7 : EXPOSE 4000
 ---> Running in 210460154e91
Removing intermediate container 210460154e91
 ---> b1d98800e9c5
Step 7/7 : CMD ["npm", "run", "start-local"]
 ---> Running in 8f8724ee070d
Removing intermediate container 8f8724ee070d
 ---> a3eeafdda83c
Successfully built a3eeafdda83c
Successfully tagged app2:latest

Docker reuses the layers for the unchanged steps from the cache and only rebuilds the step that changed and the steps after it. All of these layers are created when we run the docker build command to form the final Docker image, so they are the Docker image layers. Once the build is complete, you cannot modify the contents of these layers, so they are read-only.
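The cache behaviour in the two builds above can be sketched in a few lines of Python. This is a simplified model, not Docker's actual implementation: it assumes that a layer's identity depends only on its parent layer, the instruction text, and a stand-in string for the checksum of any copied files. The names layer_ids and cache_hits and the "source v1" style strings are made up for the illustration.

```python
import hashlib


def layer_ids(steps):
    """Return one ID per build step; each ID depends on the parent ID and the step.

    `steps` is a list of (instruction, content) pairs, where `content` stands in
    for the checksum of the files the instruction copies (empty otherwise).
    """
    ids = []
    parent = ""
    for instruction, content in steps:
        key = f"{parent}|{instruction}|{content}".encode()
        digest = hashlib.sha256(key).hexdigest()[:12]
        ids.append(digest)
        parent = digest  # the next layer is built on top of this one
    return ids


def cache_hits(old_steps, new_steps):
    """Report, per step of the new build, whether the old layer could be reused."""
    old_ids = layer_ids(old_steps)
    new_ids = layer_ids(new_steps)
    return [i < len(old_ids) and old_ids[i] == new_id
            for i, new_id in enumerate(new_ids)]


first_build = [
    ("FROM node:alpine", ""),
    ("WORKDIR /backend", ""),
    ("COPY ./package.json ./", "package.json v1"),
    ("RUN npm install", ""),
    ("COPY ./ ./", "source v1"),
    ("EXPOSE 4000", ""),
    ('CMD ["npm", "run", "start-local"]', ""),
]

# Only step 5 changes, as in the second Dockerfile.
second_build = list(first_build)
second_build[4] = ("COPY ./app2 ./", "source v2")

print(cache_hits(first_build, second_build))
# [True, True, True, True, False, False, False]
```

Steps 1 to 4 are cache hits, while step 5 and everything stacked on top of it get new IDs, which matches the "Using cache" lines in the build output.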

READ ONLY

Layer 1. Base image (node:alpine)
Layer 2. Create the working directory
Layer 3. Copy package.json
Layer 4. Install packages
Layer 5. Copy the source code
Layer 6. Expose port 4000
Layer 7. Set the default command that runs the server

When you run the docker run command, Docker creates a container based on these layers and adds a new writable layer on top of the image layers.

$ docker image ls
REPOSITORY                           TAG                                              IMAGE ID            CREATED             SIZE
app                                  latest                                           c04e3978528d        25 minutes ago      142MB
$ docker run app
> stock-sever@1.0.0 start-local /backend
> NODE_ENV=local node index.js

Lunch app is listening on port !3001

The writable layer is used to store data created by the container, such as log files written by the application.

Read Write

Layer 8. Container layer

The life of this layer is only as long as the container is alive. When the container is destroyed, the layer and all of the changes stored in it are destroyed too. Any file you create in the container layer can be read and written. The same image layers are shared by all containers created from the image.

What if I wish to modify the source code to, say, test a change?

The same image layers may be shared between multiple containers created from the image. Does that mean I cannot modify the source code inside a container? No.

I can still modify the file, but before the modified file is saved, Docker automatically creates a copy of it in the read-write layer (the container layer). All future modifications are made to that copy in the read-write layer. This is called the copy-on-write mechanism.
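To make copy-on-write concrete, here is a toy model in Python. It is only a sketch: layers are plain dictionaries mapping a path to file contents, and the Container class with its read and append methods is invented for this example. Real storage drivers work on actual file systems, but the idea is the same.

```python
class Container:
    """Toy model: shared read-only image layers plus a private writable layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared by all containers, never modified
        self.container_layer = {}         # private read-write layer

    def read(self, path):
        # The writable layer wins; otherwise search the image layers top-down.
        if path in self.container_layer:
            return self.container_layer[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, text):
        # Copy-on-write: the first modification copies the file up from the image.
        if path not in self.container_layer:
            try:
                self.container_layer[path] = self.read(path)
            except FileNotFoundError:
                self.container_layer[path] = ""  # brand-new file
        self.container_layer[path] += text


# Two image layers, loosely modelled on the example image.
image = (
    {"/backend/package.json": "{}"},
    {"/backend/index.js": "console.log('v1')\n"},
)

a = Container(image)
b = Container(image)
a.append("/backend/index.js", "// test change\n")

print(a.read("/backend/index.js"))  # modified copy from a's container layer
print(b.read("/backend/index.js"))  # untouched file from the shared image layer
```

Container a sees its own modified copy, container b still sees the original, and the image layers themselves never change. Dropping a.container_layer models what happens when the container is removed.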

When the container is removed, all of the data stored in the container layer is deleted as well. So what if we wish to persist the data?

If we were working with a database and wanted to preserve the data created by the container, we could add a persistent volume to the container.

$ docker volume create data_volume

/var/lib/docker/volumes/data_volume

When we run the docker volume create data_volume command, Docker creates a folder called data_volume under /var/lib/docker/volumes. Then, when you start a container with the docker run command, you can mount the volume inside the container's read-write layer using the -v option.

$ docker run -v data_volume:/var/lib/mysql mysql

/var/lib/mysql is the location where MySQL stores its data by default inside the container, so the database files are written to the persistent volume.

What if you did not run the docker volume create command to create the volume before the docker run command?

Docker automatically creates the volume for us and mounts it into the container. This is called volume mounting.

What if we do not want to store the data in the default volumes folder?

$ docker run -v /data/data_volume:/var/lib/mysql mysql

In that case, pass the complete path to the folder on the host. This is called bind mounting.

Two types of mounts

volume mounting: mounts a volume from the volumes folder
bind mounting: mounts a folder from any location on the Docker host
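A rough way to tell the two apart from the value passed to -v: a source that is a complete path means a bind mount, and a plain name means a volume. The Python sketch below follows that rule under simplifying assumptions (Linux-style paths only, no anonymous volumes, extra options such as :ro ignored). The function name classify_volume_flag is made up for this example.

```python
def classify_volume_flag(spec):
    """Split a `-v source:target` value and guess the mount type from the source."""
    parts = spec.split(":")
    if len(parts) < 2:
        raise ValueError(f"expected source:target, got {spec!r}")
    source, target = parts[0], parts[1]
    # Simplified rule: an absolute host path means bind mount, a name means volume.
    kind = "bind" if source.startswith("/") else "volume"
    return kind, source, target


print(classify_volume_flag("data_volume:/var/lib/mysql"))
print(classify_volume_flag("/data/data_volume:/var/lib/mysql"))
```

The first call is classified as a volume mount and the second as a bind mount, matching the two docker run examples above.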

The newer way to do volume and bind mounting: the --mount option

docker run \
  --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
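The value given to --mount is a comma-separated list of key=value pairs, which makes it more verbose than -v but easier to read. As a small illustration, the Python sketch below splits such a value into a dictionary. The helper parse_mount is hypothetical and does not handle quoting or every option Docker accepts.

```python
def parse_mount(spec):
    """Parse a --mount value such as 'type=bind,source=/a,target=/b' into a dict."""
    options = {}
    for field in spec.split(","):
        key, sep, value = field.partition("=")
        # Fields without '=' (for example 'readonly') are stored as flags.
        options[key.strip()] = value.strip() if sep else True
    return options


mount = parse_mount("type=bind,source=/data/mysql,target=/var/lib/mysql")
print(mount["type"], mount["source"], mount["target"])
```

Each key names exactly one part of the mount, so there is no need to remember the positional order that -v relies on.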

What is responsible for doing all of these operations?

Docker uses storage drivers to enable the layered architecture.

AUFS (the default storage driver on older Ubuntu installations)
ZFS
BTRFS
Device Mapper
Overlay
Overlay2

Docker will choose the best storage driver available automatically based on the operating system.
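To see which storage driver Docker picked on a given machine, you can ask docker info for its Driver field. The Python sketch below wraps that call. It assumes the docker CLI is on the PATH and the daemon is reachable, and returns None otherwise. The function name current_storage_driver is made up for this example.

```python
import subprocess


def current_storage_driver():
    """Return the storage driver reported by `docker info`, or None if unavailable."""
    try:
        result = subprocess.run(
            ["docker", "info", "--format", "{{.Driver}}"],
            capture_output=True,
            text=True,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The docker CLI is not installed, not executable, or did not answer in time.
        return None
    driver = result.stdout.strip()
    if result.returncode != 0 or not driver:
        # The daemon is probably not running or not reachable.
        return None
    return driver


print(current_storage_driver())
```

On a machine with a running Docker daemon this prints the driver name, which also tells you which folder under /var/lib/docker holds the layer data.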

https://timonweb.com/docker/getting-path-and-accessing-persistent-volumes-in-docker-for-mac/