docker-persist-share-data

Persisting Data

Overview

Sometimes you want to share data between containers, or keep data even after a container is deleted. There are two ways to do this:

  • bind mount
  • volume

Volumes are the preferred mechanism for persisting data generated and used by Docker containers. While bind mounts depend on the directory structure of the host machine, volumes are completely managed by Docker (a volume is still a directory on the host, but Docker itself creates and manages it).

Volumes have several advantages over bind mounts:

  • Volumes are easier to back up or migrate than bind mounts.
  • You can manage volumes using Docker CLI commands or the Docker API.
  • Volumes work on both Linux and Windows containers.
  • Volumes can be more safely shared among multiple containers.
  • Volume drivers let you store volumes on remote hosts or cloud providers, to encrypt the contents of volumes, or to add other functionality.
  • New volumes can have their content pre-populated by a container.

volume and bind mount

Ignore tmpfs (in memory) here: it is for non-persistent data, and its contents disappear when the container stops or restarts.

Note: with the default local driver, volume directories live under /var/lib/docker/volumes/ and are controlled through the Docker CLI. A volume is therefore independent of any container: even after you delete the container, the volume remains until you delete it explicitly with the CLI.

docker option for persisting data

There are two options for this: -v (--volume) and --mount. New users should use --mount.

  • --mount: Consists of multiple key-value pairs, separated by commas, each a <key>=<value> tuple. The --mount syntax is more verbose than -v or --volume, but the order of the keys is not significant, and the value of the flag is easier to understand.

    • The type of the mount, which can be bind, volume, or tmpfs. If omitted, the type defaults to volume.
    • The source of the mount. For named volumes, this is the name of the volume. For anonymous volumes, this field is omitted. May be specified as source or src.
    • The destination takes as its value the path where the file or directory is mounted in the container. May be specified as destination, dst, or target.
    • The readonly option, if present, causes the mount to be mounted into the container as read-only.
    • The volume-opt option, which can be specified more than once, takes a key-value pair consisting of the option name and its value.
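
As a small illustration of the key-value syntax above, the sketch below joins the keys into a --mount value and prints the resulting commands without running them. mount_spec is a hypothetical helper written for this example, not part of the Docker CLI, and my-vol and devtest are placeholder names.

```shell
# mount_spec KEY=VALUE...: join the arguments with commas (hypothetical helper).
# The subshell body keeps the IFS change local to the function.
mount_spec() (
  IFS=,
  printf '%s\n' "$*"
)

# Key order is not significant, so both specs describe the same read-only mount.
spec_a=$(mount_spec type=volume source=my-vol target=/app readonly)
spec_b=$(mount_spec readonly target=/app source=my-vol type=volume)

# Print the commands instead of running them.
echo "docker run -d --name devtest --mount $spec_a nginx:latest"
echo "docker run -d --name devtest --mount $spec_b nginx:latest"
```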

--mount needs Docker version 17.06 or later; check yours with $ docker version.
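
The version check can also be scripted. This is a rough sketch that assumes GNU sort (for the -V version sort) and assumes that checking the client version is enough; version_ge is a hypothetical helper name.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query Docker when the CLI is installed.
if command -v docker >/dev/null 2>&1; then
  client=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
  if version_ge "${client:-0}" 17.06; then
    echo "--mount is available (client $client)"
  else
    echo "client ${client:-unknown} is older than 17.06; use -v instead"
  fi
fi
```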

bind mount

# For type=bind, must specify the source(from host)
$ docker run -d \
-it \
--name devtest \
--mount type=bind,source="$(pwd)"/target,target=/app \
nginx:latest

volume

$ docker volume create my-vol
$ docker volume ls
$ docker volume inspect my-vol
[
    {
        "CreatedAt": "2023-09-26T11:25:18+08:00",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/my-vol/_data",
        "Name": "my-vol",
        "Options": {},
        "Scope": "local"
    }
]

# Use the existing volume (my-vol, created above) so that two containers can share data by volume name
# Container 1
$ docker run -d \
--name devtest1 \
--mount src=my-vol,dst=/app \
nginx:latest

# Container 2
$ docker run -d \
--name devtest2 \
--mount source=my-vol,target=/app \
nginx:latest

# OR

# Use a volume that does not exist yet (no source given).
# In this case Docker creates an anonymous volume automatically,
# but the volume is not deleted automatically when the container is deleted.

# Container 1
$ docker run -d \
--name devtest1 \
--mount target=/app \
nginx:latest

# Container 2 uses the same volumes as Container 1
$ docker run -d \
--name devtest2 \
--volumes-from devtest1 \
nginx:latest

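# ----------------------------
# The "Mounts" section of the output shows which volume the container uses
$ docker inspect devtest1

# Stop and remove every container that uses the volume
$ docker stop devtest1 devtest2
$ docker rm devtest1 devtest2

# Volumes are not removed with the container; remove them explicitly
$ docker volume rm my-vol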

A volume has to be deleted manually, even when Docker created it automatically.
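
To find volumes that were left behind, something like the sketch below may help. dangling_volumes is a hypothetical wrapper name; the docker subcommands and flags it mentions are the standard CLI ones, though the behaviour of prune differs between releases.

```shell
# dangling_volumes: print the names of volumes no container references.
# Hypothetical wrapper; prints nothing when the docker CLI is not installed.
dangling_volumes() {
  command -v docker >/dev/null 2>&1 || return 0
  docker volume ls --quiet --filter dangling=true
}

dangling_volumes

# Related clean-up commands, shown as comments because they delete data:
#   docker rm -v devtest1    # remove a container together with its anonymous volumes
#   docker volume prune      # remove unused volumes after a confirmation prompt
#                            # (recent releases prune only anonymous volumes unless --all is given)
```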

mount block device into container

# Create a 100 MB raw disk image (the commands below need root)
$ dd if=/dev/zero of=/tmp/loop.raw bs=1M count=100
# Attach the image as a block device (/dev/loop0)
$ losetup /dev/loop0 /tmp/loop.raw

# Create an LVM logical volume on it and format it with XFS
$ pvcreate /dev/loop0
$ vgcreate vg1 /dev/loop0
$ lvcreate --size 90M --name lv1 vg1
$ mkfs.xfs /dev/vg1/lv1

# Mount the logical volume at /opt through the local volume driver
$ docker run --rm -it --mount='type=volume,dst=/opt,volume-driver=local,volume-opt=type=xfs,volume-opt=device=/dev/vg1/lv1' centos bash

tmpfs

tmpfs mounts are available only on Linux. When you create a container with a tmpfs mount, the container can create files outside the container’s writable layer; they are kept in host memory.

This is useful for temporarily storing sensitive files that you don’t want to persist in either the host or the container writable layer.

Unlike volumes and bind mounts, you can’t share tmpfs mounts between containers.

$ docker run -d \
-it \
--name tmptest \
--mount type=tmpfs,destination=/app \
nginx:latest
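
A tmpfs mount also accepts the tmpfs-size (in bytes) and tmpfs-mode options. The sketch below only builds and prints such a command; tmpfs_mount is a hypothetical helper, and the 64 MiB size and 1770 mode are arbitrary example values.

```shell
# tmpfs_mount DEST SIZE_BYTES: print a --mount value for a size-capped tmpfs.
# Hypothetical helper; the mode 1770 is just an example value.
tmpfs_mount() {
  printf 'type=tmpfs,destination=%s,tmpfs-size=%s,tmpfs-mode=1770\n' "$1" "$2"
}

size=$((64 * 1024 * 1024))   # 64 MiB expressed in bytes
spec=$(tmpfs_mount /app "$size")

# Print the command instead of running it.
echo "docker run -d --name tmptest --mount $spec nginx:latest"
```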

Volume size (mount) and Container size (RW Layer)

By default there is no size limit on a volume created with the local driver: its capacity is determined by the host disk (block device) it resides on. The same holds for tmpfs, whose size is determined by the host tmpfs (by default, half of total memory).

Volumes, bind mounts, tmpfs mounts, and block devices all look the same from inside the container: you see a directory at the dst path, and how much data it can hold depends on the size of the backing storage on the host. If you run fdisk or lsblk inside the container, you see the host’s block devices and their sizes, not the space available to the container.

The storage available to a container has two parts:

  • Volume size: determined by the host block device backing the volume. Find the volume source with docker inspect, then check which block device holds that path.
  • Writable layer: determined by the host block device where the writable layer lives (by default under /var/lib/docker; check which block device holds /var).
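
The two checks above could be scripted roughly as follows. container_mounts is a hypothetical helper and devtest1 is the example container name used earlier; the sketch assumes that container still exists.

```shell
# container_mounts NAME: print type, host source and container destination of each mount.
# Hypothetical helper built on `docker inspect` with a Go template.
container_mounts() {
  docker inspect --format \
    '{{range .Mounts}}{{.Type}} {{.Source}} -> {{.Destination}}{{println}}{{end}}' "$1"
}

# Only query Docker when the CLI is installed.
if command -v docker >/dev/null 2>&1; then
  container_mounts devtest1

  # Free space on the filesystem holding Docker's data root
  # (writable layers and local volumes live there by default).
  root=$(docker info --format '{{.DockerRootDir}}' 2>/dev/null) && df -h "$root"
fi
```

Running df -h on each Source path printed by container_mounts shows which host filesystem backs that mount.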
