
How to deal with persistent storage (e.g. databases) in Docker

lottogame 2020. 9. 28. 07:57



How do people deal with persistent storage for Docker containers?

I am currently using this approach: I build the image for, e.g., PostgreSQL, and then start the container with

docker run --volumes-from c0dbc34fd631 -d app_name/postgres

IMHO, this has the drawback that I must never (accidentally) delete the container "c0dbc34fd631".

Another idea would be to mount host volumes "-v" into the container; however, the user ID within the container does not necessarily match the user ID on the host, and then permissions might be messed up.
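
One rough sketch of a workaround for the permission issue (the host path is a placeholder, and whether an arbitrary --user works depends on the image): bind mount a host directory and run the container process with the host user's uid/gid:

# Bind mount a host directory and run as the host user, so files written to
# the volume keep a uid/gid that stays readable on the host
docker run -d \
  --user "$(id -u):$(id -g)" \
  -v /srv/pgdata:/var/lib/postgresql/data \
  app_name/postgres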

Note: Instead of --volumes-from 'cryptic_id' you can also use --volumes-from my-data-container, where my-data-container is a name you assigned to a data-only container, e.g. docker run --name my-data-container ... (see the accepted answer).


Docker 1.9.0 and above

Use the volume API:

docker volume create --name hello
docker run -d -v hello:/container/path/for/volume container_image my_command

This means that the pure-data container pattern has to be abandoned in favour of the new volumes.

Actually, the volume API is only a better way to achieve what was the data-container pattern.

If you create a container with -v volume_name:/container/fs/path, Docker automatically creates a named volume for you that can:

  1. Be listed through docker volume ls
  2. Be identified through docker volume inspect volume_name
  3. Be backed up like a normal directory
  4. Be backed up as before through a --volumes-from connection

The new volume API adds a useful command that lets you identify dangling volumes:

docker volume ls -f dangling=true

And then remove them by name:

docker volume rm <volume name>

As @mpugach underlines in the comments, you can get rid of all the dangling volumes with a nice one-liner:

docker volume rm $(docker volume ls -f dangling=true -q)
# Or using 1.13.x
docker volume prune

Docker 1.8.x and below

The approach that seems to work best for production is to use a data-only container.

The data-only container is run on a barebones image and really does nothing except expose a data volume.
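
For illustration, such a data-only container might be created roughly like this (the image, name, and volume path are just placeholders):

# Create a data-only container whose sole purpose is to hold the /data volume;
# "true" exits immediately, but the volume stays available to other containers
docker run -v /data --name data-container busybox true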

Then you can run any other container to have access to the data container's volumes:

docker run --volumes-from data-container some-other-container command-to-execute
  • Here you can get a good picture of how to arrange the different containers.
  • Here there is a good insight into how volumes work.

In this blog post there is also a good description of the so-called container-as-volume pattern, which clarifies the main point of having data-only containers.

The Docker documentation now has the DEFINITIVE description of the container-as-volume/s pattern.

The following is the backup/restore procedure for Docker 1.8.x and below.

Backup:

sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox tar cvf /backup/backup.tar /data
  • --rm: remove the container when it exits
  • --volumes-from DATA: attach to the volumes shared by the DATA container
  • -v $(pwd):/backup: bind mount the current directory into the container, to write the tar file to
  • busybox: a small and simple image - good for quick maintenance
  • tar cvf /backup/backup.tar /data: creates an uncompressed tar file of all the files in the /data directory

Restore:

# Create a new data container
$ sudo docker run -v /data --name DATA2 busybox true
# untar the backup files into the new container's data volume
$ sudo docker run --rm --volumes-from DATA2 -v $(pwd):/backup busybox tar xvf /backup/backup.tar
data/
data/sven.txt
# Compare to the original container
$ sudo docker run --rm --volumes-from DATA -v `pwd`:/backup busybox ls /data
sven.txt

Here is a nice article from the excellent Brian Goff explaining why it is good to use the same image for a container and a data container.


As of Docker release v1.0, a bind mount of a file or directory on the host machine can be done with the given command:

$ docker run -v /host:/container ...

The above volume could be used as persistent storage on the host running Docker.


As of Docker Compose 1.6, there is now improved support for data volumes in Docker Compose. The following compose file will create a data volume which will persist between restarts (or even removal) of the parent containers:

Here is the blog announcement: Compose 1.6: New Compose file for defining networks and volumes

Here's an example compose file:

version: "2"

services:
  db:
    restart: on-failure:10
    image: postgres:9.4
    volumes:
      - "db-data:/var/lib/postgresql/data"
  web:
    restart: on-failure:10
    build: .
    command: gunicorn mypythonapp.wsgi:application -b :8000 --reload
    volumes:
      - .:/code
    ports:
      - "8000:8000"
    links:
      - db

volumes:
  db-data:

As far as I can understand: This will create a data volume (db-data) which will persist between restarts.

If you run: docker volume ls you should see your volume listed:

local               mypthonapp_db-data
...

You can get some more details about the data volume:

docker volume inspect mypthonapp_db-data
[
  {
    "Name": "mypthonapp_db-data",
    "Driver": "local",
    "Mountpoint": "/mnt/sda1/var/lib/docker/volumes/mypthonapp_db-data/_data"
  }
]

Some testing:

# Start the containers
docker-compose up -d

# .. input some data into the database
docker-compose run --rm web python manage.py migrate
docker-compose run --rm web python manage.py createsuperuser
...

# Stop and remove the containers:
docker-compose stop
docker-compose rm -f

# Start it back up again
docker-compose up -d

# Verify the data is still there
...
(it is)

# Stop and remove with the -v (volumes) tag:

docker-compose stop
docker-compose rm -f -v

# Up again ..
docker-compose up -d

# Check the data is still there:
...
(it is).

Notes:

  • You can also specify various drivers in the volumes block. For example, you could specify the Flocker driver for db-data:

    volumes:
      db-data:
        driver: flocker
    
  • As they improve the integration between Docker Swarm and Docker Compose (and possibly start integrating Flocker into the Docker ecosystem; I heard a rumor that Docker has bought Flocker), I think this approach should become increasingly powerful.

Disclaimer: This approach is promising, and I'm using it successfully in a development environment. I would be apprehensive to use this in production just yet!


In case it is not clear from update 5 of the selected answer, as of Docker 1.9, you can create volumes that can exist without being associated with a specific container, thus making the "data-only container" pattern obsolete.

See Data-only containers obsolete with docker 1.9.0? #17798.

I think the Docker maintainers realized the data-only container pattern was a bit of a design smell and decided to make volumes a separate entity that can exist without an associated container.


While this is still a part of Docker that needs some work, you should put the volume in the Dockerfile with the VOLUME instruction so you don't need to copy the volumes from another container.

That will make your containers less inter-dependent and you don't have to worry about the deletion of one container affecting another.
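
For instance, a minimal sketch of such a Dockerfile (the image and path are just examples):

FROM busybox
# Declare /data as a volume: anything written there lands in a Docker-managed
# volume rather than in the container's writable layer
VOLUME /data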


When using Docker Compose, simply attach a named volume, for example,

version: '2'
services:
  db:
    image: mysql:5.6
    volumes:
      - db_data:/var/lib/mysql:rw
    environment:
      MYSQL_ROOT_PASSWORD: root
volumes:
  db_data:

@tommasop's answer is good, and explains some of the mechanics of using data-only containers. But as someone who initially thought that data containers were silly when one could just bind mount a volume to the host (as suggested by several other answers), but now realizes that in fact data-only containers are pretty neat, I can suggest my own blog post on this topic: Why Docker Data Containers (Volumes!) are Good

See also: my answer to the question "What is the (best) way to manage permissions for Docker shared volumes?" for an example of how to use data containers to avoid problems like permissions and uid/gid mapping with the host.

To address one of the OP's original concerns: that the data container must not be deleted. Even if the data container is deleted, the data itself will not be lost as long as any container has a reference to that volume i.e. any container that mounted the volume via --volumes-from. So unless all the related containers are stopped and deleted (one could consider this the equivalent of an accidental rm -fr /) the data is safe. You can always recreate the data container by doing --volumes-from any container that has a reference to that volume.
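
For example, if the data container has been deleted but an application container still references its volumes, a replacement data container could be recreated roughly like this (container names are placeholders):

# app-container still holds a reference to the volume, so a new data-only
# container can inherit it via --volumes-from
docker run --volumes-from app-container --name new-data-container busybox true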

As always, make backups though!

UPDATE: Docker now has volumes that can be managed independently of containers, which further makes this easier to manage.


There are several levels of managing persistent data, depending on your needs:

  • Store it on your host
    • Use the flag -v host-path:container-path to persist container directory data to a host directory.
    • Backups/restores happen by running a backup/restore container (such as tutumcloud/dockup) mounted to the same directory.
  • Create a data container and mount its volumes to your application container
    • Create a container that exports a data volume, use --volumes-from to mount that data into your application container.
    • Backup/restore the same as the above solution.
  • Use a Docker volume plugin that backs an external/third-party service
    • Docker volume plugins allow your datasource to come from anywhere - NFS, AWS (S3, EFS, and EBS); see the sketch after this list for an NFS-backed volume.
    • Depending on the plugin/service, you can attach single or multiple containers to a single volume.
    • Depending on the service, backups/restores may be automated for you.
    • While this can be cumbersome to do manually, some orchestration solutions - such as Rancher - have it baked in and simple to use.
    • Convoy is the easiest solution for doing this manually.
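
As a rough sketch of a volume backed by external storage (this uses the built-in local driver's NFS support rather than a third-party plugin; the server address and export path are placeholders):

# Create a volume whose data lives on an NFS export instead of the local disk
docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.10,rw \
  --opt device=:/exports/dbdata \
  nfs-dbdata

# Attach it to a container like any other named volume
docker run -d -v nfs-dbdata:/var/lib/postgresql/data postgres:9.4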

If you want to move your volumes around you should also look at Flocker.

From the README:

Flocker is a data volume manager and multi-host Docker cluster management tool. With it you can control your data using the same tools you use for your stateless applications by harnessing the power of ZFS on Linux.

This means that you can run your databases, queues and key-value stores in Docker and move them around as easily as the rest of your application.


It depends on your scenario (this isn't really suitable for a production environment), but here is one way:

Creating a MySQL Docker Container

The gist of it is to use a directory on your host for data persistence.


I recently wrote about a potential solution and an application demonstrating the technique. I find it to be pretty efficient during development and in production. Hope it helps or sparks some ideas.

Repo: https://github.com/LevInteractive/docker-nodejs-example
Article: http://lev-interactive.com/2015/03/30/docker-load-balanced-mongodb-persistence/


I'm just using a predefined directory on the host to persist data for PostgreSQL. Also, this way it is possible to easily migrate existing PostgreSQL installations to Docker containers: https://crondev.com/persistent-postgresql-inside-docker/
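
For reference, a minimal sketch of that approach (the host path is a placeholder; /var/lib/postgresql/data is the data directory of the official postgres image):

# Persist the PostgreSQL data directory in a directory on the host
docker run -d -v /opt/postgres-data:/var/lib/postgresql/data postgres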


My solution is to make use of the new docker cp, which is now able to copy data out of containers no matter whether they are running or not, and to share a host volume at the exact same location where the database application creates its database files inside the container. This double solution works without a data-only container, straight from the original database container.

So my systemd init script takes on the job of backing up the database into an archive on the host. I placed a timestamp in the filename so a file is never overwritten.

It does this in ExecStartPre:

ExecStartPre=-/usr/bin/docker cp lanti-debian-mariadb:/var/lib/mysql /home/core/sql
ExecStartPre=-/bin/bash -c '/usr/bin/tar -zcvf /home/core/sql/sqlbackup_$$(date +%%Y-%%m-%%d_%%H-%%M-%%S)_ExecStartPre.tar.gz /home/core/sql/mysql --remove-files'

And it does the same thing in ExecStopPost too:

ExecStopPost=-/usr/bin/docker cp lanti-debian-mariadb:/var/lib/mysql /home/core/sql
ExecStopPost=-/bin/bash -c 'tar -zcvf /home/core/sql/sqlbackup_$$(date +%%Y-%%m-%%d_%%H-%%M-%%S)_ExecStopPost.tar.gz /home/core/sql/mysql --remove-files'

Plus I exposed a folder from the host as a volume to the exact same location where the database is stored:

mariadb:
  build: ./mariadb
  volumes:
    - $HOME/server/mysql/:/var/lib/mysql/:rw

It works great on my VM (I am building a LEMP stack for myself): https://github.com/DJviolin/LEMP

But I just don't know if it is a "bulletproof" solution when your life actually depends on it (for example, a webshop with transactions happening at any possible millisecond).

At 20 min 20 secs from this official Docker keynote video, the presenter does the same thing with the database:

Getting Started with Docker

"For the database we have a volume, so we can make sure that, as the database goes up and down, we don't loose data, when the database container stopped."


Use Persistent Volume Claim (PVC) from Kubernetes, which is a Docker container management and scheduling tool:

Persistent Volumes

The advantages of using Kubernetes for this purpose are that:

  • You can use any storage, such as NFS, and even when the node is down, the storage need not be.
  • Moreover the data in such volumes can be configured to be retained even after the container itself is destroyed - so that it can be reclaimed, if necessary, by another container.
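
As a rough sketch, a claim like the following (the name and size are placeholders) could be created and then mounted by a pod:

# A minimal PersistentVolumeClaim requesting 5Gi of storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi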

Reference URL: https://stackoverflow.com/questions/18496940/how-to-deal-with-persistent-storage-e-g-databases-in-docker
