this post was submitted on 19 Jun 2023
23 points (96.0% liked)

Selfhosted


Hi all

I'm running several Docker containers with local persistent volumes that I would like to back up. I haven't found an easy method to do so.

What do you use / recommend to do that? AFAIK you can't just rsync the volume directory while the container is running.

top 18 comments
[–] [email protected] 19 points 2 years ago* (last edited 2 years ago) (4 children)

Use bind mounts instead of docker volumes. Then you just have normal directories to back up, the same as you would anything else.

In general, it's not a problem to back up files while the container is running. The exception to this is databases. To have reliable database backups, you need to stop the container (or quiesce/pause the database if it supports it) before backing up the raw database files (including SQLite).

[–] [email protected] 2 points 2 years ago

It's better to stop the service that mounts those volumes before backing them up, or you may break something with a hot backup.

[–] [email protected] 2 points 2 years ago (1 children)

Exactly the reason why I always replace the volumes in any compose file with bind mounts.

Also, you don't have the problem of many dangling volumes.

[–] [email protected] 1 points 2 years ago (2 children)

I don't even understand what the advantage is to using volumes rather than mounts? So I too always use mounts.

[–] [email protected] 2 points 2 years ago

I think volumes are useful when you don't want to deal with those files on the host. Mainly for development environments.

I wasn't able to understand volumes at first, and my teammate told me I had to use bind mounts to run MySQL. My project folder used to have a docker/mysql/data directory. Now I just point the MySQL data to a volume so I don't lose data between restarts, and I don't have to deal with a mysql folder in my project full of files I would never touch directly.

In my opinion, volumes are useful for development / testing environments.

[–] retrodaredevil 1 points 2 years ago

I'm not sure either. The only thing I could come up with is that with volumes you don't have to worry about file ownership. That's usually taken care of for you with volumes from what I understand.

[–] [email protected] 2 points 2 years ago

This is your answer. It also has the benefit of allowing you to have a nice folder structure for your Docker setup, where you have a folder for each service holding the corresponding compose yaml and data folder(s)

[–] [email protected] 0 points 2 years ago

A Docker volume is just a normal directory under /var/lib/docker, so there's no difference with regard to backup consistency.

There's no silver bullet here. It's best to use tools specific to whatever is running in the container, e.g. wal-g for Postgres.
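To see where a named volume actually lives on the host, docker volume inspect can print its mountpoint. A minimal sketch; the function name volume_path and the volume name myapp_data are placeholders, not something from this thread:

```shell
# volume_path VOLUME
# Prints the host directory that backs a named Docker volume.
volume_path() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

Calling `volume_path myapp_data` with the default local driver typically prints a path under /var/lib/docker/volumes/. Reading that directory usually requires root.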

[–] ruud 8 points 2 years ago

Rsync works fine for most data (I use borgbackup). For any database data, create a dump using pg_dump, mysqldump, or whatever your database provides. Then back up the dump and all other volumes, but exclude the db volume.
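A minimal sketch of the dump step for PostgreSQL, assuming the database runs in a container that has pg_dump available (the official postgres image does). The function name, the container name pg, the user, and the database name are all hypothetical:

```shell
# dump_postgres CONTAINER DB_USER DB_NAME OUT_FILE
# Runs pg_dump inside the container and stores the SQL dump on the host.
dump_postgres() {
    local container="$1" user="$2" db="$3" out="$4"
    # Write to a temporary file first so a failed dump never replaces a good one
    if docker exec "$container" pg_dump -U "$user" "$db" > "$out.tmp"; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}
```

Usage could look like `dump_postgres pg appuser appdb /srv/backup/appdb.sql`. After that, rsync or borg can pick up the dump together with the other data directories, with the raw database directory excluded (for rsync, via its --exclude option).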

[–] Michael717 2 points 2 years ago* (last edited 2 years ago)

In general, there is no problem with rsync'ing the bind-mounted directory, but it depends on the application running in the container. For example, you should not copy the files of a running database: if data is written during the copy, the backup may end up inconsistent or corrupt.

[–] irreducible12302 2 points 2 years ago

I personally use a script which stops all containers, rsyncs the bind mounts (normal folders on the filesystem) and then restarts them. It runs every night so it isn't a problem that services are down for a few minutes.

Ideally, you would also make a database dump instead of just backing up the bind mounts.
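A script along those lines might look like the sketch below. It is an illustration, not the commenter's actual script: the function name and the example paths are made up, and it only restarts the containers that were running when it started.

```shell
# backup_all SRC_DIR DEST_DIR
# Stops all running containers, rsyncs SRC_DIR to DEST_DIR, then starts them again.
backup_all() {
    local src="$1" dest="$2" running status
    # IDs of the containers that are running right now
    running=$(docker ps -q) || return 1
    if [ -n "$running" ]; then
        # Word splitting is intended: $running holds one container ID per line
        docker stop $running >/dev/null || return 1
    fi
    # -a keeps ownership, permissions and timestamps; trailing slashes copy contents
    rsync -a "$src/" "$dest/"
    status=$?
    if [ -n "$running" ]; then
        # Restart the same containers even if rsync failed
        docker start $running >/dev/null || return 1
    fi
    return "$status"
}
```

Saved in a script that ends with a call such as `backup_all /srv/docker /mnt/backup/docker`, it can be scheduled nightly from cron.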

[–] [email protected] 2 points 2 years ago

There is some official documentation on this: https://docs.docker.com/storage/volumes/#back-up-restore-or-migrate-data-volumes

I personally just rsync my volumes to backblaze :)

[–] [email protected] 2 points 2 years ago

My persistent volumes are in a ZFS dataset, and I use Sanoid to periodically snapshot the dataset and Syncoid to transfer these snapshots to my backup host.
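Sanoid is normally driven by its own configuration file, so the rough sketch below substitutes the plain zfs snapshot command for the snapshot step and uses syncoid only for replication. The dataset tank/docker, the target backup@backuphost:backup/docker, and the function name are hypothetical:

```shell
# snapshot_and_replicate DATASET TARGET
# Takes a timestamped snapshot of DATASET, then replicates the dataset to TARGET.
snapshot_and_replicate() {
    local dataset="$1" target="$2" snap
    snap="$dataset@manual-$(date +%Y%m%d-%H%M%S)"
    # A snapshot captures the whole dataset at a single point in time
    zfs snapshot "$snap" || return 1
    # syncoid replicates the dataset to the target (incrementally after the first run)
    syncoid "$dataset" "$target"
}
```

A call would be `snapshot_and_replicate tank/docker backup@backuphost:backup/docker`. For databases, a dump is still the safer thing to restore from.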

[–] easeKItMAn 1 points 2 years ago

Bind mounts are easy to maintain and back up. However, if you share data among multiple containers, Docker volumes are recommended, especially for managing state.

Backup volumes:

docker run --rm --volumes-from dbstore -v "$(pwd)":/backup ubuntu tar cvf /backup/backup.tar /dbdata

  • Launch a new container and mount the volume from the dbstore container
  • Mount a local host directory as /backup
  • Pass a command that tars the contents of the dbdata volume to a backup.tar file inside the /backup directory.
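The same idea also works directly against a named volume, and the restore is the mirror image. A sketch, assuming the alpine image as a small helper and hypothetical function names; the backup directory has to be an absolute path on the host:

```shell
# backup_volume VOLUME BACKUP_DIR
# Writes the contents of VOLUME to BACKUP_DIR/VOLUME.tar.gz on the host.
backup_volume() {
    local volume="$1" dir="$2"
    # Mount the volume read-only at /data and the host directory at /backup
    docker run --rm -v "$volume:/data:ro" -v "$dir:/backup" alpine \
        tar czf "/backup/$volume.tar.gz" -C /data .
}

# restore_volume VOLUME BACKUP_DIR
# Unpacks BACKUP_DIR/VOLUME.tar.gz back into VOLUME.
restore_volume() {
    local volume="$1" dir="$2"
    docker run --rm -v "$volume:/data" -v "$dir:/backup" alpine \
        tar xzf "/backup/$volume.tar.gz" -C /data
}
```

For example, `backup_volume dbdata /srv/backup` followed later by `restore_volume dbdata /srv/backup`. Restoring is best done while the containers that use the volume are stopped.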

docker docs - volumes

Database volume backup without stopping the service: bash into the container, dump the database, and copy the dump out with docker cp. Run it periodically via crontab.
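Those steps can be scripted without an interactive shell. A sketch for MySQL, assuming the official mysql image with MYSQL_ROOT_PASSWORD set in the container's environment; the function name and the /tmp/dump.sql path are placeholders:

```shell
# dump_mysql CONTAINER OUT_FILE
# Dumps all databases inside the container, then copies the dump to the host.
dump_mysql() {
    local container="$1" out="$2"
    # Single quotes: $MYSQL_ROOT_PASSWORD is expanded inside the container, not on the host
    docker exec "$container" sh -c \
        'mysqldump --all-databases -uroot -p"$MYSQL_ROOT_PASSWORD" > /tmp/dump.sql' || return 1
    # Copy the dump out of the container, then remove the temporary file
    docker cp "$container:/tmp/dump.sql" "$out" || return 1
    docker exec "$container" rm -f /tmp/dump.sql
}
```

A script that calls `dump_mysql db /srv/backup/all-databases.sql` can then be added to the crontab to run at whatever interval suits the data.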

[–] sgtgig 1 points 2 years ago

I have a script that reads all my compose files to determine each container's persistent data (though this could also be done with docker inspect) and then uses docker cp to pipe it into restic, which can use data from stdin.

docker cp mycontainer:/files - | restic backup --stdin --stdin-filename mycontainer

Stopping databases is on my todo list.
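The docker inspect variant mentioned above might look roughly like this. It is a sketch rather than the commenter's script: the function name and the archive naming scheme are assumptions, and it expects a restic repository to be configured already (for example through restic's environment variables).

```shell
# backup_mounts CONTAINER
# Streams every mount of CONTAINER into restic as a separate tar archive.
backup_mounts() {
    local container="$1" path name
    # One mount destination (path inside the container) per line
    docker inspect --format '{{ range .Mounts }}{{ println .Destination }}{{ end }}' "$container" |
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        # /var/lib/data -> _var_lib_data, so the archive name contains no slashes
        name=$(printf '%s' "$path" | tr '/' '_')
        # "docker cp SRC -" writes a tar stream to stdout; restic reads it from stdin
        docker cp "$container:$path" - </dev/null |
            restic backup --stdin --stdin-filename "$container$name.tar"
    done
}
```

Running `backup_mounts mycontainer` creates one restic snapshot per mount. Database files captured this way have the same consistency caveats as any other hot copy.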

[–] [email protected] 1 points 2 years ago

Besides using bind mounts (as @[email protected] mentions), you can run a backup container that mounts the volume you would like to back up. The backup container then handles backing up the volume at a regular interval.

This is what I do for the docker-compose and k3s containers I back up. I can recommend autorestic as the backup container, but there are a lot of options.

[–] [email protected] 1 points 2 years ago

You can copy data from Docker volumes to somewhere on the host node and do the backup from there. You can also have a container that mounts the volumes, and from there either send the data directly to a remote or map a directory on the host node to copy the files to.

If you are running a database or something stateful, look at best practices for backup and then adjust to it.

Or don't use volumes and map onto the host node directly. Each approach works and has its own advantages and disadvantages.