this post was submitted on 15 Oct 2023
3 points (100.0% liked)

Programming


I have a project in development and I frequently switch between two computers. I am including my sqlite file in git, and so far it's been fine, but I've heard in the past that git doesn't do well with binary files. Has anyone actually had issues doing this?

I decided to make a dump just in case, so I don't have to start from scratch if something does go wrong.

[–] AMDmi3 2 points 1 year ago (1 children)

Don't worry: you won't have any issues such as data loss or corruption, regardless of what data you keep in git. Keeping binary files can be inefficient, though, because many binary formats do not play well with external compression. As a result, your repository grows by the whole size of the updated file on each commit, even if the logical change is small. This should not be the case for sqlite files, which should compress well both as individual blobs and between versions of the same file, so with each update your repository would grow roughly by the size of the changed data. You should be careful, though, to close all database connections before committing, so the file is in a consistent state and contains all the recently written data.
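For the "close all connections before committing" part, here is a minimal sketch in Python, assuming the project talks to the database through the standard `sqlite3` module. The helper name `flush_and_close` is made up for illustration. The checkpoint only matters if the database uses WAL journal mode, where recent writes can sit in a separate `-wal` file that git would not see if only the main file is tracked.

```python
import sqlite3


def flush_and_close(conn: sqlite3.Connection) -> None:
    """Commit pending writes, fold any WAL content into the main file, then close.

    Hypothetical helper: call it before running `git add` / `git commit`.
    """
    conn.commit()
    # In WAL journal mode, recent writes may live in a "<name>-wal" file next
    # to the database. This checkpoint copies them into the main file.
    # In other journal modes the pragma does nothing harmful.
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
    conn.close()
```

Every process that has the file open would need to do the same, otherwise the file on disk may still be mid-write when it is committed.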

[–] RockyBass 1 points 1 year ago
[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Export the data and structure as SQL. SQL is plain text and well suited to git.

If the data can be seeded easily, export only the structure and keep that under git control.

In the Rails framework, a schema file and a seed file are used for the structure and the data.
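A rough sketch of both exports in Python, using only the standard `sqlite3` module. The function names and file paths are made up for illustration; `iterdump()` and the `sqlite_master` table are real.

```python
import sqlite3
from pathlib import Path


def dump_to_sql(db_path: str, dump_path: str) -> None:
    """Write structure and data of a SQLite database as plain SQL text."""
    conn = sqlite3.connect(db_path)
    try:
        # iterdump() yields the SQL statements needed to recreate the database.
        sql = "\n".join(conn.iterdump())
    finally:
        conn.close()
    Path(dump_path).write_text(sql + "\n", encoding="utf-8")


def dump_schema_only(db_path: str, dump_path: str) -> None:
    """Write only the CREATE statements, for projects that seed data separately."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT sql FROM sqlite_master "
            "WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%'"
        ).fetchall()
    finally:
        conn.close()
    text = "".join(f"{sql};\n" for (sql,) in rows)
    Path(dump_path).write_text(text, encoding="utf-8")
```

Usage would be something like `dump_to_sql("dev.sqlite3", "dump.sql")` before committing, with only `dump.sql` tracked in git (both file names are placeholders).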

[–] abhibeckert 2 points 1 year ago* (last edited 1 year ago) (1 children)

While it will "work" I honestly wouldn't recommend it.

Your .git directory is potentially going to get stupidly large (possibly large enough that some git service providers will turn you away - individual file size limits as low as 100MB are common), and one day you're likely to face a merge conflict that's really difficult to fix.

Use something else to sync your database or even better just don't sync them at all, and use migrations to keep two databases up to date. The latter is what most people do.

If "something does go wrong" though, you should be able to just restore the sqlite database from a backup... you do have backups right? RIGHT? Git is not a backup.
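For anyone unsure what "use migrations" means in practice with sqlite, here is a minimal sketch in Python. It is one possible approach, not what any particular framework does: it stores the number of applied migrations in SQLite's built-in `user_version` pragma. The `MIGRATIONS` list, the table, and the column names are all hypothetical.

```python
import sqlite3

# Hypothetical migrations, applied in order. Only this list lives in git;
# each computer keeps its own database file and brings it up to date.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply migrations this database has not seen yet; return its version."""
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    for index in range(version, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[index])
        # PRAGMA does not accept bound parameters, so the integer is formatted in.
        conn.execute(f"PRAGMA user_version = {index + 1}")
        conn.commit()
    return max(version, len(MIGRATIONS))
```

Running `migrate()` a second time is a no-op, so it can be called on every startup. New schema changes are appended to the list and never edited in place.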

[–] RockyBass 2 points 1 year ago* (last edited 1 year ago)

The sqlite file in question is just for initial development testing; its loss would be but a minor annoyance. Since I first posted this question, I've removed the binary file from git tracking anyway and just keep a plain text dump file. This is for convenience while working between two computers, not actual data backup.
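On the second computer, a plain text dump like this can be turned back into a database file with a few lines of Python. This is a sketch that assumes the dump is ordinary SQL of the kind the `sqlite3` command-line tool's `.dump` produces; the function name and paths are made up.

```python
import sqlite3
from pathlib import Path


def restore_from_dump(dump_path: str, db_path: str) -> None:
    """Build a fresh SQLite file from a plain-text SQL dump."""
    if Path(db_path).exists():
        # Replaying CREATE TABLE statements into an existing database would fail.
        raise FileExistsError(f"{db_path} already exists; move it aside first")
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(Path(dump_path).read_text(encoding="utf-8"))
    finally:
        conn.close()
```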

[–] [email protected] 1 points 1 year ago

Git is mainly built for tracking and saving changes, which works great for text but not that well for data (especially binary). You won't lose your data, but the Git repo will keep growing too fast.

The big question here is: how often does the data change? If you just use it as a convenient format and rarely change things, it should be fine, as long as the size is reasonable too (not gigabytes of data). Though as mentioned, it might make sense to export to SQL before putting it in Git.

Alternatives are other sync services (Dropbox, Seafile, ..) to keep your Git repo lean, or even better: set up a SQL server so the data is always in the same spot. Of course, that depends on whether you have internet everywhere you work (but you probably do).

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

The problem is mainly that the size of your git repo blows up really quickly with regularly changing binary files. Also, merge conflicts can happen, but I think git would just make you choose which binary to keep.

[–] scryve 1 points 1 year ago

Have you tried this? https://dvc.org/