Dumping a PostgreSQL database is more or less straight forward, loading the dump back is a bit trickier compared to that.
Dump the data
To get a dump, I usually only specify the parameter -Fc since I prefer the dumps in the custom format:
pg_dump -Fc $DATABASE_URL > db.dump
Loading it back
Loading it back into a developer's database instance is tricky compared to the dumping operation. I ended up using pg_restore in the following way by default:
pg_restore -c --if-exists -1 -x -O -d my_dev_db db.dump
The parameters do the following:
- -c --if-exists
Clean up existing items before attempting to create them.
Using --if-exists makes it a bit relaxed, so that it will not fail if a new table has been added but does not yet exist locally. This is useful when regularly refreshing.
- -1
Run inside of a single transaction.
I usually prefer this, since I'd like to know if something unexpected happens.
- -x -O
Avoids that privileges or the owner are set.
This is useful when loading a dump into a developer's database server, you would typically not want this in other scenarios.
- -d
- This just tells it into which database to load the dump.
Use case
So far I found this most useful for the case of a developer running a local PostgreSQL instance when I wanted to load a specific set of data.
In other cases there is most probably a better way to automate things. E.g. on a CI server that it is usually easier to automate in a way where first an new empty test database is created, then data is imported. In this case you would not want -c --if-exists and it may be that you want privileges and owner data to be applied as well.
One thought on loading production data
It may be tempting to distribute a production dump by default, in some cases it might even be needed.
My suggestion is to avoid this as much as possible. It will create in many cases complications since data protection concerns would have to be met. If needed, set up something to automatically scramble the personal data and try to get to a reasonable subset. Ideally, solve it by providing a minimal fixture for development which has example data for all relevant cases.
So far I only found a real need to have a production dump at hands, when something around database migrations has to be done. Here I prefer to have a test run against a copy of the database before applying this in production.
Other approaches
Got a plain text SQL dump? psql should work.
Comments
comments powered by Disqus