How can I take a logical backup of large Heroku Postgres databases?

Issue

You want to capture a logical backup of a large (20GB+) Heroku Postgres database.

Resolution

Heroku PGBackups (ref: https://devcenter.heroku.com/articles/heroku-postgres-backups) are primarily intended for databases under moderate load, up to 20GB in size. If you wish to capture a logical backup for databases larger than this, or under heavy load, please consider the following suggestions.

Capture against a fork

Logical backups should be taken against a short-lived fork (ref: https://devcenter.heroku.com/articles/heroku-postgres-fork) of your Heroku Postgres database. This allows you to place the extra load produced by capturing a logical backup on the fork, without impacting your production database and followers.

Capture from a large one-off dyno

For logical backups that are failing using pg:backups, you may wish to consider using pg_dump on a detatched one-off dyno (ref: https://devcenter.heroku.com/articles/one-off-dynos) of sufficient compute power, like a Performance-L (ref: https://devcenter.heroku.com/articles/dyno-types).

You can generate a dump that is the same as pg:backups with pg_dump -F c --no-acl --no-owner --quote-all-identifiers $DATABASE_URL, and optionally use the --jobs flag to parallelise the dump operation. You can also adjust the amount of compression with the --compress flag. More can be found on this configuration at https://www.postgresql.org/docs/current/app-pgdump.html.

If the backup will take longer than the one-off dyno timeout then you will need to either ensure that the session remains active while the backup is in progress, or use a detached dyno as described in the docs.

After the dump is complete, you will need to transfer the dump to long term storage of your choice. When using one-off dynos, you will have a limit of 24 hours to complete the dump and upload.

Capture from a VPS/server

For logical backups that will take longer than 24 hours, or otherwise exceed the limitations of a one-off dyno, you may wish to consider using a VPS, or a physical server should you own one. You will need to install the relevant Postgres tools and provide sufficient access to your Heroku Postgres instance.

Limitations

As pg_dump runs using a Heroku Postgres credential when passing the connection string, information stored in the system catalog, like roles (ref: https://devcenter.heroku.com/articles/heroku-postgresql-credentials) are not included in the generated dump.

pg_dump performance suffers greatly when large objects (ref: https://www.postgresql.org/docs/current/largeobjects.html) are being used, or if there are very many database objects, normally due to a large number (50+) of schema.

Alternatives

There are also third-party solutions in the Add-ons Marketplace that can help with this process.

Ask on Stack Overflow

Engage with a community of passionate experts to get the answers you need

Ask on Stack Overflow

Heroku Support

Create a support ticket and our support experts will get back to you

Contact Heroku Support
Terms of Service Privacy Cookies © 2019 Salesforce.com