How to add a column to a very large table managed by Heroku Connect with minimal downtime - variation

Issue

Heroku Connect provides the ability to add columns to a Mapping post-sync. However, if the underlying object contains a large number of rows, the operation may take a long period of time to complete. Moreover, during this time no new changes will be synced. This article explains how to create a new Heroku Connect add-on and subsequently swap it for the existing add-on (either for all or some of the mappings) with the minimal amount of downtime for the Heroku app using the Heroku Connect add-on. The procedure described below is a variation of the standard one described here and is designed, when applicable, to reduce the downtime, costs (no additional database is required), and complexity. Some of the included benefits are that the database URL won't change and that tables and data external to the Heroku Connect schema are not required to be copied on a secondary database.

Resolution

The following solution uses the same Heroku Postgres database as a target for a new Heroku Connect add-on which means that a fully configured and synced Heroku Connect add-on is ready before needing to stop the application using Heroku Connect. This means that extra care is required if you have database elements not managed by Heroku Connect. In particular, you would be responsible for copying or creating any structures at the time of the switch. Additionally, you should look out for:

  • Indexes not managed by Heroku Connect
  • Stored Procedures
  • Functions
  • Triggers
  • Views

The method breaks down into 3 sections: Requirements, Prep Work and the Cutover.

The Requirements section describes if and when the procedure can be applied based on some factors. Prep work summarises the work that can be done at any time before the planned outage, and the Cutover will include any stoppage of the application that is required to prevent any data loss.

Requirements

  • Adequate Heroku Postgres plan that can handle an increased database load and IOPS - it may depend on the number and kinds of mappings (read, read-write) that need to be modified and the number of total records to be synced.
  • Heroku Postgres plan that can provide an adequate number of connections for two Heroku Connect add-ons - higher Postgres plans can be configured with up to 1000 connections (a custom feature that needs to be validated and enabled by the Heroku engineering team).
  • Heroku Postgres plan that can provide adequate disk space.
  • Heroku Connect mapping tables are accessed without specifying the schema name (e.g. using the Postgres search_path setting).

Prep Work

For the sake of clarity, the following procedure assumes there is a Heroku Connect add-on (e.g. hc1) configured to use a specific schema (e.g. schema1) and a Heroku Postgres (e.g. pgdb). The new Heroku Connect add-on (e.g. hc2) will be configured to use the same Heroku Postgres (e.g. pgdb) but a different schema (e.g. schema2).

Moreover, it's recommended to use a different Salesforce integration user for the new add-on (hc2) to avoid Operation delayed, no available query cursors error. Also, it is recommended to use Salesforce API version >= 56 to avoid the limit of 10 open queries per connected user and increase the parallelisation of sync operations.

  1. Create a new Heroku Connect add-on (hc2) using the same Heroku Postgres (pgdb) in use by the current Heroku Connect add-on (hc1). On the command line this looks like: heroku addons:create herokuconnect
  2. Finish the Heroku Connect provision process as outlined in the Heroku Connect documentation using the newly created database from above. Note that the latest API version will be selected by default during this process.
  3. Export your existing Heroku Connect configuration from the old Heroku Connect add-on (hc1) using the Import/Export tool found in the Heroku Connect Dashboard under Settings>Import/Export Configuration. Clicking Export will generate a .json file export of your Heroku Connect configuration. Edit the .json file and remove all the mappings that are not supposed to be modified.
  4. Import the exported and edited .json file to the new Heroku Connect add-on (hc2) using the Import/Export tool found in the Heroku Connect Dashboard found under Settings>Import/Export Configuration. Clicking Import and selecting the file you downloaded in the previous step will import that configuration into the new add-on.
  5. From the new Heroku Connect add-on (hc2), from the Mapping select the "Abort" action for all the available mapping to prevent it from loading all the data from Salesforce.
  6. Edit the Mapping(s) configuration to add the columns you wish to add.
  7. Reload the Mapping(s).
  8. If you are using Merged Writes or have altered user notification settings in the old connection (hc1), you must manually do this in the new connection (hc2) under Settings>Manage Connection.
  9. Wait for the new connection to perform the initial load of all the data and get into sync with Salesforce.
  10. If you have custom indexes, stored procedures, views or functions that depend on the modified mappings tables, you should be able to re-create them into the new schema (schema2). If you have other related tables (e.g. tables referencing the mappings tables) and/or triggers, you will likely want to wait until the middle of the cutover step to bring these over.

Your new Heroku Connect add-on is able to remain in this state for an extended period of time. At this point, you should schedule an appropriate amount of downtime to perform the cutover steps below. If you have read/write mappings you'll want to do this during a time when you typically have a low write backlog. This will reduce the amount of downtime incurred.

Cutover

  1. If you have read/write mappings you should stop writing to the old mappings schema (schema1) at this time. Depending on your application's environment, this may mean stopping all dynos and putting your app into maintenance mode. Wait until the old Heroku Connect add-on (hc1) has synced all pending database changes (writes) to Salesforce.
  2. Pause both the old and the new Heroku Connect add-ons and wait for all sync operations on both connections to stop.
  3. You can also enable any custom triggers you may have on the new schema (schema2) or alter table references at this time. Also, modify the search_path to add the new schema (schema2) before the old schema (schema1) (e.g. SET search_path TO schema2, schema1, public;). By doing this, access to the new mappings tables will take precedence over the old ones.
  4. Resume the new Heroku Connect add-on (hc2). Delete the modified mappings from the old Heroku Connect add-on (hc1) and resume the synchronization.
  5. Verify that data is flowing from Salesforce and going into the correct database schema.
  6. Restart any application dynos that may have been stopped and/or disable maintenance mode on your application.
  7. If the impacted mapping is exposed via Heroku External Objects you need to set it up using the new schema (see https://devcenter.heroku.com/articles/heroku-external-objects-salesforce#setting-up-heroku-external-objects).

Ask on Stack Overflow

Engage with a community of passionate experts to get the answers you need

Ask on Stack Overflow

Heroku Support

Create a support ticket and our support experts will get back to you

Contact Heroku Support