Issue
You need to rebuild a corrupted UNIQUE index, in which the UNIQUE constraint may have been violated.
Resolution
How do I know if the UNIQUE constraint on my index has been violated?
The simplest way to determine whether this has occured is to attempt to rebuild the index normally. There are two approaches:
-
For smaller indexes, or if it's possible to schedule downtime, using the
REINDEX
approach described in this article: https://help.heroku.com/RKRZV2NO/how-do-i-fix-a-corrupted-index -
For larger indexes where REINDEX might take long enough to cause downtime, it's possible to use the approach described in this article to create a replacement index
CONCURRENTLY
: https://help.heroku.com/H3LU3G7Z/how-do-i-rebuild-a-non-unique-corrupted-index
The procedure described below is only necessary if rebuilding a UNIQUE index fails due to an error about a duplicate key.
How do I rebuild a Unique Index if there are duplicate rows?
If the uniqueness constraint has been violated you will be unable to use REINDEX
. Due to this constraint, resolving corruption involves manual intervention.
The general methodology will be to create two new tables. One table, without a uniqueness constraint, to house the corrupted data. Another table to house the duplicates as they are found.
Note that the approach described below will consume at least as much additional disk space as the original table occupied, and for large tables may take some time to carry out during which there will be a lot of disk IO.
It's also important to note that writes to the existing table should be avoided during this procedure, otherwise there is a risk that records written to the original table after data is copied to the new table will be lost.
Creating your tables
Create a table to house all affected records:
CREATE TABLE <new_table> AS TABLE <affected_table>;
Create a table to store duplicates:
CREATE TABLE duplicates ( LIKE <affected_table> );
Remove Duplicates
Copy duplicates into duplicates table
INSERT INTO duplicates (SELECT * FROM <new_table> WHERE <id> IN (SELECT <id> FROM <new_table> GROUP BY (<id>) HAVING COUNT(*) > 1));
Delete Duplicates
DELETE from <new_table> WHERE <id> in (SELECT <id> FROM duplicates);
Recreate indexes on
Here you will find all of the indexes on the previous table and recreate them.
Query to find all indexes on the old table. Please be sure to replace your table name with the placehold <old-table>
SELECT tablename, indexname, indexdef
FROM
pg_indexes
WHERE
schemaname = 'public' and tablename = '<old-table>'
ORDER BY
tablename,
indexname;
You can then run the generated query in the indexdef
column to recreate the index(es) on the new table.
Example
CREATE UNIQUE INDEX <index-name> ON <schema>.<old-table> USING btree (<column>)
Would become
CREATE UNIQUE INDEX <index-name> ON <schema>.<new-table> USING btree (<column>)
** Please note that we have changed the definition to run a create against the <new-table>
as opposed to the <old-table>
If your index is a primary key index, you will need to remove the old primary key constraint and recreate it using the new index you just created.
You should do this inside of a transaction to avoid any writes being performed on the table while you are recreating the primary key:
BEGIN;
-- alter table to drop the old index
ALTER TABLE <table_name> DROP CONSTRAINT <table_name>_pkey;
-- then add the new constraint
ALTER TABLE <table_name> ADD PRIMARY KEY USING INDEX <new-index-name>;
COMMIT;