Sequences and Distributed IDs¶
You may have learned that one of the current limitations of Postgres logical replication is that sequence data is not replicated:
From Postgres documentation:
The data in serial or identity columns backed by sequences will of course be replicated as part of the table, but the sequence itself would still show the start value on the subscriber. If the subscriber is used as a read-only database, then this should typically not be a problem. If, however, some kind of switchover or failover to the subscriber database is intended, then the sequences would need to be updated to the latest values, either by copying the current data from the publisher (perhaps using pg_dump) or by determining a sufficiently high value from the tables themselves.
This may be an undesirable behavior in some scenarios. Indeed, if there was a switchover, and a replica was to continue with the insertion of new records, it'll start failing with primary key constraint violation because sequence counters ("sequence data") has not been replicated.
In this extension, you can work around this limitation by using one of the distributed (or prefixed) identifier types. The core of the idea is that every ID should contain a node identifier (prefix) and an identifier itself.
These types are named using the following pattern:
TYPE is one
of the following:
For brevity, where prefix type and identifier type are the same, the type is not repeated,
so instead of
omni_seq.id_int64_int64 we call it
How to Use¶
One can use it as a default value for a primary key, with an explicitly created sequence:
NODE_IDis either a number assigned to the current node or a unique system identifier (which can be retrieved using
generated always as identity?
The reason why we can't use
generated ... as identity syntax is that this
functionality is tied to local counters.
We also can't use generated columns at all, as "the generation expression can only use immutable functions",
nextval is volatile as it increments the sequence counter.
If you already have a table that you might need to prepare for prefixed identifiers, this guide will show how it can be done relatively easily.
Let's assume we have a table with an
integer primary key:
Now we want to add a 64-bit2 prefixed identifier, reusing the existing sequence locally.
create extension if not exists omni_seq; begin; lock table my_table; -- (1) alter table my_table alter column id drop identity if exists; create sequence my_table_id_seq; alter table my_table alter column id type omni_seq.id_int64_int32 using omni_seq.id_int64_int32_make(0, id), -- (2) alter column id set default omni_seq.id_int64_int32_nextval( omni_seq.system_identifier(), 'my_table_id_seq'); commit;
- Do the migration while locking other clients out.
0here signifies migrated rows.
When we insert into and query the table again, we'll see this:
- The actual number you will see will be different
If you already have replicas
The database schema and DDL commands are not replicated.
Therefore it is important to ensure that you perform the above before having a production setup. Otherwise, take appropriate steps to ensure the above changes are applied on all replicas in sync and due caution is exercised to ensure the upgrade is atomic.
We want to have a better answer to this. Please consider contributing your suggestions on how to handle this case.