I conducted an experiment. I have two sessions open.
1. In the first session, I create a table with a NOT NULL column: ‘email varchar NOT NULL’.
2. In the same session, I start a transaction and insert a row into the table.
3. In the second session, I run ALTER TABLE t ALTER COLUMN email DROP NOT NULL.
4. Back in the first session, I try to commit and get the following result:
ERROR: Heartbeat: Transaction f92a4ddd-fad5-41cc-9618-89be333add04 expired or aborted by a conflict: 40001
CONTEXT: Catalog Version Mismatch: A DDL occurred while processing this query. Try again.
This effectively aborts the transaction.
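For reference, the experiment can be reproduced with something like the following sketch (the table name t and the id column are my assumptions; the original post only specifies the email column):

```sql
-- Session 1: create the table and start a transaction.
CREATE TABLE t (id int, email varchar NOT NULL);
BEGIN;
INSERT INTO t (id, email) VALUES (1, 'a@example.com');

-- Session 2: relax the constraint while session 1's transaction is open.
ALTER TABLE t ALTER COLUMN email DROP NOT NULL;

-- Session 1: the commit fails with the catalog version mismatch error above.
COMMIT;
```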
I understand that a schema change touching a table enlisted in a transaction could cause trouble if the transaction were allowed to proceed (the transaction would start on the old schema version and commit on the new one, which is awkward). But it seems that this particular schema change (dropping a constraint) should not cause any problems at all: the transaction might still be running against the ‘old’ schema with its stricter restrictions, so any tuple it writes would still be valid under the new, milder restrictions; and the format of the data it writes remains unchanged.
I wonder whether it would make sense to white-list such ‘can’t-break-anything’ schema changes and allow transactions started on the old schema to proceed (and commit). Or would this cause trouble due to some subtle effects?
Is there any reason why you need this? Can you explain the use-case? I assume you’re doing DDL very rarely, so this problem will happen very rarely.
What would you expect if the transaction tries to insert a null value between steps 3 and 4? Allow it according to the new catalog? Or reject it according to a stale catalog?
PostgreSQL would reject it, but that is consistent, because the ALTER TABLE cannot complete before the transaction finishes: it waits on an exclusive lock. In a database whose primary goals are Scale and HA, we don’t want to acquire exclusive locks for DDL, so we use this optimistic approach with catalog versions. But allowing a transaction to continue on a stale catalog may introduce non-PostgreSQL-compatible behavior.
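For comparison, here is a sketch of how the same sequence plays out in vanilla PostgreSQL, where the DDL simply blocks instead of invalidating the transaction (assuming the same table t as in the experiment):

```sql
-- Session 1
BEGIN;
INSERT INTO t (email) VALUES ('a@example.com');  -- takes a ROW EXCLUSIVE lock on t

-- Session 2: this statement blocks here, because it needs an
-- ACCESS EXCLUSIVE lock on t and session 1 still holds its lock.
ALTER TABLE t ALTER COLUMN email DROP NOT NULL;

-- Session 1
COMMIT;  -- releases the lock; only now does the ALTER TABLE in session 2 complete
```

So PostgreSQL never runs a transaction against a stale catalog; it serializes the DDL behind the open transaction instead, at the cost of holding DML and DDL up behind an exclusive lock.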
Hi Dorian, thanks for answering my question.
I was thinking about what it would cost to drop a constraint under load: whether it would require any special handling or not. It’s not a big deal (there are retries), but then I was puzzled whether it would be possible to avoid such rollbacks.
Hi Franck, thanks for answering.
I would expect the transaction to still run with the constraint in place until it finishes. That seems a bit weird, which is why I wanted to investigate the weirdness. Thanks for your explanation.
Can you show an exact example? There is a difference between dropping a primary key, a check constraint, a unique constraint, etc.
Dropping a check constraint is very fast; there should be no overhead. Dropping a unique constraint is a little heavier, but still just removes files on disk (since it’s backed by a unique index).
Dropping a primary key is very heavy (a full table rewrite) and should not be done, because tables are stored index-organized on disk based on the primary key.
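To make the three cases concrete, here is what each drop looks like in SQL (the constraint names below follow the default PostgreSQL naming convention and are my assumption; check the actual names with \d t):

```sql
-- Cheap: dropping a check constraint is essentially a catalog-only change.
ALTER TABLE t DROP CONSTRAINT t_email_check;

-- Slightly heavier: dropping a unique constraint also removes the
-- backing unique index (files on disk), but no table data is rewritten.
ALTER TABLE t DROP CONSTRAINT t_email_key;

-- Heavy: dropping the primary key forces a full table rewrite,
-- because rows are stored index-organized by the primary key.
ALTER TABLE t DROP CONSTRAINT t_pkey;
```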