Multiple concurrent upserts: Is YugabyteDB suitable for my use case?

We use YugabyteDB (YSQL) as a sink for a high-throughput event stream. Each event describes exactly one entity.
YugabyteDB stores one summary row per entity and we continuously merge incoming events into it. The same entity can get many events in a short burst, and those events are processed concurrently by multiple workers, so we very often have several concurrent upserts hitting the same row.

Simplified schema:

CREATE TABLE device_state (
    device_id     TEXT PRIMARY KEY,
    first_seen_at TIMESTAMPTZ      NOT NULL,
    last_seen_at  TIMESTAMPTZ      NOT NULL,
    max_temp      DOUBLE PRECISION NOT NULL,
    updated_at    TIMESTAMPTZ      NOT NULL
);

Each incoming reading is an upsert: Insert if new, otherwise merge based on some logic.

INSERT INTO device_state (device_id, first_seen_at, last_seen_at, max_temp, updated_at)
VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (device_id) DO UPDATE SET
    first_seen_at = LEAST(device_state.first_seen_at, EXCLUDED.first_seen_at),
    last_seen_at  = GREATEST(device_state.last_seen_at, EXCLUDED.last_seen_at),
    max_temp      = GREATEST(device_state.max_temp, EXCLUDED.max_temp),
    updated_at    = EXCLUDED.updated_at;

(The actual ON CONFLICT logic has a few more fields and is a bit more complicated.)

The problem: Under bursts, many of these run concurrently against the same device_id. We get a high rate of SQLSTATE 40001 (serialization failures), exhaust the internal retry limit (yb_max_query_layer_retries=60) and the upsert ultimately fails.

Setup:

  • YSQL, default isolation level, RF=3, YugabyteDB 2025.2.
  • Generally hundreds of events per second.
    The rate of “problematic events” (that is, a burst of events targeting the same rows) is about 5-10 events per second, continuing at this rate for a few minutes.
  • Every worker can receive any device_id (that is, there’s no partitioning). I haven’t tried batching them in the application logic before the upsert because that feels more like a band-aid.
    We’d prefer a database-level solution and are treating app-side per-key routing/coalescing as a fallback, partly out of concern about hot-key skew and future scaling.

Things I tried include:

  • Both READ COMMITTED and REPEATABLE READ isolation levels.
  • Advisory locks (this turned out to be the slowest of my experiments).
  • Calling UPDATE instead of INSERT... ON CONFLICT when I know for sure that the row for a specific entity already exists.

Questions:

  1. Is a high-contention, single-row upsert workload like this a reasonable fit for YugabyteDB, or are we working against the design?
  2. What’s the recommended way to avoid the upsert failures here?
  3. Is there any server-side configuration that meaningfully helps with hot-key write contention?

Thank you very much.

Hi @ygc

Can you partition this way by the unique row that you end up upserting in PostgreSQL and then try batching in the app?

I think this will help because you will lessen the db work if you do per-key + batch at the same time.

It is not a good fit for the replication semantics that we apply (synchronous replication). It gets worse with multi region for example because of the added latency.

Don’t do as many concurrent writes to the same keys. Either batch them in the app or your queue layer or change the model so you don’t have to do upserts (like incremental aggregations over time

Please explain in full?

Things that can help:

  1. Enable read committed and confirm you’re using it Read Committed isolation level | YugabyteDB Docs
  2. Enable wait queues Concurrency control | YugabyteDB Docs
  3. Keep the transaction in AUTOCOMMIT mode and don’t do RETURNING
  4. Get the exact error message it’s sending on the 40001 error code?