Hello,
We are doing optimizations in our project and trying to reduce storage request flushes by batching SQL updates/inserts/deletes. While doing this, we noticed that UPDATE statements are not write buffered, while INSERTs are.
Test case:
“PostgreSQL 11.2-YB-2024.2.3.2-b0 on x86_64-pc-linux-gnu, compiled by clang version 17.0.6 (https://github.com/yugabyte/llvm-project 9b881774e40024e901fc6f3d313607b071c08631), 64-bit”
Table: test_flush(tenant_id int, customer_id bigint, version int, some_value bigint). PK(tenant_id, customer_id)
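For reference, the table was created roughly like this (reconstructed from the description above; the original DDL may differ in details such as partitioning clauses):

-- Reconstructed from the table description above; may differ from the original DDL
CREATE TABLE public.test_flush (
    tenant_id   int,
    customer_id bigint,
    version     int,
    some_value  bigint,
    PRIMARY KEY (tenant_id, customer_id)
);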
Transaction isolation mode: repeatable read
Inserts are write buffered (batched) and blazing fast:
DO
$do$
DECLARE
    i integer;
BEGIN
    FOR i IN 1..500 LOOP
        INSERT INTO public.test_flush(tenant_id, customer_id, version, some_value)
        VALUES (i, i, i, i);
    END LOOP;
END
$do$;
Similar updates are not write buffered (run the inserts first so there is data to update) and are slow. This is the simpler, single-table case:
DO
$do$
DECLARE
    i integer;
BEGIN
    FOR i IN 1..500 LOOP
        UPDATE public.test_flush
        SET version = version + 1, some_value = 2
        WHERE tenant_id = i AND customer_id = i;
    END LOOP;
END
$do$;
I know I can update multiple entries with a single UPDATE statement using unnest (see the sketch below), in which case the writes are buffered and overall execution is fast, but I’m more interested in write buffering for multiple updates against different tables. Is there a way to do this?
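For completeness, the single-statement unnest workaround I mentioned looks roughly like this for the test table (a sketch; in a real application the key list would be a bound array parameter rather than generate_series):

-- Batch the 500 per-row updates into one statement via unnest
UPDATE public.test_flush AS t
SET version = t.version + 1, some_value = 2
FROM unnest(ARRAY(SELECT generate_series(1, 500))) AS u(id)
WHERE t.tenant_id = u.id AND t.customer_id = u.id;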
Best regards, Timur