Function partition_hash not working as expected

I got this as a response:

I'm not sure increasing the timeout is the solution. The data size could grow and invalidate whatever timeout we select, and the current load on the cluster can make execution times unstable.
Figuring out smaller batches to delete, using the strategies already mentioned, is the way to go.

What do you think?

I can certainly try that, but I wonder how it would produce less stress on the cluster if I do many small batches instead of a few large ones. Sure, RAM can run out if a batch becomes too large, but 50,000 DELETE statements are only a few MB.

At the end of the day, the amount of work to be done is the same. But yeah, I will keep it in mind and give it a try.

You can send several range queries and delete rows while iterating over them. In pseudo-code:

start_key = x
while True:
    rows = query 500 rows where key > start_key
    if no rows: break
    delete those rows
    start_key = key of last row in rows
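
For illustration, here is a minimal runnable sketch of that loop in Python using the DataStax cassandra-driver (which also works with ScyllaDB). The keyspace (my_keyspace), table (events), single partition-key column (id), and page size are made up for the example, and it assumes the default Murmur3 partitioner; adapt them to your schema:

from cassandra.cluster import Cluster

PAGE = 500

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")  # hypothetical keyspace

# Page through partition keys by token, deleting as we go.
select = session.prepare(
    "SELECT id, token(id) AS t FROM events WHERE token(id) > ? LIMIT ?")
delete = session.prepare("DELETE FROM events WHERE id = ?")

last_token = -2**63  # start below the whole Murmur3 token range
while True:
    rows = list(session.execute(select, [last_token, PAGE]))
    if not rows:
        break
    for row in rows:
        session.execute(delete, [row.id])
    last_token = rows[-1].t  # resume point for the next page

Each iteration only ever holds one page of keys on the client, which is what keeps the pressure on both the client and the cluster bounded.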

The total work will actually be bigger, because you are fetching rows to the client and sending many queries instead of a single DELETE.

But the database is generally optimized for many small concurrent operations rather than a few big ones. Think of OLTP vs. OLAP databases as an example.

This approach is also resumable from the client, assuming you keep track of the “start_key”. And it can delete much faster in the case of a spread-out dataset.
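
As a sketch of the resumability part, assuming the hypothetical token-based loop above: persist the last processed token after every page and read it back on startup, so a restarted client continues where it left off.

import os

CHECKPOINT = "delete_progress.txt"  # hypothetical checkpoint file

def load_checkpoint():
    # Resume from the last saved token, or start from the beginning of the token range.
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return int(f.read().strip())
    return -2**63

def save_checkpoint(token):
    # Rewrite the file after each page, so a crash loses at most one page of progress.
    with open(CHECKPOINT, "w") as f:
        f.write(str(token))

In the loop you would initialize last_token = load_checkpoint() and call save_checkpoint(last_token) after each page of deletes.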

You can also have one thread reading and several threads sending the deletes. The delete queries will then be spread over many threads on the server side and won't interfere with each other.
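
A rough sketch of that split, again with the hypothetical schema from above: the main thread pages through the keys while a small thread pool sends the deletes (the cassandra-driver Session is safe to use from multiple threads). The pool size of 8 is arbitrary:

from concurrent.futures import ThreadPoolExecutor
from cassandra.cluster import Cluster

PAGE = 500
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")  # same hypothetical keyspace/table as above
select = session.prepare(
    "SELECT id, token(id) AS t FROM events WHERE token(id) > ? LIMIT ?")
delete = session.prepare("DELETE FROM events WHERE id = ?")

last_token = -2**63
with ThreadPoolExecutor(max_workers=8) as pool:
    while True:
        rows = list(session.execute(select, [last_token, PAGE]))
        if not rows:
            break
        # This thread reads; the pool threads send the deletes in parallel.
        futures = [pool.submit(session.execute, delete, [row.id]) for row in rows]
        for f in futures:
            f.result()  # surface any delete errors before fetching the next page
        last_token = rows[-1].t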