I'm not sure increasing the timeout is the solution. The data size could grow and invalidate whatever timeout we pick, and the current load on the cluster can make execution times unstable.
Figuring out smaller batches to delete, using the strategies already mentioned, is the way to go.
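To make it concrete, here's a minimal sketch of one common batching pattern (a bounded DELETE via a LIMITed subquery) in Python with psycopg2; the connection string, `events` table, and cutoff condition are all hypothetical, so adapt them to your schema:

```python
import psycopg2

conn = psycopg2.connect("dbname=app")  # hypothetical connection string
conn.autocommit = True  # each batch commits as its own small transaction
BATCH = 5000

with conn.cursor() as cur:
    while True:
        # The subquery caps how many rows each statement touches.
        cur.execute(
            "DELETE FROM events WHERE id IN "
            "(SELECT id FROM events WHERE created_at < %s LIMIT %s)",
            ("2020-01-01", BATCH),
        )
        if cur.rowcount == 0:
            break  # nothing left to delete
conn.close()
```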
I can certainly try that, but I wonder how many small batches would produce less stress on the cluster than a few large ones. Sure, RAM can run out if a batch becomes too large, but 50,000 DELETE statements only amount to a few MB.
At the end of the day, the amount of work to be done is the same. But yeah, I'll keep it in mind and try that.
The total work will actually be greater, because you're fetching rows to the client and sending many queries instead of a single DELETE.
But the database is generally optimized for many small concurrent operations rather than a few big ones; think OLTP vs. OLAP as an example.
This approach is also resumable from the client, assuming you keep track of the “start_key”. And it can delete much faster when the dataset is spread out.
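For example, a minimal sketch of the resumable loop, under the same hypothetical assumptions as above (string primary keys, an `events` table, and a checkpoint file name I made up):

```python
import os

import psycopg2

CHECKPOINT = "delete_checkpoint.txt"  # hypothetical file persisting the start_key
BATCH = 5000

def load_start_key():
    # Empty string sorts before any key, assuming string primary keys.
    return open(CHECKPOINT).read().strip() if os.path.exists(CHECKPOINT) else ""

def save_start_key(key):
    with open(CHECKPOINT, "w") as f:
        f.write(key)

conn = psycopg2.connect("dbname=app")  # hypothetical connection string
conn.autocommit = True  # each batch commits on its own
start_key = load_start_key()
with conn.cursor() as cur:
    while True:
        # Keyset pagination: fetch the next batch of keys past the checkpoint.
        cur.execute(
            "SELECT id FROM events WHERE id > %s AND created_at < %s "
            "ORDER BY id LIMIT %s",
            (start_key, "2020-01-01", BATCH),
        )
        keys = [row[0] for row in cur.fetchall()]
        if not keys:
            break
        cur.execute("DELETE FROM events WHERE id = ANY(%s)", (keys,))
        start_key = keys[-1]
        save_start_key(start_key)  # crash-safe: a restart resumes from here
conn.close()
```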
You can also have one thread reading and several threads sending the deletes. The delete queries will be spread over many threads on the server side and won't interfere with each other.
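Here's a sketch of that reader/worker pattern, again with hypothetical table and connection details:

```python
import queue
import threading

import psycopg2

DSN = "dbname=app"  # hypothetical connection string
WORKERS = 4
BATCH = 1000
batches = queue.Queue(maxsize=8)  # bounded, so the reader can't run far ahead

def reader():
    # A single reader walks the key space and hands batches to the workers.
    conn = psycopg2.connect(DSN)
    with conn.cursor() as cur:
        start_key = ""  # assumes string primary keys
        while True:
            cur.execute(
                "SELECT id FROM events WHERE id > %s ORDER BY id LIMIT %s",
                (start_key, BATCH),
            )
            keys = [row[0] for row in cur.fetchall()]
            if not keys:
                break
            batches.put(keys)
            start_key = keys[-1]
    conn.close()
    for _ in range(WORKERS):
        batches.put(None)  # poison pill: tells one worker to stop

def deleter():
    # Each worker gets its own connection, so its deletes run as
    # independent small transactions on the server side.
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    with conn.cursor() as cur:
        while True:
            keys = batches.get()
            if keys is None:
                break
            cur.execute("DELETE FROM events WHERE id = ANY(%s)", (keys,))
    conn.close()

workers = [threading.Thread(target=deleter) for _ in range(WORKERS)]
for t in workers:
    t.start()
reader()
for t in workers:
    t.join()
```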