I am running a three node YugabyteDB cluster on K8s and ingest a lot of data into a table. This goes well up to about 1.3 million records, but then my application crashes (i. e. K8s restarts the pod, probably because it was overwhelmed). In the log file I find this:
2023-01-18 22:06:14.261 INFO [XNIO-1 task-4] Records: 1330000
2023-01-18 22:06:14.612 INFO [XNIO-1 task-4] Inserting 10000 records...
2023-01-18 22:06:17.162 INFO [XNIO-1 task-4] Inserted 10000 records in 2550ms
2023-01-18 22:06:17.162 INFO [XNIO-1 task-4] Records: 1340000
2023-01-18 22:06:17.370 INFO [XNIO-1 task-4] Inserting 10000 records...
2023-01-18 22:06:20.342 INFO [XNIO-1 task-4] Inserted 10000 records in 2972ms
2023-01-18 22:06:20.342 INFO [XNIO-1 task-4] Records: 1350000
2023-01-18 22:06:20.514 INFO [ionShutdownHook] stopping server: Undertow - 2.2.20.Final
2023-01-18 22:06:20.517 INFO [ionShutdownHook] Destroying Spring FrameworkServlet 'dispatcherServlet'
2023-01-18 22:06:20.526 ERROR [XNIO-1 I/O-2] XNIO001007: A channel event listener threw an exception
java.util.concurrent.RejectedExecutionException: XNIO007007: Thread is terminating
at org.xnio.nio.WorkerThread.execute(WorkerThread.java:620) ~[xnio-nio-3.8.7.Final.jar!/:3.8.7.Final]
at io.undertow.websockets.jsr.UndertowSession$3.handleEvent(UndertowSession.java:396) ~[undertow-websockets-jsr-2.2.20.Final.jar!/:2.2.20.Final]
at io.undertow.websockets.jsr.UndertowSession$3.handleEvent(UndertowSession.java:388) ~[undertow-websockets-jsr-2.2.20.Final.jar!/:2.2.20.Final]
[…]
2023-01-18 22:06:20.574 INFO [XNIO-1 task-4] Inserting 10000 records...
[XNIO-1 task-4] ERROR com.yugabyte.oss.driver.internal.core.loadbalancing.PartitionAwarePolicy - hash key encoding failed
java.nio.channels.ClosedByInterruptException
at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199)
at java.base/java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:465)
at com.yugabyte.oss.driver.internal.core.loadbalancing.PartitionAwarePolicy.AppendValueToChannel(PartitionAwarePolicy.java:351)
at com.yugabyte.oss.driver.internal.core.loadbalancing.PartitionAwarePolicy.getKey(PartitionAwarePolicy.java:309)
at com.yugabyte.oss.driver.internal.core.loadbalancing.PartitionAwarePolicy.getQueryPlan(PartitionAwarePolicy.java:113)
This stack trace then repeats itself frequently, until the pod restarts (I get about 400 MB of such stack traces). I am not sure whether there were perhaps too many threads running, but I myself didn’t start any. Maybe the driver does some parallelization, but my code inserts 10000 records at a time with a BatchStatement and then repeats.
Perhaps my cluster is too weak? I had to go way below the default CPU and memory requirements to even get YugabyteDB to install (see here for details). But even then, should we really be allowed to just crash after 1.3 million records - that sounds like a leak.
Since a batch statement is sent in a single RPC, I would suggest having smaller batches. And you can increase concurrency. Think batches of 500 and having 20 concurrent threads.
Also, if possible, sending records that have the same partition_key or that end up in the same tablet in the same batch would be the most efficient. This would remove multi-node transaction overhead.
Regardless of that, in this case, can you send some yb-tserver logs? To see why the server crashed?
So your code is single-threaded. cc @nmalladi if you understand the stack trace above hash key encoding failed?
Yes, the code is single-threaded. On my local machine (1 node, no cluster) I could insert 50000 records in a batch, on the remote cluster (3 nodes) I already reduced this to 10000. Obviously, for performance reasons I would change the code to multi-threaded later on, but I thought that it has to work single-threaded first - maybe a tad slow, but stable. The thought hadn’t occurred to me that going for smaller batches, but more threads could increase stability.
There are log files from that crash with all sorts of different messages. I am attaching a few examples out of context, but I think if someone wants to look at this in detail, I should reproduce it again and know which log files to send to whom, so that I am not sending irrelevant information. One thing that I did notice before the crash was that my K8s healthcheck (which writes a token to the DB and reads it back out) would occasionally fail (every couple of hours).
E0118 16:32:10.179522 24 tablet_metadata.cc:601] Failed to destroy regular DB at: /mnt/disk0/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004120/tablet-e742e99f4b93427fb5def815d1d2c8ca: IO error (yb/rocksdb/util/env_posix.cc:317): /mnt/disk0/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004120/tablet-e742e99f4b93427fb5def815d1d2c8ca/LOCK: No such file or directory
W0118 20:49:29.772193 11392 env.cc:87] Failed to cleanup /mnt/disk0/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004009/tablet-a0960d9405ca44ee8ebe9e1b90df
04e0.intents/000013.sst.sblock.0: IO error (yb/rocksdb/util/env_posix.cc:238): /mnt/disk0/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004009/tablet-a0960d
9405ca44ee8ebe9e1b90df04e0.intents/000013.sst.sblock.0: No such file or directory
W0118 17:46:54.166471 2799 consensus_peers.cc:543] T bedd801954ea4a77916104d0b34c6770 P 3966c68769644553aeaf5964a8c15e4e -> Peer cb9fd31bca0b4ec7aa5a749525e2873e ([host:
"yb-tserver-1.yb-tservers.yb.svc.cluster.local" port: 9100], [host: "yb-tserver-1.yb-tservers.yb.svc.cluster.local" port: 9100]): Couldn't send request. Status: Runtime e
rror (yb/common/wire_protocol.cc:281): (999 unknown): . Retrying in the next heartbeat period. Already tried 2 times. State: 2
W0118 17:46:54.234884 124 wire_protocol.cc:279] Unknown error code in status:
W0118 17:46:54.381820 2802 wire_protocol.cc:279] Unknown error code in status:
W0118 17:46:54.422660 2802 wire_protocol.cc:279] Unknown error code in status: