High write latency in YugabyteDB YCSB benchmark

I ran a benchmark with YCSB.

Deployment details: 3 machines, Ubuntu 18.04, 16 cores and 32 GB memory each. The deployment followed the manual deployment instructions.

Here is what concerns me:

Workload A: 10M records
[UPDATE], AverageLatency(ms), 16.760313642451805
[UPDATE], MinLatency(ms), 1.587
[UPDATE], MaxLatency(ms), 4202.495
[UPDATE], 95thPercentileLatency(ms), 8.167
[UPDATE], 99thPercentileLatency(ms), 441.087

Workload A: 100M records
[UPDATE], Operations, 4999256
[UPDATE], AverageLatency(ms), 2.4073149240406972
[UPDATE], MinLatency(ms), 1.593
[UPDATE], MaxLatency(ms), 7471.103
[UPDATE], 95thPercentileLatency(ms), 7.975
[UPDATE], 99thPercentileLatency(ms), 710.655

I have run it with 1M, 10M, 50M, and 100M records, and UPDATE shows the same problem: the 99th-percentile latency is bad. Can someone explain why writes take so much time?
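The pattern above, where the average and P95 look healthy but the P99 and max are huge, is characteristic of a tail-latency problem: a small fraction of stalled operations dominates the upper percentiles. A minimal Python sketch (synthetic numbers, not the actual benchmark data) shows how roughly 1% of stalled operations produces exactly this shape:

```python
import statistics

# Synthetic latency sample: 99% of updates are fast (~2 ms),
# 1% hit a stall (~500 ms), mimicking the shape of the YCSB numbers above.
fast = [2.0] * 990
stalled = [500.0] * 10
latencies = sorted(fast + stalled)

mean = statistics.mean(latencies)
p95 = latencies[int(len(latencies) * 0.95)]
p99 = latencies[int(len(latencies) * 0.99)]

# The mean is inflated a little, P95 stays low, P99 jumps to the stall time.
print(f"mean={mean:.2f} ms  p95={p95:.2f} ms  p99={p99:.2f} ms")
# → mean=6.98 ms  p95=2.00 ms  p99=500.00 ms
```

So the interesting question is not "why are writes slow" in general, but what periodic event (compaction, flush, queueing) stalls that 1% of operations.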

Hi @Huiqing
Just to be clear, you were using our fork of YCSB (which uses our driver) from Benchmark YCQL performance using YCSB | YugabyteDB Docs, correct?

Yes, I am using your fork of YCSB. Benchmark YSQL performance with YCSB | YugabyteDB Docs

Are you running the benchmark on a machine separate from the cluster?
What do the server stats (CPU, memory, disk I/O) look like on all nodes during the benchmark?

Yes, I am running the benchmark on a separate machine.

Here is a snapshot of the Yugabyte server stats:

top - 22:15:43 up 16 days, 3:24, 6 users, load average: 45.47, 46.77, 46.51
Tasks: 656 total, 7 running, 478 sleeping, 0 stopped, 0 zombie
%Cpu0 : 61.3 us, 20.7 sy, 0.0 ni, 16.7 id, 0.3 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu1 : 64.0 us, 15.3 sy, 0.0 ni, 13.7 id, 0.0 wa, 0.0 hi, 7.0 si, 0.0 st
%Cpu2 : 61.2 us, 19.4 sy, 0.0 ni, 17.8 id, 0.6 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu3 : 59.9 us, 20.9 sy, 0.0 ni, 17.8 id, 0.7 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu4 : 60.9 us, 20.4 sy, 0.0 ni, 17.8 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu5 : 59.9 us, 18.7 sy, 0.0 ni, 13.4 id, 0.0 wa, 0.0 hi, 8.0 si, 0.0 st
%Cpu6 : 64.2 us, 19.9 sy, 0.0 ni, 14.9 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu7 : 65.8 us, 15.6 sy, 0.0 ni, 11.7 id, 0.3 wa, 0.0 hi, 6.5 si, 0.0 st
%Cpu8 : 64.8 us, 17.8 sy, 0.0 ni, 16.4 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu9 : 60.9 us, 18.1 sy, 0.0 ni, 13.7 id, 0.3 wa, 0.0 hi, 7.0 si, 0.0 st
%Cpu10 : 65.4 us, 14.9 sy, 0.0 ni, 11.7 id, 0.0 wa, 0.0 hi, 8.1 si, 0.0 st
%Cpu11 : 61.4 us, 17.5 sy, 0.0 ni, 11.7 id, 0.3 wa, 0.0 hi, 9.1 si, 0.0 st
%Cpu12 : 62.7 us, 16.0 sy, 0.0 ni, 13.7 id, 0.0 wa, 0.0 hi, 7.7 si, 0.0 st
%Cpu13 : 62.3 us, 19.7 sy, 0.0 ni, 17.0 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu14 : 62.4 us, 15.8 sy, 0.0 ni, 12.9 id, 0.3 wa, 0.0 hi, 8.6 si, 0.0 st
%Cpu15 : 63.5 us, 19.1 sy, 0.0 ni, 17.1 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 32938472 total, 262412 free, 21619820 used, 11056240 buff/cache
KiB Swap: 999420 total, 362084 free, 637336 used. 10839776 avail Mem

@Huiqing note that you need to capture CPU graphs for all the servers at the same time the benchmark is running, to easily see whether any one of them is overloaded.
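From the `top` snippet already posted, a quick sanity check suggests the node shown is CPU-oversubscribed even before looking at graphs (the numbers below are taken directly from that output):

```python
# From the top output above: load average ~45 on a 16-core machine.
load_average = 45.47
cores = 16

# A run queue much longer than the core count means threads are
# waiting for CPU, which by itself adds latency to every operation.
oversubscription = load_average / cores
print(f"run queue is ~{oversubscription:.1f}x the core count")
# → run queue is ~2.8x the core count
```

A load average of roughly 2.8× the core count does not prove CPU is the bottleneck (it can include threads blocked on disk I/O), but it is consistent with the cluster being pushed past saturation.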

What type of disk are you using? From what I have seen, when compaction kicks in on a disk with limited IOPS/bandwidth, compaction eats away the available disk bandwidth and the writes have to wait. That is one possible reason why you see a higher P99. I ran these tests in the past; here are some results:

Machine with 180 Mbps disk bandwidth (3-node cluster, 100M records):
[OVERALL], RunTime(ms), 5408025
[OVERALL], Throughput(ops/sec), 18429.401861123053
[INSERT], Operations, 99666666
[INSERT], AverageLatency(us), 16113.764145195746
[INSERT], MinLatency(us), 1286
[INSERT], MaxLatency(us), 3624959
[INSERT], 95thPercentileLatency(us), 23903
[INSERT], 99thPercentileLatency(us), 241151
[INSERT], Return=OK, 99666666

Machine with 480 Mbps disk bandwidth (3-node cluster, 100M records):
[OVERALL], RunTime(ms), 4858398
[OVERALL], Throughput(ops/sec), 20514.306567720472
[INSERT], Operations, 99666666
[INSERT], AverageLatency(us), 14350.927166089814
[INSERT], MinLatency(us), 1407
[INSERT], MaxLatency(us), 2732031
[INSERT], 95thPercentileLatency(us), 32271
[INSERT], 99thPercentileLatency(us), 57119
[INSERT], Return=OK, 99666666
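Comparing the two runs above, the slower disk costs only a few percent of throughput but quadruples the P99, which fits the compaction-contention explanation. A rough back-of-the-envelope in Python shows how an amplified write stream can approach a disk's bandwidth limit; the row size and write-amplification factor here are illustrative assumptions, not measured values:

```python
# Rough estimate of sustained disk write bandwidth per node during loading.
ops_per_sec = 18_429        # cluster-wide insert rate from the 180 Mbps run
row_bytes = 1_100           # assumption: ~1 KB YCSB row plus per-row overhead
replication_factor = 3      # each row is physically written on 3 nodes
write_amplification = 5     # assumption: WAL + flush + a few compaction rewrites
nodes = 3

cluster_bytes_per_sec = (ops_per_sec * row_bytes
                         * replication_factor * write_amplification)
per_node_mbps = cluster_bytes_per_sec / nodes / 1_000_000

# Even modest amplification turns ~20 MB/s of logical ingest per node
# into ~100 MB/s of physical writes.
print(f"~{per_node_mbps:.0f} MB/s of disk writes per node")
```

With numbers in that range, a disk with limited bandwidth spends much of its budget on compaction, and foreground writes queue behind it, which is exactly where the P99 stalls come from.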

Hey,

I checked, and our disks are SSDs. You can see the disk IOPS below:

22:00 to 12:45 AM is when the 350M records were loaded.
12:45 to 2:00 AM is when the benchmark ran workloads A through F, except workload E.

node 1:

node 2:

node 3:

For node-1 and node-3, it looks like the disk limits (300 Mbps / 6000 IOPS) are being hit. This resource exhaustion is likely the cause of the higher P99 write latency. Reducing the concurrency (the threadcount variable in YCSB) will help improve the P99 latencies.
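Why reducing threadcount helps can be seen from Little's law (concurrency = throughput × latency): once the disks saturate, extra client threads add no throughput and only queue. A small sketch with illustrative numbers (the 20k ops/s ceiling and 2 ms service time are assumptions, not measurements of this cluster):

```python
# Little's law: in-flight ops = throughput * latency.
max_throughput = 20_000   # ops/s the cluster can sustain (illustrative)
service_time = 0.002      # 2 ms per op when there is no queueing (illustrative)

# The cluster can absorb this many concurrent ops without queueing:
sustainable = max_throughput * service_time   # 40 in-flight ops

for threads in (32, 64, 128, 256):
    if threads <= sustainable:
        latency = service_time                # no queueing yet
    else:
        latency = threads / max_throughput    # extra threads just wait in queue
    print(f"{threads:4d} threads -> ~{latency * 1000:.1f} ms avg latency")
```

Past the saturation point, doubling the thread count roughly doubles latency while throughput stays flat, so dialing threadcount down until the disks are just below their IOPS/bandwidth limits tends to give the best P99.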

We are working on some enhancements to better tune these bulk load scenarios.
