High write latency in YugabyteDB YCSB benchmark

I ran a benchmark with YCSB.

Deployment details: 3 machines, Ubuntu 18.04, 16 cores and 32 GB memory each. The deployment followed the manual deployment instructions.

Here is what concerns me:

Workload A: 10M records
[UPDATE], AverageLatency(ms), 16.760313642451805
[UPDATE], MinLatency(ms), 1.587
[UPDATE], MaxLatency(ms), 4202.495
[UPDATE], 95thPercentileLatency(ms), 8.167
[UPDATE], 99thPercentileLatency(ms), 441.087

Workload A: 100M records
[UPDATE], Operations, 4999256
[UPDATE], AverageLatency(ms), 2.4073149240406972
[UPDATE], MinLatency(ms), 1.593
[UPDATE], MaxLatency(ms), 7471.103
[UPDATE], 95thPercentileLatency(ms), 7.975
[UPDATE], 99thPercentileLatency(ms), 710.655

I have run it with 1M, 10M, 50M, and 100M records, and UPDATE shows the same problem: the 99th-percentile latency is bad. Can someone explain why writes take so much time?
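The pattern above, where the average and P95 look healthy but the P99 and max are huge, is characteristic of a tail-latency problem: a small fraction of stalled operations dominates the upper percentiles. A minimal Python sketch (synthetic numbers, not the actual benchmark data) shows how roughly 1% of stalled operations produces exactly this shape:

```python
import statistics

# Synthetic latency sample: 99% of updates are fast (~2 ms),
# 1% hit a stall (~500 ms), mimicking the shape of the YCSB numbers above.
fast = [2.0] * 990
stalled = [500.0] * 10
latencies = sorted(fast + stalled)

mean = statistics.mean(latencies)
p95 = latencies[int(len(latencies) * 0.95)]
p99 = latencies[int(len(latencies) * 0.99)]

# The mean is inflated a little, P95 stays low, P99 jumps to the stall time.
print(f"mean={mean:.2f} ms  p95={p95:.2f} ms  p99={p99:.2f} ms")
# → mean=6.98 ms  p95=2.00 ms  p99=500.00 ms
```

So the interesting question is not "why are writes slow" in general, but what periodic event (compaction, flush, queueing) stalls that 1% of operations.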

Hi @Huiqing
Just to be clear, you were using our fork of YCSB (which uses our driver) from Benchmark YCQL performance using YCSB | YugabyteDB Docs, correct?

Yes, I am using your fork of YCSB. Benchmark YSQL performance with YCSB | YugabyteDB Docs

Are you running the benchmark on a machine separate from the cluster?
What do the server stats (CPU, memory, disk I/O) look like on all nodes during the benchmark?

Yes, I am running the benchmark on a separate machine.

Here is a snapshot of the Yugabyte server stats:

top - 22:15:43 up 16 days, 3:24, 6 users, load average: 45.47, 46.77, 46.51
Tasks: 656 total, 7 running, 478 sleeping, 0 stopped, 0 zombie
%Cpu0 : 61.3 us, 20.7 sy, 0.0 ni, 16.7 id, 0.3 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu1 : 64.0 us, 15.3 sy, 0.0 ni, 13.7 id, 0.0 wa, 0.0 hi, 7.0 si, 0.0 st
%Cpu2 : 61.2 us, 19.4 sy, 0.0 ni, 17.8 id, 0.6 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu3 : 59.9 us, 20.9 sy, 0.0 ni, 17.8 id, 0.7 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu4 : 60.9 us, 20.4 sy, 0.0 ni, 17.8 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu5 : 59.9 us, 18.7 sy, 0.0 ni, 13.4 id, 0.0 wa, 0.0 hi, 8.0 si, 0.0 st
%Cpu6 : 64.2 us, 19.9 sy, 0.0 ni, 14.9 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu7 : 65.8 us, 15.6 sy, 0.0 ni, 11.7 id, 0.3 wa, 0.0 hi, 6.5 si, 0.0 st
%Cpu8 : 64.8 us, 17.8 sy, 0.0 ni, 16.4 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu9 : 60.9 us, 18.1 sy, 0.0 ni, 13.7 id, 0.3 wa, 0.0 hi, 7.0 si, 0.0 st
%Cpu10 : 65.4 us, 14.9 sy, 0.0 ni, 11.7 id, 0.0 wa, 0.0 hi, 8.1 si, 0.0 st
%Cpu11 : 61.4 us, 17.5 sy, 0.0 ni, 11.7 id, 0.3 wa, 0.0 hi, 9.1 si, 0.0 st
%Cpu12 : 62.7 us, 16.0 sy, 0.0 ni, 13.7 id, 0.0 wa, 0.0 hi, 7.7 si, 0.0 st
%Cpu13 : 62.3 us, 19.7 sy, 0.0 ni, 17.0 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu14 : 62.4 us, 15.8 sy, 0.0 ni, 12.9 id, 0.3 wa, 0.0 hi, 8.6 si, 0.0 st
%Cpu15 : 63.5 us, 19.1 sy, 0.0 ni, 17.1 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 32938472 total, 262412 free, 21619820 used, 11056240 buff/cache
KiB Swap: 999420 total, 362084 free, 637336 used. 10839776 avail Mem

@Huiqing note that you need to capture CPU graphs for all the servers at the same time the benchmark is running, to easily see whether any one of them is overloaded.
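From the `top` snippet already posted, a quick sanity check suggests the node shown is CPU-oversubscribed even before looking at graphs (the numbers below are taken directly from that output):

```python
# From the top output above: load average ~45 on a 16-core machine.
load_average = 45.47
cores = 16

# A run queue much longer than the core count means threads are
# waiting for CPU, which by itself adds latency to every operation.
oversubscription = load_average / cores
print(f"run queue is ~{oversubscription:.1f}x the core count")
# → run queue is ~2.8x the core count
```

A load average of roughly 2.8× the core count does not prove CPU is the bottleneck (it can include threads blocked on disk I/O), but it is consistent with the cluster being pushed past saturation.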

What type of disk are you using? From what I have seen, when compaction kicks in on a disk with limited IOPS/bandwidth, compaction eats away the available disk bandwidth and the writes have to wait. That is one possible reason why you see a higher P99. I ran these tests in the past; here are some results:

Machine with 180 Mbps disk bandwidth (3-node cluster, 100M records):
[OVERALL], RunTime(ms), 5408025
[OVERALL], Throughput(ops/sec), 18429.401861123053
[INSERT], Operations, 99666666
[INSERT], AverageLatency(us), 16113.764145195746
[INSERT], MinLatency(us), 1286
[INSERT], MaxLatency(us), 3624959
[INSERT], 95thPercentileLatency(us), 23903
[INSERT], 99thPercentileLatency(us), 241151
[INSERT], Return=OK, 99666666

Machine with 480 Mbps disk bandwidth (3-node cluster, 100M records):
[OVERALL], RunTime(ms), 4858398
[OVERALL], Throughput(ops/sec), 20514.306567720472
[INSERT], Operations, 99666666
[INSERT], AverageLatency(us), 14350.927166089814
[INSERT], MinLatency(us), 1407
[INSERT], MaxLatency(us), 2732031
[INSERT], 95thPercentileLatency(us), 32271
[INSERT], 99thPercentileLatency(us), 57119
[INSERT], Return=OK, 99666666
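Comparing the two runs above, the slower disk costs only a few percent of throughput but quadruples the P99, which fits the compaction-contention explanation. A rough back-of-the-envelope in Python shows how an amplified write stream can approach a disk's bandwidth limit; the row size and write-amplification factor here are illustrative assumptions, not measured values:

```python
# Rough estimate of sustained disk write bandwidth per node during loading.
ops_per_sec = 18_429        # cluster-wide insert rate from the 180 Mbps run
row_bytes = 1_100           # assumption: ~1 KB YCSB row plus per-row overhead
replication_factor = 3      # each row is physically written on 3 nodes
write_amplification = 5     # assumption: WAL + flush + a few compaction rewrites
nodes = 3

cluster_bytes_per_sec = (ops_per_sec * row_bytes
                         * replication_factor * write_amplification)
per_node_mbps = cluster_bytes_per_sec / nodes / 1_000_000

# Even modest amplification turns ~20 MB/s of logical ingest per node
# into ~100 MB/s of physical writes.
print(f"~{per_node_mbps:.0f} MB/s of disk writes per node")
```

With numbers in that range, a disk with limited bandwidth spends much of its budget on compaction, and foreground writes queue behind it, which is exactly where the P99 stalls come from.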

Hey,

I checked, and our disks are SSDs. You can see the disk IOPS below:

22:00 to 12:45 AM is when the 350M records were loaded.
12:45 to 2:00 AM is when the benchmark ran workloads A through F, except workload E.

node 1:

node 2:

node 3:

For node-1 and node-3, it looks like the disk limits (300 Mbps / 6000 IOPS) are being hit. This resource exhaustion is likely the cause of the higher P99 write latency. Reducing the concurrency (the threadcount variable in YCSB) will help improve the P99 latencies.
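Why reducing threadcount helps can be seen from Little's law (concurrency = throughput × latency): once the disks saturate, extra client threads add no throughput and only queue. A small sketch with illustrative numbers (the 20k ops/s ceiling and 2 ms service time are assumptions, not measurements of this cluster):

```python
# Little's law: in-flight ops = throughput * latency.
max_throughput = 20_000   # ops/s the cluster can sustain (illustrative)
service_time = 0.002      # 2 ms per op when there is no queueing (illustrative)

# The cluster can absorb this many concurrent ops without queueing:
sustainable = max_throughput * service_time   # 40 in-flight ops

for threads in (32, 64, 128, 256):
    if threads <= sustainable:
        latency = service_time                # no queueing yet
    else:
        latency = threads / max_throughput    # extra threads just wait in queue
    print(f"{threads:4d} threads -> ~{latency * 1000:.1f} ms avg latency")
```

Past the saturation point, doubling the thread count roughly doubles latency while throughput stays flat, so dialing threadcount down until the disks are just below their IOPS/bandwidth limits tends to give the best P99.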

We are working on some enhancements to better tune these bulk load scenarios.
