We have a cluster of 40 nodes (3 master nodes, 40 t-server nodes). Each machine has 64 CPUs, 256 GB RAM, and 6 NVMe disks (6.4 TB each).
It hosts 1 database with 2,000 non-colocated tables (50 GB to 30 TB each). Some tables are split into 100+ partitions. Each server holds roughly 10,000 or more tablets. The YugabyteDB configuration is the default.
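To put the tablet density in perspective, here is a rough back-of-the-envelope sketch using only the figures above. It assumes tablets are spread evenly across nodes and treats "10,000+" as a lower bound; the function name is only illustrative.

```python
# Back-of-the-envelope tablet density, using only the figures quoted in this post.
TSERVERS = 40
CPUS_PER_NODE = 64
RAM_GB_PER_NODE = 256
TABLETS_PER_TSERVER = 10_000  # "about 10,000+", so treat this as a lower bound


def tablet_density(tablets, cpus, ram_gb):
    """Return (tablets per CPU, MB of RAM per tablet) for a single node."""
    tablets_per_cpu = tablets / cpus
    ram_mb_per_tablet = ram_gb * 1024 / tablets
    return tablets_per_cpu, ram_mb_per_tablet


per_cpu, mb_per_tablet = tablet_density(
    TABLETS_PER_TSERVER, CPUS_PER_NODE, RAM_GB_PER_NODE
)
# Assumption: every t-server carries about the same number of tablets.
total_tablets_hosted = TABLETS_PER_TSERVER * TSERVERS

print(f"tablets per CPU:      {per_cpu:.1f}")
print(f"RAM per tablet (MB):  {mb_per_tablet:.1f}")
print(f"tablets hosted total: {total_tablets_hosted}")
```

So each CPU is responsible for more than 150 tablets, and each tablet gets at most about 26 MB of RAM before anything else is accounted for.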
When running queries of the “select one row by index” kind, we see abnormally high latency for the majority of queries (~13-15 seconds or more). We also found high CPU usage (15-20% on each node) without any workload. Neither problem occurs when performing operations on an empty cluster.
Could you give us some advice or best practices for this particular case? We tried modifying the configuration according to the “best practices” in the Yugabyte Docs, but it did not help.