YB-Master cluster balancer

We recently had an issue on our YugabyteDB cluster that looked like an election storm.
We saw far too many tablet elections happening, and yb-master's cluster balancer was also triggering a lot of leader rebalancing, which added even more elections and made the situation worse.
We were able to stabilise the cluster by identifying a tserver with a much larger number of threads than the other tservers and stopping it. [We don't know what started the election storm.]

As part of the action items to prevent this from happening again, we were wondering whether changing leader_balance_threshold from its default of 0 to a higher value would help.

YugabyteDB version: 2024.2.0.0

We tried this config change in our test environment.
Before: [user tablet-peers / leaders]
- tserver1: 445 / 148
- tserver2: 445 / 148
- tserver3: 445 / 149

  • Updated leader_balance_threshold to 3 on all yb-master VMs
  • Blacklisted tserver1 using the change_leader_blacklist command
    • This made tserver1: 445 / 0 and increased the leader count on the other two tservers
  • Removed the blacklist for tserver1
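For reference, the flag change and blacklist steps can be expressed as commands like the following (the flag and yb-admin subcommand names are real; the master addresses, the tserver address, and the config-file path are placeholders for our environment):

```shell
# 1. Set the flag on every yb-master via its gflags file, then restart each
#    master so it takes effect (file path is a placeholder).
echo "--leader_balance_threshold=3" >> /path/to/yb-master.conf

# 2. Blacklist tserver1's leaders so they drain to the other two tservers.
yb-admin -master_addresses m1:7100,m2:7100,m3:7100 \
    change_leader_blacklist ADD tserver1-ip:9100

# 3. Remove the blacklist so leaders can move back to tserver1.
yb-admin -master_addresses m1:7100,m2:7100,m3:7100 \
    change_leader_blacklist REMOVE tserver1-ip:9100
```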

The expected result was:
- tserver1: 445 / 147
- tserver2: 445 / 149
- tserver3: 445 / 149

What I actually got:
- tserver1: 445 / 40
- tserver2: 445 / 202
- tserver3: 445 / 203

I redid the whole exercise with leader_balance_threshold = 2: same result.
I redid it again with leader_balance_threshold = 1, and the result was:
- tserver1: 445 / 90
- tserver2: 445 / 177
- tserver3: 445 / 178

I am wondering whether the leader balancer works at the table level, balancing the tablets of each table and honouring the threshold per table, instead of at the cluster level. The documentation does not say anything about this.

Hi Manish,

Yes, leader_balance_threshold works at the table level. This is a gap in our documentation, as you pointed out. I have opened a PR to fix it.
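To see why per-table balancing produces cluster-level skew, here is a toy simulation (my own sketch, not YugabyteDB source code). It assumes the rule "for each table, move leaders only while some tserver holds more than leader_balance_threshold leaders of that table", starting from the post-unblacklist state where tserver1 leads nothing:

```python
# Toy model of per-table leader balancing. NOT YugabyteDB code: the balancing
# rule below is an assumption used to illustrate the observed skew.

def balance_table(leaders, threshold):
    """Balance ONE table's leader counts (dict: tserver -> leader count).

    Assumed rule for threshold > 0: keep moving a leader from the most loaded
    tserver to the least loaded one while some tserver holds more than
    `threshold` leaders of this table and the move actually helps.
    (threshold = 0 means "balance optimally" and is not modelled here.)
    """
    while max(leaders.values()) > threshold:
        src = max(leaders, key=leaders.get)
        dst = min(leaders, key=leaders.get)
        if leaders[src] - leaders[dst] <= 1:
            break  # moving would just shift the imbalance around
        leaders[src] -= 1
        leaders[dst] += 1
    return leaders

def cluster_totals(threshold):
    """Sum per-table results over a hypothetical mix of small and large
    tables, with ts1 starting at zero leaders everywhere (as after removing
    a leader blacklist)."""
    tables = (
        [{"ts1": 0, "ts2": 3, "ts3": 2} for _ in range(60)]    # small tables
        + [{"ts1": 0, "ts2": 5, "ts3": 4} for _ in range(20)]  # larger tables
    )
    totals = {"ts1": 0, "ts2": 0, "ts3": 0}
    for table in tables:
        for ts, count in balance_table(table, threshold).items():
            totals[ts] += count
    return totals

# Small tables never cross a threshold of 3, so ts1 stays starved for them;
# with a threshold of 1 more tables rebalance, but ts1 still ends well below
# the other two tservers.
print(cluster_totals(3))
print(cluster_totals(1))
```

Qualitatively this matches the numbers above: a higher threshold leaves tserver1 far below the other two (the 40 / 202 / 203 pattern), and a threshold of 1 recovers it only partially (the 90 / 177 / 178 pattern), because each table stops moving leaders once its own per-table counts are within the threshold, regardless of the cluster-wide totals.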

Anubhav Srivastava (YB)

> As part of Action Items to mitigate this from happening again we were wondering if changing the value for leader_balance_threshold from default 0 to a higher value would help.

@Anubhav_Srivastava, can you clarify this? My understanding is that it would help during an election storm: if we increase the value high enough, it would reduce the yb-master-triggered leader rebalancing, leading to fewer elections. Am I right about this?

Increasing the leader_balance_threshold flag might help here, but it wouldn’t fix the underlying cause of the election storm. If you can figure out what is causing that (e.g., a certain query causing high load on a tserver, an internal deadlock, etc.), that would be useful.

If you’re looking for more of a patch fix, a more typical approach is to lower the value of load_balancer_max_concurrent_moves to something like 5, so that the cluster balancer makes fewer concurrent leader moves in each iteration (the cluster balancer runs around once a second, so that is ~5 leader moves/s). Note that this would increase the amount of time for operations like leader blacklisting and moving leaders back to failed nodes.
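If you go this route, the change is a single yb-master flag (the flag name is real; where the gflags file lives is deployment-specific):

```
# Append to each yb-master's gflags file, then restart the master.
--load_balancer_max_concurrent_moves=5
```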