What happens if I run more yb-master nodes than the replication factor?

I am trying to set up YugabyteDB and was writing automation to set up a cluster.
The automation would become a bit complicated if I had to manage master VMs and tserver VMs separately; setting up automation to bring up both a master and a tserver on the same machine is straightforward.

The issue I am wondering about is: what if I need a cluster with 7 tservers and also bring up 7 master servers along with them when creating the cluster? [Replication Factor: 3]
All the documentation I have come across just mentions that the number of master nodes should equal the RF. What happens if I maintain a larger number of master nodes? What will go wrong (or what potential issues might come up)?

Hi @Manish

How are you deploying? Using yugabyted?

Hi @dorian_yugabyte, deploying manually as mentioned here: Manual deployment of YugabyteDB clusters | YugabyteDB Docs

I am trying to set up deployment automation via Chef cookbooks.
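
For context, this is roughly what the cookbook would bring up on each machine in the manual-deployment style, with both processes colocated. This is only a sketch; the node names, addresses, and paths below are placeholders, not from any official example:

```sh
# yb-master (only on the nodes chosen to host masters), placeholder addresses/paths:
./bin/yb-master \
  --master_addresses=node1:7100,node2:7100,node3:7100 \
  --rpc_bind_addresses=node1:7100 \
  --fs_data_dirs=/data/yb-master \
  --replication_factor=3 >& /data/yb-master.out &

# yb-tserver (on every node), pointing at the same master list:
./bin/yb-tserver \
  --tserver_master_addrs=node1:7100,node2:7100,node3:7100 \
  --rpc_bind_addresses=node1:9100 \
  --fs_data_dirs=/data/yb-tserver \
  --start_pgsql_proxy \
  --pgsql_proxy_bind_address=node1:5433 >& /data/yb-tserver.out &
```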

Currently yugabyted starts a yb-master on every node, even beyond RF. It keeps the yb-masters beyond RF out of the cluster and joins them in when it sees that there are fewer than RF active.

But we haven’t documented how it works, so you’ll have to look at the source code for now if you want to replicate it.

I understand that I will have to implement this myself in the cookbook.
What I wanted to understand more is: why is there a need to restrict the number of masters to the RF?
What will go wrong if I run more masters than RF?
I could not find an answer to this anywhere.

Because the yb-masters form a Raft group and handle replication/failover/consistency of the metadata (just like the yb-tservers do for data).

Read YB-Master service | YugabyteDB Docs for more info.

Hi @Manish, you can start more yb-masters if it makes your automation easier, but do not list more than RF of them in --master_addresses and --tserver_master_addrs. Listing more would increase the replication factor of the yb-master’s tablet and slow down writes to the yb-master (like DDL), because each write waits for acknowledgment from a majority.
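
To make that concrete with the 7-node example above (node names are hypothetical): even if a yb-master process is started on every node, only RF = 3 addresses should appear in the lists:

```sh
# Hypothetical 7-node cluster with RF = 3: only three master addresses are listed.
MASTERS="node1:7100,node2:7100,node3:7100"

# On node1..node3 (the active master set):
./bin/yb-master --master_addresses=$MASTERS ...

# On all seven nodes:
./bin/yb-tserver --tserver_master_addrs=$MASTERS ...
```

The extra yb-master processes on the remaining nodes then sit outside the Raft group until one of them is explicitly added (e.g. via yb-admin change_master_config), which, as noted above, is essentially what yugabyted automates.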

Just like the OP, I did not have a clue that yugabyted simply keeps the yb-masters beyond the RF (3) as hot standbys.

This ties in to the YugabyteDB smart drivers, as they get updated with the promoted yb-masters. So if you’re not using the smart drivers, a multi-master outage (let’s say updating the servers one by one) may result in software losing access to the (retired) yb-masters (as different yb-masters are now active). But it is not clearly communicated (imho) that you should use the smart drivers because of this behavior.

To be honest, a lot of the information about YugabyteDB is rather confusing, as it’s a mix of the older yb-tserver/yb-master documentation and the newer yugabyted documentation.

It’s one of those things that made me try CockroachDB first, as its installation was easier, whereas YugabyteDB goes down an entire path of yb-tservers, yb-masters, how many masters, … and LLMs do not pick yugabyted as the first option.

Note: it might be interesting to show this visually in the UI, e.g. Active Masters 3/3, Hot Spares 5/5.

Correct.

The smart drivers don’t connect to yb-masters. No driver does.
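
To make that concrete (the ports below are just the defaults, and the URL is only an illustration): clients, smart or not, only ever talk to the yb-tserver endpoints, never to yb-master.

```sh
# Connect to any yb-tserver's YSQL endpoint (default 5433); YCQL is 9042.
# yb-master's RPC port (7100) is internal/admin only -- no driver uses it.
./bin/ysqlsh -h <some-tserver> -p 5433 -U yugabyte

# A "smart" driver adds cluster awareness on top of the same YSQL endpoint,
# e.g. an illustrative JDBC smart-driver URL:
#   jdbc:yugabytedb://<some-tserver>:5433/yugabyte?load-balance=true
```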

yugabyted just starts yb-tserver/master and manages them. It’s tserver/master all the way down.

It’s true that having a single server type is easier to manage, while having 2 layers (data & metadata) is more efficient in larger clusters.

I’ve found LLMs a bit unhelpful. They are also out of date, depending on when they crawled.

The hot spares aren’t part of the cluster, though. They’re only known to the yugabyted CLI.

But if they are hot spares, what technically makes them part of a cluster/universe? Do the yugabyted daemons also communicate with each other, so as to allow for the hot-spare activation?

You mean the layer above the yb-masters, like YSQL and YCQL, as those are the client communication layers?

This complexity of layers explains why there are so many ports involved. lol

@Benjiro

But if they are hot spares, what technically makes them part of a cluster/universe? Do the yugabyted daemons also communicate with each other, so as to allow for the hot-spare activation?

yugabyted creates shell masters on all the nodes. Shell masters are not part of the universe and in no way interact with the cluster. A shell master gets promoted to a master, or a master gets stepped down, only when the yugabyted configure data_placement command is executed. yugabyted daemons do not communicate with each other; yugabyted is a local-only daemon.

The configure data_placement command gets the --cloud_location and --fault_tolerance details from all the nodes in the cluster to compute the placement of the masters, and places the yb-master processes in the appropriate fault domains. This gives users the flexibility to bring up the yugabyted nodes in any order and doesn’t force them to start the nodes in a particular fault-domain order, as is the case with the yb-master/yb-tserver way of deployment.
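
A minimal sketch of that flow, with made-up addresses and cloud locations (zone-level fault tolerance assumed):

```sh
# First node (placeholder address/location):
./bin/yugabyted start \
  --advertise_address=10.0.0.1 \
  --cloud_location=aws.us-east-1.us-east-1a \
  --fault_tolerance=zone

# Subsequent nodes join the first one:
./bin/yugabyted start \
  --advertise_address=10.0.0.2 \
  --join=10.0.0.1 \
  --cloud_location=aws.us-east-1.us-east-1b \
  --fault_tolerance=zone

# After all nodes are up, compute master placement across the fault domains
# (this is when shell masters get promoted or masters get stepped down):
./bin/yugabyted configure data_placement --fault_tolerance=zone
```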

Thanks for the feedback on the docs. We are making yugabyted a first-class citizen in the docs; most of the docs should be refreshed with yugabyted steps by the 2025.1 release.

Thanks.
Nikhil Chandrappa
Lead Engineer, YugabyteDB OSS
