What is the behaviour of connection failure between masters in multi-master setup?


Forgive the naivity of these questions. I am quite new to developing distributed systems.

I am building a small distributed system for a small business with two servers, one local and one in the cloud (so that if local environment loses internet connection the client can still access local and if local fails the client can still access cloud). Both environments should be able to operate without the other.

If the local environment loses internet, there will be two functioning but disconnected servers. What would Yugabyte’s behaviour be in this instance? Will it create two independent clusters? When the connection is restored do they recluster? How could the data resynced? What if a record has been updated in both places?

I hope all of this makes sense. I may we’ll be way of the field in my understanding of these things. Any help appreciated!


Hi @guydewinton

Welcome to YugabyteDB Forum!

You need an odd number of servers and we recommend a minimum of 3 in a cluster. See replication docs: https://docs.yugabyte.com/latest/deploy/checklist/#replication.

In this case, the 2 remaining servers will continue serving reads + writes.

Technical Support Engineer

Another way is to use 2DC asynchronous replication. In case of conflicts, the last write wins.

Thank you for your reply. I have begun a deep dive into your docs as well as getting to know the RAFT consensus algorithm and Kubernetes. It is starting to make sense. I must say that I am blown away by the amount of intelligence that has gone into all of these technologies.

I have a few more questions that have been percolating on the way…

  1. If I have a three or five node cluster and one node goes down or there is a network partition, how would the remaining 2 or 4 nodes handle an additional failure (there now being an even number)?

  2. Can the client connect to any node which will forward the requests to the cluster master? If, for instance, the webserver shares a machine with a node, does it just connect to localhost?

  3. I am using Django to handle the requests. Can I just use my existing Postgres driver (same setup - just changing port and default username)?

  4. Could I make the nodes (2) on the local network read/write and have the cloud node read only so that if there is a network partition the local cluster will continue to function fully while the cloud will continue to serve reads? In this case is there some communication it could offer the client to let it know it is now read only?

I hope it is ok to dump these disparate questions here.

Many thanks.

Please see how replication works in https://docs.yugabyte.com/latest/deploy/checklist/#replication.

More generally, if RF is n , YugabyteDB can survive (n - 1) / 2 failures without compromising correctness or availability of data.

You can connect to any host. And the requests are always routed.

Yes. Please use version 2.2 which includes support for deferred constraints that are required by django.

This will work. We advise to have 3 regions in this case. Whatever region you lose, you will still be able to write data to the cluster. Is there a problem you can’t use 3 regions ?

You can also set preferred zones to the localhost node(s) if that’s your primary write node.

It’s ok!

Amazing. Thank you. Really useful and interesting. A lot to be getting on with!

1 Like