Deploy questions

Hi

I am new here and want to set up a test deployment on VPSs at my own provider (not GCP, Azure, or Amazon).

Some things I am missing from the docs (I guess):

  1. I will use 3 nodes and install them one by one, but the Manual Deployment docs suggest I have to know all the IP addresses I will use in advance. Is this correct?
  2. Following on from the point above: if I later add nodes, what do I have to do? Do I need to re-install the first 3 nodes?
  3. I will have 3 nodes with 3 public IP addresses, so I guess I have to set up a load balancer to spread requests over the 3 nodes. The documentation gives no hint about this, and the GCP script does not contain any load balancing either. Otherwise, how would I connect to the cluster from my dev code? Choosing a single IP address seems to go against the distributed idea of tolerating failed nodes.
  4. What about updates? If YB releases a new version, how does this work? Do I stop 1 node, update it to the new version, bring it back up, and then move to the next node?

Thanks for some clarifications on the above.

Hi @stefandevo

The minimum is 3 nodes with Replication Factor 3, so you’ll have their IPs available from the start, no?

See how to change cluster config by adding/removing nodes: Change cluster configuration | YugabyteDB Docs.

You just need to update their configuration so they’re aware of the new nodes.
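As a rough illustration of why the addresses matter up front: each node is started with a comma-separated list of master addresses, so a new node only has to be pointed at the existing masters. The sketch below is a minimal helper that builds that list. It assumes the default master RPC port 7100 and the `--tserver_master_addrs` flag of `yb-tserver` as described in the manual deployment docs; the IP addresses are placeholders, so check the flag and port against the docs for your version.

```python
def master_addresses(ips, rpc_port=7100):
    """Join master IPs into the comma-separated ip:port form.

    rpc_port=7100 is assumed to be the default master RPC port.
    """
    return ",".join(f"{ip}:{rpc_port}" for ip in ips)


def tserver_master_flag(master_ips):
    """Build the flag a new tablet server would be started with (assumed flag name)."""
    return f"--tserver_master_addrs={master_addresses(master_ips)}"


if __name__ == "__main__":
    # Placeholder addresses for the three initial nodes.
    masters = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    print(tserver_master_flag(masters))
```

A fourth node started with this flag would register with the existing masters, which is why the first three nodes do not need to be re-installed.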

For YCQL (Cassandra), you don’t need a load balancer: you can list all IPs in your client driver and it handles failover.

For YSQL, you can use any TCP load balancer, or use the driver’s support for multiple hosts in the connection string.

For upgrades, see the steps here: Upgrade a deployment | YugabyteDB Docs

Hi @dorian_yugabyte thanks for the answers!

Coming back to the load balancer question: so basically the client drivers do the load balancing? How do they decide which IP to use? Or will the driver just take the first and, if it fails, take the second? In that case, if all nodes are up, does all the “load” go to the first IP address? Or am I missing something?

In YCQL, the clients are “smart” and will contact the node that holds the relevant partition.

In YSQL, what you describe does happen by default: the driver tries the IPs one by one in the order listed, so while the first node is up, new connections go to it. You can mitigate this by shuffling the list of IPs before creating each new connection. The driver will still try them one by one, but the order won’t be the same every time.
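The shuffling idea can be sketched in a few lines. The example below is in Python purely for illustration; it assumes a libpq-based driver that accepts a comma-separated `host=` list in the connection string. The IP addresses, port 5433, and the `yugabyte` database and user names are placeholder assumptions, not values taken from this thread.

```python
import random


def shuffled_dsn(hosts, port=5433, dbname="yugabyte", user="yugabyte", rng=None):
    """Build a libpq-style multi-host DSN with the hosts in random order.

    The driver still tries hosts left to right; shuffling only changes
    which host ends up first for each new connection.
    """
    rng = rng or random.Random()
    order = list(hosts)  # copy so the caller's list is not mutated
    rng.shuffle(order)
    return f"host={','.join(order)} port={port} dbname={dbname} user={user}"


if __name__ == "__main__":
    # Placeholder node addresses.
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    print(shuffled_dsn(nodes))
```

Calling `shuffled_dsn` each time a connection (or pool) is created spreads new connections across the nodes, while the multi-host list still gives failover if the first host is down.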

@stefandevo we’re also working on a smart JDBC driver for YSQL: GitHub - yugabyte/jdbc-yugabytedb: JDBC Driver for Yugabyte SQL (YSQL).

My dev environment is C# .NET Core, so I’m not sure this will be useful for me, but I’ll keep an eye on it. In the meantime, I can easily put a load balancer in front of the nodes.


Hi Stefandevo,
I’ll answer your questions here as best I can.

  1. Yes, you need the IP addresses of all the nodes, so each node can access the others.
  2. You can add a fourth (and fifth, etc.) node to the existing cluster; you don’t need to remove or re-install the existing nodes.
  3. It’s not mandatory to use a load balancer, although we do recommend one for production environments. A client can connect to any active cluster node and issue queries against the whole database. If you use Yugabyte-specific drivers (e.g. for Java) or the Yugabyte-specific CLI (ysqlsh), these can be given the IP addresses of all active nodes in their connection strings, so they can connect to multiple nodes for improved performance and resilience.
  4. With YugabyteDB community edition, updates are a manual process, as you describe. With a licensed support contract, you have access to the Yugaware Platform environment which automates all aspects of cluster creation, expansion, update, backup, etc.
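The manual, node-by-node update from point 4 can be sketched as a simple loop. This is only an outline of the ordering (stop, update, restart, wait until healthy, then move on), not the official procedure: the `stop`, `upgrade`, `start`, and `is_healthy` callables are hypothetical hooks you would implement yourself (for example over SSH), and the actual per-process steps are in the Upgrade a deployment docs.

```python
import time


def rolling_upgrade(nodes, stop, upgrade, start, is_healthy,
                    timeout_s=300.0, poll_s=5.0):
    """Upgrade nodes one at a time, waiting for each to recover first.

    All four callables are hypothetical hooks supplied by the caller;
    each takes a node identifier (e.g. its IP address).
    """
    done = []
    for node in nodes:
        stop(node)
        upgrade(node)
        start(node)
        # Never touch the next node until this one is healthy again,
        # so at most one replica is offline at any time.
        deadline = time.monotonic() + timeout_s
        while not is_healthy(node):
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"{node} not healthy after upgrade; upgraded so far: {done}")
            time.sleep(poll_s)
        done.append(node)
    return done
```

Aborting on the first unhealthy node is a deliberate choice: with Replication Factor 3, taking a second node down while the first is still recovering would leave the cluster without a majority.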

I hope this helps, and thank you for using YugabyteDB

Best regards

Doug
