How to create a multi-cluster deployment on Kubernetes?

Hi there,

We are currently operating Kubernetes clusters (1.12+) in GKE and AWS (kops). We are looking to expand to Alibaba for China and Azure for Africa. We are looking for a globally distributed, strongly consistent database. We are interested in YugabyteDB primarily because of YEDIS, which allows us to start using YugaByteDB with next to no rewrites needed in the code (except we need to use smembers instead of SSCAN which can negatively affect read performance, at least on classic Redis).

We would like to deploy YugaByteDB using helm charts, but we do have a couple of questions.

  1. Why is there no repository available for YugaByteDB?
  2. How do we create a multi-cluster setup using the Helm charts, at least on GKE so we can start testing it in multiple regions?
  3. How performant is Yedis compared to the other APIs?

Looking forward to having a discussion about this topic! Cheers

What do you mean by no repository for YugaByteDB? We do have a source code repository on GitHub, and the Docker images are on Docker Hub.

Source Code
Docker image

I am assuming you have set up multi-region clusters in GKE that can talk to each other. Right now, GKE supports regional clusters and zonal clusters out of the box.

I would let @amitanand or @karthik share the performance comparison.

Many companies such as Bitnami have a public chartmuseum repository to make it easy and straightforward to install software using helm. See: https://github.com/bitnami/charts.

That way we don’t have to clone the repo or create some kind of CI integration for this in order to keep our software up-to-date. Also it makes it more straightforward to deploy the software for everyone.

They don’t actually talk to each other, that would probably have to be done using a service mesh or public load balancers.

Ah, yes, I am currently working on that and should wrap it up by the end of this month or so. For now, we have forked the open-source charts repo: yugabyte chart

Hi @thecodeassassin-mycu,

I think this is the core question. @ramkumar will post the set of steps to get you going.

Here is the design doc for it: Multi-Zone and Multi-Cluster YB deployment in Kubernetes

Note the following requirement for a multi-cluster Kubernetes setup:

The clusters must have connectivity among themselves; that is, the pods in one cluster must be able to resolve the FQDNs of the pods in the other cluster(s).

To answer your question on Yedis:

Yedis is comparable in performance to the other APIs. Simple key-value read operations are generally sub-millisecond. However, note that no active work is happening on Yedis at the moment. Please read the note here for more details: https://docs.yugabyte.com/latest/yedis/

Here are the instructions for running YugaByte DB on a regional (multi-zone) cluster on GKE. The same instructions can be used for multi-region clusters as well.

Pre-Req:

  1. Create a regional cluster with a minimum of 4 nodes, with 8 CPUs each
  2. Clone YugaByte charts repository: https://github.com/YugaByte/charts.git
  3. Make sure Helm 2.8.0+ is installed.

Installation Steps:
Step 1: Create Storage Class and Overrides template files
$ mkdir yb-multi-az
$ cat > yb-multi-az/storage-class-template.yml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard-##zone##
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: none
  zone: ##zone##
EOF

$ cat > yb-multi-az/overrides-template.yml <<'EOF'
isMultiAz: True
AZ: "##zone##"
storage:
  master:
    storageClass: "standard-##zone##"
  tserver:
    storageClass: "standard-##zone##"
masterAddresses: "yb-master-0.yb-masters.yb-demo-##zone1##.svc.cluster.local:7100, yb-master-0.yb-masters.yb-demo-##zone2##.svc.cluster.local:7100, yb-master-0.yb-masters.yb-demo-##zone3##.svc.cluster.local:7100"
replicas:
  master: 1
  tserver: 1
  totalMasters: 3
gflags:
  master:
    placement_cloud: "kubernetes"
    placement_region: "##region##"
    placement_zone: "##zone##"
  tserver:
    placement_cloud: "kubernetes"
    placement_region: "##region##"
    placement_zone: "##zone##"
EOF

Step 2: Setup RBAC
$ kubectl apply -f charts/stable/yugabyte/yugabyte-rbac.yaml

Step 3: Generate zone-specific overrides
First, fetch the zone and region labels by running this command:
$ kubectl get nodes -Lfailure-domain.beta.kubernetes.io/region -Lfailure-domain.beta.kubernetes.io/zone

For example, say your region label is us-west1 and your zone labels are us-west1-a, us-west1-b, and us-west1-c.
You would generate the zone-specific overrides as follows.

The commands below are for one zone (us-west1-a). Repeat this step for each of the zones.
$ sed 's/##zone##/us-west1-a/g' yb-multi-az/storage-class-template.yml > yb-multi-az/storage_class_us-west1-a.yml
$ sed 's/##zone##/us-west1-a/g; s/##region##/us-west1/g; s/##zone1##/us-west1-a/g; s/##zone2##/us-west1-b/g; s/##zone3##/us-west1-c/g' yb-multi-az/overrides-template.yml > yb-multi-az/overrides-us-west1-a.yml
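
Since the two sed commands have to be repeated for every zone, a small loop can save some typing. This is only a sketch: generate_zone_files is a hypothetical helper name, and it assumes both template files from Step 1 live in the directory passed as the first argument and that the universe spans exactly three zones.

```shell
# Sketch: generate the per-zone storage class and overrides files from the templates.
# Arguments: <template dir> <region> <zone1> <zone2> <zone3>
# The function body runs in a subshell so its variables do not leak into the caller.
generate_zone_files() (
  dir="$1"; region="$2"
  zone1="$3"; zone2="$4"; zone3="$5"
  for zone in "$zone1" "$zone2" "$zone3"; do
    # Storage class: only the ##zone## placeholder needs substituting.
    sed "s/##zone##/${zone}/g" \
      "${dir}/storage-class-template.yml" > "${dir}/storage_class_${zone}.yml"
    # Overrides: substitute this zone, the region, and the three master zones.
    sed "s/##zone##/${zone}/g; s/##region##/${region}/g; s/##zone1##/${zone1}/g; s/##zone2##/${zone2}/g; s/##zone3##/${zone3}/g" \
      "${dir}/overrides-template.yml" > "${dir}/overrides-${zone}.yml"
  done
)
```

With the example labels from this thread, the call would be: generate_zone_files yb-multi-az us-west1 us-west1-a us-west1-b us-west1-c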

Step 4: Create the storage class
Note: repeat this command for all the zones
$ kubectl apply -f yb-multi-az/storage_class_us-west1-a.yml

Step 5: Initialize helm and have it use the service account we created in Step 2:
$ helm init --service-account yugabyte-helm --upgrade --wait

Step 6: Run helm install.
Note: repeat this command for all the zones
$ helm install charts/stable/yugabyte --namespace yb-demo-us-west1-a --name yb-demo-us-west1-a -f yb-multi-az/overrides-us-west1-a.yml
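
To repeat the install for every zone, something like the loop below could be used. It is a sketch rather than an official script: install_all_zones and the DRY_RUN variable are hypothetical names introduced here, and the helm command is the same Helm 2 invocation shown above. By default it only prints the commands so they can be reviewed before anything is installed.

```shell
# Sketch: run the Step 6 helm install once per zone.
# DRY_RUN=1 (the default) prints each command; set DRY_RUN=0 to actually run them.
install_all_zones() (
  for zone in "$@"; do
    cmd="helm install charts/stable/yugabyte --namespace yb-demo-${zone} --name yb-demo-${zone} -f yb-multi-az/overrides-${zone}.yml"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$cmd"
    else
      # Unquoted on purpose so the string splits into command and arguments;
      # this relies on zone names containing no whitespace.
      $cmd
    fi
  done
)
```

For example, install_all_zones us-west1-a us-west1-b us-west1-c prints the three install commands, and DRY_RUN=0 install_all_zones us-west1-a us-west1-b us-west1-c executes them.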

Step 7: Wait for all the containers to come up
The command below lists all the pods that are specific to YugabyteDB.
$ kubectl get pods --all-namespaces -lcomponent=yugabytedb
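
If you would rather script the wait than watch the output, a filter along these lines may help. It is a rough sketch: all_pods_ready is a hypothetical helper, and it assumes the column layout of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Sketch: read pod lines from stdin and succeed only if there is at least one pod
# and every pod has STATUS "Running" with all of its containers ready (e.g. 1/1).
all_pods_ready() {
  awk '
    { split($3, r, "/"); seen = 1 }                  # READY column, e.g. "1/1"
    $4 != "Running" || r[1] != r[2] { bad = 1 }      # STATUS column
    END { exit (bad || !seen) }
  '
}
```

It could be combined with the command above in a polling loop, for example: until kubectl get pods --all-namespaces -lcomponent=yugabytedb --no-headers | all_pods_ready; do sleep 10; done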

Step 8: Access the db admin UI
The command below fetches all the services that are exposed via a LoadBalancer.
$ kubectl get svc --all-namespaces -lcomponent=yugabytedb --field-selector=metadata.name=yb-master-ui

If you run into issues, please hop on our community Slack channel and one of us can help debug.

@thecodeassassin-mycu - let us know if the above instructions worked for you. Please feel free to reach out if you need any help.

@karthik Could you perhaps post the actual yaml files? It’s hard to format them correctly.

Also it’s still not really clear to me how to apply this to multiple clusters and have them appear as a single YugabyteDB. We already run a regional cluster so most of the pods are already spread out correctly so this doesn’t really solve anything for us.

Yes, totally @thecodeassassin-mycu

@arnav - could you please post yaml files for this scenario?

@thecodeassassin-mycu Below is the overrides file snippet. I’ve also added comments to make the YAML more meaningful and hopefully answer your question on how the multiple helm deployments over multiple clusters combine into a single universe.

# Whether the deployment spans multiple zones.
isMultiAz: True

# Set this to deploy this deployment's pods specifically on nodes in a particular zone
# (affinitized using the node label failure-domain.beta.kubernetes.io/zone).
AZ: "<zone>"

# In a single cluster spanning multiple zones, the helm deployment cannot control where the PVC is placed.
# So to pin the pods to the required zone, we need to ensure that there is a storage class created just for that zone.
storage:
  master:
    storageClass: "standard-<zone>"
  tserver:
    storageClass: "standard-<zone>"

# The master addresses are how YugaByteDB finds the rest of the members of the requested universe, which may have been deployed
# by different helm deployments (and, by extension, in different zones/clusters).
masterAddresses: "yb-master-0.yb-masters.yb-demo-<zone1>.svc.cluster.local:7100, yb-master-0.yb-masters.yb-demo-<zone2>.svc.cluster.local:7100, yb-master-0.yb-masters.yb-demo-<zone3>.svc.cluster.local:7100"

# The number of pods of each service that you would like to bring up in this particular helm deployment.
# totalMasters specifies the total number of masters in the universe (so essentially the required replication factor).
# The number of masters also specifies the minimum number of replicas in that zone, so always keep master <= tserver.
replicas:
  master: 1
  tserver: 1
  totalMasters: 3

# These flags specify the placement information to the master and ensure that data is replicated according to
# the user's request.
gflags:
  master:
    placement_cloud: "kubernetes"
    placement_region: "<region>"
    placement_zone: "<zone>"
  tserver:
    placement_cloud: "kubernetes"
    placement_region: "<region>"
    placement_zone: "<zone>"
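
The masterAddresses value above follows a fixed pattern: one yb-master-0 entry per zone namespace, on port 7100. A minimal sketch for building it, assuming the yb-demo-<zone> namespace naming used in this thread (master_addresses is a hypothetical helper name, not part of the chart):

```shell
# Sketch: print the comma-separated masterAddresses string for the given zones.
# Assumes one master per zone, in a namespace named yb-demo-<zone>.
master_addresses() (
  addrs=""
  for zone in "$@"; do
    addr="yb-master-0.yb-masters.yb-demo-${zone}.svc.cluster.local:7100"
    if [ -z "$addrs" ]; then
      addrs="$addr"
    else
      addrs="${addrs}, ${addr}"
    fi
  done
  printf '%s\n' "$addrs"
)
```

Calling master_addresses us-west1-a us-west1-b us-west1-c prints the string to paste into the overrides file. Every helm deployment in the universe should be given the same list, since this is how the masters find each other.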

The snippet for creating the storage class for a particular zone is the following:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard-us-west1-a
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: none
  zone: us-west1-a
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard-us-west1-b
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: none
  zone: us-west1-b
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard-us-west1-c
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: none
  zone: us-west1-c
---