Accessing YugaByteDB on k8s over 9042 externally

Hello,
We are running an evaluation of YugabyteDB on a k8s platform. We are using a Java client (the Java DSE/Cassandra driver) to access the DB (Cassandra-compatible) on port 9042, both internally (from the same k8s namespace) and externally (from outside k8s).

Our observations and questions:

a) While trying to access Yugabyte internally with a Java client (microservices as sidecar containers within the same k8s namespace), we are able to connect using the tservers' pod IPs. The problem with this approach is that pod IPs are not static and will eventually change. So what is the best approach to connect to YB with a Cassandra-style Java driver, given that the driver needs to know the IPs of all tservers participating in the YugabyteDB cluster? Can we use the k8s cluster IP (instead of pod IPs) as the tserver contact point? Will the DSE/Cassandra driver then be able to route traffic to the right pods?

b) While trying to access YugabyteDB externally with a Java client using the Cassandra driver, we created a LB against the DB service (Cassandra, port 9042) and used the LB endpoint. We are able to reach only one tserver pod IP. Once again, this is because of how the Java DSE/Cassandra driver works: it needs to know all the pod IPs. What is the best practice for accessing YB from outside k8s? BTW, our external IP in k8s is not a public IP; it is private and not accessible from outside k8s.


Hello @palashg, glad you are trying out YB.

To answer your questions,

(a) You can either use the cluster IP as a contact point (preferable) or use the full list of pod DNS addresses as your client contact points. Pod DNS names are stable because the default helm chart deploys the pods with a StatefulSet; however, scaling the cluster up or down can invalidate some of those addresses, so a cluster IP avoids that problem.
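The pod-DNS option above can be sketched in Java. This is a minimal sketch, assuming the default helm chart naming (pods `yb-tserver-0..N-1` behind a headless service named `yb-tservers`); the namespace and replica count are placeholders you would substitute for your deployment:

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

public class ContactPoints {
    // Build contact points from the stable StatefulSet pod DNS names
    // instead of ephemeral pod IPs. Assumes the default helm chart naming:
    // yb-tserver-<i>.yb-tservers.<namespace>.svc.cluster.local (adjust to
    // your release names if they differ).
    static List<InetSocketAddress> tserverContactPoints(String namespace, int replicas) {
        List<InetSocketAddress> points = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            String host = String.format(
                "yb-tserver-%d.yb-tservers.%s.svc.cluster.local", i, namespace);
            // Unresolved on purpose: resolution happens inside the cluster.
            points.add(InetSocketAddress.createUnresolved(host, 9042));
        }
        return points;
    }

    public static void main(String[] args) {
        // With the DataStax 4.x driver these would be passed via
        // CqlSession.builder().addContactPoints(...).
        for (InetSocketAddress p : tserverContactPoints("yb-demo", 3)) {
            System.out.println(p.getHostString() + ":" + p.getPort());
        }
    }
}
```

The caveat noted above applies: if you scale the StatefulSet, the list must be regenerated, which is why the cluster IP is the lower-maintenance choice.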

(b) With a GKE load balancer, I see that connections through the LB are sent to individual tservers in round-robin fashion - I've shared an example with ysqlsh below. The Cassandra driver should be unable to reach individual pod IPs from outside, so we would expect it to open multiple connections to the LB, which should get mapped to different pods. Is there a connection pool size limit of some kind? How did you determine that only one tserver pod was used?

sanketh@varahivm yugabyte-db > master >  ~/yugabyte-2.19.2.0/bin/ysqlsh -h 34.145.17.130 -c "select inet_server_addr()"
 inet_server_addr
------------------
 10.48.8.5
(1 row)

sanketh@varahivm yugabyte-db > master >  ~/yugabyte-2.19.2.0/bin/ysqlsh -h 34.145.17.130 -c "select inet_server_addr()"
 inet_server_addr
------------------
 10.48.0.11
(1 row)

sanketh@varahivm yugabyte-db > master >  ~/yugabyte-2.19.2.0/bin/ysqlsh -h 34.145.17.130 -c "select inet_server_addr()"
 inet_server_addr
------------------
 10.48.0.11
(1 row)

sanketh@varahivm yugabyte-db > master >  ~/yugabyte-2.19.2.0/bin/ysqlsh -h 34.145.17.130 -c "select inet_server_addr()"
 inet_server_addr
------------------
 10.48.8.5
(1 row)

Hello @palashg,
Thanks for trying out YugabyteDB.

A couple of additional things to check:

Selectors in the Service are configured correctly to match all pods that are part of the backend.

sessionAffinity is not configured on the Service.


Thank you for the quick response.

A) Here is the standard selector configuration referring to the tserver pods:

selector:
  app: yb-tserver
type: LoadBalancer

No session affinity is configured.

B) ClusterIP as a contact point is not working - once again, the Java driver is looking for the direct pod IPs.

Caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you’ve provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=/xxx.xxx.xxx.xxx:9042, hostId=null, hashCode=60db06c2): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|connecting…]

We will try to see how to implement the other option (pod DNS addresses).

Just to clarify, are you using ClusterIP for internal access - case (A) above or external access - case (B) above?

Regarding the LB, what kind of LB controller are you using? Also, how do you see that a single tserver pod gets all the traffic - is this what the metrics show?

are you using ClusterIP for internal access - case (A) above or external access - case (B) above?

Response - Case (A) above, for internal access. We can't use the cluster IP in case (B) because the k8s cluster IP is private and not reachable from outside k8s.

For case (B): when our application runs outside k8s, the application log (Cassandra DB driver log) clearly shows that it can't reach any of the pod IPs except one. It seems the DB driver has its own load-balancing logic: once it obtains all the pod IPs from the LB endpoint, it tries to connect directly to the tserver pods via their respective IPs.
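For this discovery problem, the DataStax 4.x driver exposes an `AddressTranslator` extension point that rewrites the node addresses the driver discovers before it dials them. Below is a minimal sketch of that translation logic as a plain class, assuming every discovered pod IP should be mapped to the single externally reachable LB endpoint; the LB hostname is a placeholder. In real use the class would implement `com.datastax.oss.driver.api.core.addresstranslation.AddressTranslator` and be registered via the `advanced.address-translator.class` driver setting:

```java
import java.net.InetSocketAddress;

// Sketch of an address-translation strategy for external access: every
// private pod IP the driver learns from system tables is rewritten to the
// one LB endpoint that is reachable from outside the cluster, so the
// driver never dials unreachable pod IPs directly.
// (Placeholder logic; the real driver extension point is
// com.datastax.oss.driver.api.core.addresstranslation.AddressTranslator.)
public class LbAddressTranslator {
    private final InetSocketAddress lbEndpoint;

    public LbAddressTranslator(String lbHost, int port) {
        this.lbEndpoint = InetSocketAddress.createUnresolved(lbHost, port);
    }

    // Map any discovered node address to the LB endpoint.
    public InetSocketAddress translate(InetSocketAddress nodeAddress) {
        return lbEndpoint;
    }
}
```

The trade-off is that all traffic funnels through the LB, so token-aware routing to the replica owning a partition is lost; the LB's own distribution policy decides which tserver serves each connection.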

I will get back to you about the LB we are using - it is not GKE.

This must be a very common scenario, so I am wondering how others solve it. YugabyteDB has its own Java CQL driver - is its implementation a lot different from the DSE (or Cassandra) driver?
Secondly, which DB driver were you using in your test? We are accessing the DB from a Java client using the DSE/Cassandra driver, not the YCQL command line.