Support for 100,000+ connections

Prerequisites:
We have 100+ region groups and 1,000+ sources; every source is a service written in Golang. These services send data about air humidity, wind speed, etc. every 5 seconds. We plan to use YugabyteDB as the main data store, with limits per group, and we need to be able to run analytical queries that return aggregated weather data in near-real-time (NRT). For the analytical queries we use Trino.
There are two questions:

  • What do you think: is YugabyteDB a good choice for our case?
  • If so, how can we achieve 100,000+ concurrent connections? According to the documentation, the max_connections parameter allows up to 300 connections per node. What potential drawbacks might there be if we manually increase max_connections?

Thank you in advance!

Hey, you would need to use a server-side connection pooler, especially if those connections are short-lived.

YugabyteDB has recently introduced a built-in connection manager. Even though it’s in tech preview, you’re still encouraged to give it a try and share your feedback.
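If you decide to try it, enabling it should come down to a yb-tserver flag along these lines. This is only a sketch: the flag names below are my recollection of the tech-preview docs, so verify them against the Connection Manager page before use.

```sh
# Sketch: enable the tech-preview YSQL Connection Manager on each yb-tserver
# (added to your existing yb-tserver flags). Flag names assumed from the docs;
# verify against the current reference.
yb-tserver \
  --allowed_preview_flags_csv=enable_ysql_conn_mgr \
  --enable_ysql_conn_mgr=true
```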

Also, you can use one of the existing Postgres connection poolers such as PgBouncer or Odyssey. Check out this recording where we discussed and tested a few.
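For example, a minimal PgBouncer setup in transaction pooling mode could look roughly like the sketch below; the database name, host, ports, auth settings, and pool sizes are placeholders you would tune for your cluster.

```ini
; Sketch only (illustrative values): many client connections multiplexed over
; a small number of server connections to the local yb-tserver's YSQL port.
[databases]
weather = host=127.0.0.1 port=5433 dbname=weather

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; reuse a server connection between transactions
pool_mode = transaction
; client connections PgBouncer will accept
max_client_conn = 10000
; physical server connections per user/database pair
default_pool_size = 50
```

Note that with transaction pooling, session-level state (temporary tables, session-level prepared statements, SET parameters) is not preserved between transactions, which is why the questions about session state further down matter.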

PgBouncer will not let us achieve this goal, because our connections are not short-lived. We want to reach exactly 100,000+ physical connections. Are there any other options? Have you tried to benchmark more than 9,000 connections with TPC-C?

PgBouncer should work fine with long-lived connections. Denis mentioned “short-lived” because just creating a new connection has overhead.

But you haven’t specified why you need them all to be physical connections?

I’ve tested PgBouncer on my laptop with 20K connections (only doing simple SELECT queries).

Using the connection pooler that Denis mentioned (or one or more PgBouncer instances on each yb-tserver) should support ~10K+ connections per yb-tserver. It still depends on the hardware and on what the queries do.

Another idea is to also have a connection pooler/proxy in the middle: say, one proxy for every region or for every 1,000 clients.
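To make the client side concrete, here is a rough Go sketch (not your actual services; the DSN, table and column names are made up, and it uses the jackc/pgx pgxpool package as an example) of how a regional service could multiplex many sensors over a small shared pool instead of holding one physical connection per sensor:

```go
// Sketch only: a regional reporter multiplexing many sensors over a small
// shared pool (jackc/pgx v5). DSN, table and column names are illustrative.
package main

import (
	"context"
	"log"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	ctx := context.Background()

	// Point the DSN at the pooler (PgBouncer or the built-in connection
	// manager), not directly at every yb-tserver. If the pooler runs in
	// transaction mode, you may also need to adjust pgx's statement caching.
	cfg, err := pgxpool.ParseConfig("postgres://app:secret@pooler:6432/weather")
	if err != nil {
		log.Fatal(err)
	}
	cfg.MaxConns = 10 // a handful of physical connections serves many sensors

	pool, err := pgxpool.NewWithConfig(ctx, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	// 1,000 sensors reporting every 5 seconds share the same 10 connections.
	for i := 0; i < 1000; i++ {
		go func(sensorID int) {
			ticker := time.NewTicker(5 * time.Second)
			defer ticker.Stop()
			for range ticker.C {
				_, err := pool.Exec(ctx,
					`INSERT INTO measurements (sensor_id, humidity, wind_speed, ts)
					 VALUES ($1, $2, $3, now())`,
					sensorID, 55.2, 3.4)
				if err != nil {
					log.Printf("sensor %d: %v", sensorID, err)
				}
			}
		}(i)
	}
	select {} // keep the sketch running
}
```

Whether something like this is acceptable depends on the answer to the question above: if each sensor truly needs its own physical session (separate roles, session state, open transactions), this kind of multiplexing doesn’t apply.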

Does the built-in connection manager support permission/rights management per user?
Also, PgBouncer just reuses already-created connections when they are not busy. No doubt it will slightly increase the number of connections we can serve, but the goal won’t be reached: our workload keeps connections busy for long periods, so we can’t reuse created connections to raise the overall connection count; pooling will only reduce the overhead of creating new connections. Please correct me if I’m wrong.

UPD: It is important for us to have parallel connections, not concurrent (I made a typo in the term earlier).
We are interested in a cluster of 5,000 nodes capable of 150K+ qps. INSERT/UPDATE transaction size is ~500 KB, with 50% INSERT and 50% UPDATE transactions (±5%).

Can you be more specific about which feature you mean here? It should be able to support multiple users efficiently in the same pool: Connection Manager in YSQL | YugabyteDB Docs

I’m curious to know why you think so. Example: you mentioned “send data every 5 seconds” which is very low activity.

Can you explain this workload in detail? Feel free to explain all workloads in as much detail as possible to get a more concrete answer.

I think you want both. For example, 100K concurrent connections each sending 1 query every 5 seconds works out to 20K qps. If each query lasts 10 ms and the queries are spread evenly in time, you will have 100K / 5 s × 0.01 s = 200 physical connections/queries actually running in parallel at any given moment.

150K+ qps, each writing 500 KB, meaning ~71.5 GB of raw bytes inserted per second?

Isn’t that something like 150K / 5K = 30 qps per node? That seems like a low query load for each node.
What hardware are you considering for each node?
Can you explain the cluster requirements in more detail (multi-region? multi-AZ? single region? row-level replication? how many replicas?)

Hi @Meteo, to avoid any confusion: max_connections is per database node, so with the default setting you can have 100,000+ connections if you deploy to 350 nodes.

You can also increase it. max_connections=300 is set by default to limit memory allocation. Each physical PostgreSQL backend allocates some memory, and the amount varies with usage; for example, metadata read from the catalog stays cached in the backend. Currently, our deployment recommendation is to disable swap at the OS level, so connections keep holding resident memory even when idle.

max_connections can be increased, but this will use more memory, which is not very useful for idle sessions, at a time when the server would benefit from having that memory available for filesystem buffers.
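For completeness, raising the limit is just a startup flag, roughly like this; the flag name is my recollection of the yb-tserver reference, so verify it, and the value is only an example.

```sh
# Sketch: raise the per-node YSQL connection limit (example value only).
# Flag name assumed from the yb-tserver reference; verify before relying on it.
yb-tserver \
  --ysql_max_connections=500
```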

So, even if it is possible, it may be better to see whether one of the connection pooling options can help. That will be more scalable and will save on costs.

For that we need to know how the connections are used and which state/properties must persist across those long-lived connections. Temporary tables? Prepared statements? Session parameters? Transactions staying open? Or maybe you need different users/roles for security reasons?
