Multiregion deployment with single cluster/universe

Hi There,

Here is my requirement for a multi-region deployment. It should give me balanced data across AZs and let me scale up either by adding more nodes to individual AZs or by adding more AZs.

Requirement:

Create a tablespace in a 3x3 cluster (3 nodes per region across ms-1, mr-1, and st-1, 9 nodes in total) with 5 replicas and a region survival goal:

  • 2 replicas in ms-1 across any 2 AZs of ms-1a, ms-1b, ms-1c.
  • 2 replicas in mr-1 across any 2 AZs of mr-1a, mr-1b, mr-1c.
  • 1 replica in st-1 across any 1 AZ of st-1a, st-1b, st-1c.
  • Balanced data distribution to avoid zones like ms-1c having minimal data.

I get an error when creating the tablespace with zone='*' or zone='AZ1,AZ2,AZ3'.

I would like to declare the placement as below and have Yugabyte take care of distributing the data/tablets evenly across all AZs in each region. CockroachDB has this feature.

CREATE TABLESPACE geo_partitioned_leader_aware WITH (
  replica_placement='{
    "num_replicas": 5,
    "placement_blocks": [
      {"cloud": "aodc", "region": "ms-1", "zone": "*", "min_num_replicas": 2},
      {"cloud": "aodc", "region": "mr-1", "zone": "*", "min_num_replicas": 2},
      {"cloud": "aodc", "region": "st-1", "zone": "*", "min_num_replicas": 1}
    ]
  }'
);

Even this doesn’t work. It looks like min_num_replicas can’t be 0:

ms_testdb=# CREATE TABLESPACE geo_partitioned_leader_aware_3 WITH (
ms_testdb(# replica_placement='{
ms_testdb'# "num_replicas": 5,
ms_testdb'# "placement_blocks": [
ms_testdb'# {"cloud": "aws", "region": "ms-1", "zone": "ms-1a", "min_num_replicas": 1},
ms_testdb'# {"cloud": "aws", "region": "ms-1", "zone": "ms-1b", "min_num_replicas": 0},
ms_testdb'# {"cloud": "aws", "region": "ms-1", "zone": "ms-1c", "min_num_replicas": 0},
ms_testdb'# {"cloud": "aws", "region": "mr-1", "zone": "mr-1a", "min_num_replicas": 1},
ms_testdb'# {"cloud": "aws", "region": "mr-1", "zone": "mr-1b", "min_num_replicas": 0},
ms_testdb'# {"cloud": "aws", "region": "mr-1", "zone": "mr-1c", "min_num_replicas": 0},
ms_testdb'# {"cloud": "aws", "region": "st-1", "zone": "st-1a", "min_num_replicas": 1},
ms_testdb'# {"cloud": "aws", "region": "st-1", "zone": "st-1b", "min_num_replicas": 0},
ms_testdb'# {"cloud": "aws", "region": "st-1", "zone": "st-1c", "min_num_replicas": 0}
ms_testdb'# ],
ms_testdb'# "num_replicas_per_region": [
ms_testdb'# {"cloud": "aws", "region": "ms-1", "num_replicas": 2},
ms_testdb'# {"cloud": "aws", "region": "mr-1", "num_replicas": 2},
ms_testdb'# {"cloud": "aws", "region": "st-1", "num_replicas": 1}
ms_testdb'# ]
ms_testdb'# }'
ms_testdb(# );
ERROR: Invalid type/value for some key in placement block. Placement policy: {
"num_replicas": 5,
"placement_blocks": [
{"cloud": "aws", "region": "ms-1", "zone": "ms-1a", "min_num_replicas": 1},
{"cloud": "aws", "region": "ms-1", "zone": "ms-1b", "min_num_replicas": 0},
{"cloud": "aws", "region": "ms-1", "zone": "ms-1c", "min_num_replicas": 0},
{"cloud": "aws", "region": "mr-1", "zone": "mr-1a", "min_num_replicas": 1},
{"cloud": "aws", "region": "mr-1", "zone": "mr-1b", "min_num_replicas": 0},
{"cloud": "aws", "region": "mr-1", "zone": "mr-1c", "min_num_replicas": 0},
{"cloud": "aws", "region": "st-1", "zone": "st-1a", "min_num_replicas": 1},
{"cloud": "aws", "region": "st-1", "zone": "st-1b", "min_num_replicas": 0},
{"cloud": "aws", "region": "st-1", "zone": "st-1c", "min_num_replicas": 0}
],
"num_replicas_per_region": [
{"cloud": "aws", "region": "ms-1", "num_replicas": 2},
{"cloud": "aws", "region": "mr-1", "num_replicas": 2},
{"cloud": "aws", "region": "st-1", "num_replicas": 1}
]
}

Hi @ranjanp

Did you follow any guide on our docs regarding this?

Can you show a full page screenshot of http://yb-master:7000/tablet-servers ?

Also, what version are you using?

Regards,
Dorian

The issue is that num_replicas_per_region is not a supported parameter in our tablespace declaration. This should be sufficient:

CREATE TABLESPACE multi_region_tablespace WITH (
  replica_placement='{"num_replicas": 5, "placement_blocks":
  [{"cloud":"aws","region":"ms-1","zone":"ms-1a","min_num_replicas":1},
   {"cloud":"aws","region":"ms-1","zone":"ms-1b","min_num_replicas":1},
   {"cloud":"aws","region":"mr-1","zone":"mr-1a","min_num_replicas":1},
   {"cloud":"aws","region":"mr-1","zone":"mr-1b","min_num_replicas":1},
   {"cloud":"aws","region":"st-1","zone":"st-1a","min_num_replicas":1}]}'
);
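
To confirm what was stored and to start using the tablespace, something like the following may help. This is a minimal sketch that assumes the statement above succeeded; the table name and columns (orders_demo, id, payload) are hypothetical and only there for illustration. pg_tablespace is the standard PostgreSQL catalog, and its spcoptions column should show the replica_placement value.

```sql
-- Inspect the placement policy recorded for the tablespace.
SELECT spcname, spcoptions
FROM pg_tablespace
WHERE spcname = 'multi_region_tablespace';

-- Hypothetical table (name and columns are assumptions) placed in that tablespace.
CREATE TABLE orders_demo (
    id      bigint PRIMARY KEY,
    payload text
) TABLESPACE multi_region_tablespace;
```

Because the five placement blocks add up to num_replicas, every replica of this table lands in one of the five listed zones. Nodes in the zones that are not listed (ms-1c, mr-1c, st-1b, st-1c) would hold no data for it.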

version 2.25.1.0

Thanks Dorian. I can create the tablespace with min_num_replicas=1.

As for the requirement I described above: do you plan to add a feature where Yugabyte takes care of distributing the tablets across the AZs in a region (3 AZs and 2 replicas in this case)? That way I can add more nodes in case I want to scale up.

Hi @ranjanp -

This improvement is being tracked under this GitHub issue: GH# 26671. It is currently undergoing code review and further testing. Once it lands, I would expect it to be available in an upcoming drop of 2.25 in the next few weeks.

Thanks a lot, this helps.

I hope YDB will choose the AZs evenly. In this case, with 2 replicas spread over the 3 available AZs, will all 3 nodes hold an even share of the data?

Hi @ranjanp

With this feature added to CREATE TABLESPACE, the idea is for the tablespace to follow the zones where there are nodes in the cluster. The cloud=aws, region=ms-1, zone=* designation means "match any zone in this region that has nodes present in the cluster."
The placement is not really our choice to make, since you are the one designating where the nodes are, logically or physically. If there is no node present in a given AZ, we cannot place any data there, in full or in part.
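
To see which zones currently have nodes, and therefore which zones a zone=* block could match, a query along these lines may be useful. It is a sketch that assumes the yb_servers() function is available in your YSQL version and returns cloud, region, and zone columns for each tablet server.

```sql
-- Count tablet servers per cloud/region/zone as the cluster reports them.
SELECT cloud, region, zone, count(*) AS nodes
FROM yb_servers()
GROUP BY cloud, region, zone
ORDER BY cloud, region, zone;
```

In the 3x3 cluster described in this thread, you would expect three rows per region with one node each. A zone missing from the output has no nodes and so cannot receive any replicas.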

Hope this clears things up.
–Alan