Tablets are not split evenly

Hi,
I have a 5-node cluster where tservers are running, automatic tablet splitting is on, and the number of shards per tserver is 1.
I have a partitioned table. The parent table's tablet leaders are split evenly; however, one child table's tablet leaders did not split evenly at first.
After dropping and recreating that child table, the tablet leaders split evenly.

I could not understand why the uneven split happened at first. Please help me understand.

What do you mean by “uneven”?
The dynamic tablet splitting mechanism should pick a midpoint based on the data, so if you have a lot of data on the lower end of the spectrum, the split point will be lower.
Is the table hash partitioned or range partitioned?
When recreating the table, did you use CREATE TABLE AS or load all the data immediately, compared to how it was loaded before?

@jason Please find the tablet list below:
…/yugabyte-2.20.1.1# bin/yb-admin --master_addresses xx.xx.xx.214:7100,xx.xx.xx.215:7100,xx.xx.xx.216:7100 list_tablets ysql.mydb mytable_202402 include_followers
Tablet-UUID Range Leader-IP Leader-UUID Followers
258f5346bf8041adad65baafb2b6f44f partition_key_start: “” partition_key_end: “\010\010” xx.xx.xx.191:9100 0a964fb757a541979a67eca750b5f3e8 xx.xx.xx.194:9100,xx.xx.xx.192:9100
e0b32395a4f24c98be0b4575d60c7ffc partition_key_start: “\010\010” partition_key_end: “\017\225” xx.xx.xx.192:9100 1d10523685ba4fbb9874ea3369713828 xx.xx.xx.190:9100,xx.xx.xx.191:9100
f95943ab3f50462badb3b0292c71668a partition_key_start: “\017\225” partition_key_end: “\026\213” xx.xx.xx.192:9100 1d10523685ba4fbb9874ea3369713828 xx.xx.xx.194:9100,xx.xx.xx.190:9100
a3031ad01249401baa100b17e93c53bb partition_key_start: “\026\213” partition_key_end: “\034\210” xx.xx.xx.190:9100 4e65a702b3dd4b3d97621f53b4d16adf xx.xx.xx.192:9100,xx.xx.xx.194:9100
252a3b8f6a7c48baa8d95d97888a82b8 partition_key_start: “\034\210” partition_key_end: “"\353” xx.xx.xx.190:9100 4e65a702b3dd4b3d97621f53b4d16adf xx.xx.xx.191:9100,xx.xx.xx.192:9100
acaa3f1cc38d4474a09e1a4d62e07777 partition_key_start: “"\353” partition_key_end: “(\213” xx.xx.xx.194:9100 b089665ff27d47b288ddbdf8102c350e xx.xx.xx.190:9100,xx.xx.xx.192:9100
0644ecab35da468686f747b605c815f0 partition_key_start: “(\213” partition_key_end: “.\t” xx.xx.xx.192:9100 1d10523685ba4fbb9874ea3369713828 xx.xx.xx.190:9100,xx.xx.xx.191:9100
d5f812c8acc74fc38f2bd684083cdb34 partition_key_start: “.\t” partition_key_end: “33” xx.xx.xx.190:9100 4e65a702b3dd4b3d97621f53b4d16adf xx.xx.xx.194:9100,xx.xx.xx.191:9100
d35006f4da2f4b54a7b0638fb170460f partition_key_start: “33” partition_key_end: “:\335” xx.xx.xx.190:9100 4e65a702b3dd4b3d97621f53b4d16adf xx.xx.xx.194:9100,xx.xx.xx.191:9100
68969746437a41e0840ebf44bc2dceea partition_key_start: “:\335” partition_key_end: “A\344” xx.xx.xx.190:9100 4e65a702b3dd4b3d97621f53b4d16adf xx.xx.xx.194:9100,xx.xx.xx.191:9100

The 5 servers' IPs are xx.xx.xx.190, xx.xx.xx.191, xx.xx.xx.192, xx.xx.xx.193, xx.xx.xx.194.

As can be seen, among the 5 tserver nodes, 193 is completely absent from the list, and most of the tablets are concentrated on node 190. But when we dropped the table and created it once again, the distribution was fine after that: after loading data, the tablet splits were fine and the distribution was also proper.

This table is a child table of a range-partitioned table. It is created with:
  CREATE TABLE <child_table_name> PARTITION OF <main_partitioned_table> FOR VALUES FROM (<dt_start>) TO (<dt_end>);

Streaming data is being loaded into those tables through a Kafka consumer.

Our questions here are:

  1. Under what circumstances can this happen?
  2. If this type of skewed distribution resurfaces in the future (in production), how do we make sure that it gets rebalanced properly again?

Hi dipanjanghos,

If you use range-partitioned tables, the default number of shards/tablets is 1 (on the 2.20.1 builds that you are using). As data is loaded into the tablet, it can get split. Automatic tablet splitting works at a per-tablet level and splits a tablet into 2. The split point is currently derived from the data that has been flushed to disk, so data that has already been inserted into the table but not yet flushed might not influence the split point. The net result is that this can sometimes produce uneven splitting of data. We are looking at a couple of cases of uneven splits, especially when the input data arrives in ascending or descending order; the chances of uneven splits are low if the input data is shuffled. Also, if you know the split points, you can specify them at table creation time, as shown in the example in Tablet splitting | YugabyteDB Docs.
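For illustration, here is the general shape of pre-splitting a range-sharded YSQL table at creation time (a minimal sketch modelled on the docs example; the table name, columns, and split values are illustrative, not taken from your schema):

  CREATE TABLE t (
      k VARCHAR,
      v TEXT,
      PRIMARY KEY (k ASC)
  ) SPLIT AT VALUES (('e'), ('o'));

This creates three tablets up front, roughly covering keys below 'e', from 'e' to 'o', and above 'o', so the initial load is spread across tablets instead of starting from a single one. You would pick split values that approximate the expected distribution of your range key.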

Are you concerned about traffic not being spread across the nodes in the above case (when the split was uneven)?

Hi @Raghavendra_Thallam,
The problems we faced are:
On one occasion, one of the tservers did not have any leaders.
On another occasion, though other things were correct, one of the tservers did not have any of the tablets, including indexes (the report is shared in the post above).

Yes, the traffic distribution is also skewed, and due to this uneven split the throughput goes down drastically. With the uneven distribution we were struggling to reach 2.2K inserts per second, whereas after the distribution became even on truncating the tables, we get a steady ~8K inserts per second (even with 40+ crores of rows in the table).

What I am concerned about is a way to make the distribution uniform again. In the long run it can happen that one tablet server is down for a long time and the skew crops up with newly created tables. What would the remedy be to bring everything back in line again?

Thanks & regards,
Dipanjan

@dipanjanghos:

Do you have any logs from yb-master from when one t-server did not have any leaders or tablets? I'd be interested in seeing whether the Load Balancer was trying to balance things out to ensure an even distribution of tablets.

Could you also share the output of http://<yb-master-ip>:7000/tablet-servers and http://<yb-master-ip>:7000/cluster-config?
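In the meantime, a quick way to check whether the load balancer is enabled and whether tablet moves are still in progress is via yb-admin (a sketch using standard yb-admin commands; <m1>, <m2>, <m3> stand in for your own master addresses):

  # Is the load balancer currently idle (i.e. nothing left to move)?
  bin/yb-admin --master_addresses <m1>:7100,<m2>:7100,<m3>:7100 get_is_load_balancer_idle

  # Completion percentage of an in-progress data move (relevant when servers are blacklisted)
  bin/yb-admin --master_addresses <m1>:7100,<m2>:7100,<m3>:7100 get_load_move_completion

  # Make sure the load balancer has not been disabled
  bin/yb-admin --master_addresses <m1>:7100,<m2>:7100,<m3>:7100 set_load_balancer_enabled 1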